huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0 · 4.2k stars · 357 forks
Issues
#78 · Weird conversation with zephyr-7b-dpo-lora · njupopsicle · opened 6 months ago · 2 comments
#77 · DPO fine-tuning errors out on Yi 34B (Assertion `srcIndex < srcSelectDimSize` failed) · cvetanovskaa · opened 6 months ago · 0 comments
#76 · Can we run inference with the LoRA adapter after running SFT? · Tejaswi-kashyap-006 · closed 6 months ago · 2 comments
#75 · Add instructions to evaluate on academic datasets · Randl · opened 7 months ago · 0 comments
#74 · A question about the SFTTrainer (also a theoretical question about SFT in general) · PradeepKadubandi · opened 7 months ago · 3 comments
#73 · Training with LoRA on multiple GPUs gives constant loss · sids07 · opened 7 months ago · 5 comments
#72 · SFT LoRA ends with higher loss · Randl · opened 7 months ago · 1 comment
#71 · Add warmup to config · Randl · closed 6 months ago · 0 comments
#70 · Chat template is not loaded when evaluating on MT-Bench · ChenDRAG · closed 7 months ago · 1 comment
#69 · Help with SFT on multiple machines, e.g. 8 nodes (1 A100 per node) · Atlantic8 · opened 7 months ago · 1 comment
#68 · DPO alignment doesn't work on LoRA models as suggested · Abe13 · opened 7 months ago · 1 comment
#67 · Role of `prompt` field in SFT · shabie · opened 7 months ago · 0 comments
#66 · How to specify another GPU to run on rather than cuda:0? · njupopsicle · closed 7 months ago · 1 comment
#65 · Warning about max sequence length · ChenDRAG · opened 7 months ago · 0 comments
#64 · Update doc CI · lewtun · closed 7 months ago · 0 comments
#63 · [process exited with code 1 (0x00000001)] · patchie · opened 7 months ago · 1 comment
#62 · Integrate feedback from the community · lewtun · opened 7 months ago · 0 comments
#61 · SFT training doesn't fully go through all samples · hanxiaotian · opened 7 months ago · 3 comments
#60 · Update docstring for `data.py` to reflect true behavior of `shuffle` parameter · scottfleming · closed 7 months ago · 3 comments
#59 · Error in run_sft.py when calling "trainer.push_to_hub": [Rank 0] Watchdog caught collective operation timeout · ohmeow · opened 7 months ago · 7 comments
#58 · High VRAM usage with ZeRO-3 · nathan-az · closed 7 months ago · 2 comments
#57 · "RuntimeError: The size of tensor a (0) must match the size of tensor b (4096) at non-singleton dimension 1" (DPO + LoRA) · ohmeow · opened 7 months ago · 7 comments
#56 · Why does the alignment-handbook account for user & system inputs in the loss calculation? · xffxff · opened 7 months ago · 3 comments
#55 · Running on a single GPU (16 GB) · patchie · opened 7 months ago · 1 comment
#54 · Impossible to load local pretrained model for DPO alignment · mathis-lambert · closed 7 months ago · 2 comments
#53 · Allow loading datasets from disk using `load_from_disk` method · dmilcevski · closed 7 months ago · 1 comment
#52 · What about the system prompt? · timothylimyl · opened 7 months ago · 0 comments
#51 · Add check that parameters are not intended to be offloaded · nathan-az · closed 7 months ago · 2 comments
#50 · What is the expected "global batch size"? · ohmeow · closed 7 months ago · 1 comment
#49 · Allow running DPO from a local model · dmilcevski · closed 7 months ago · 0 comments
#48 · Training with a custom prompt format and custom dataset · BakingBrains · closed 7 months ago · 0 comments
#47 · Tokenizer model_max_length · binarycrayon · opened 7 months ago · 6 comments
#46 · Weird DPO loss · ChenDRAG · opened 7 months ago · 1 comment
#45 · Reproducing the LoRA model result on MT-Bench · wlhgtc · opened 7 months ago · 27 comments
#44 · Global batch size question · liutianlin0121 · opened 7 months ago · 7 comments
#43 · Did you use RMSprop or AdamW as the optimizer? · alvarobartt · closed 7 months ago · 3 comments
#42 · How to do QLoRA training with ZeRO-3 on two or more GPUs? · Di-Zayn · opened 7 months ago · 4 comments
#41 · Windows installation · NicolasMejiaPetit · opened 7 months ago · 0 comments
#40 · How do I get the training scripts to utilize all my GPUs? · ohmeow · closed 7 months ago · 1 comment
#39 · Why is zephyr-7b-dpo-lora fine-tuned from mistralai/Mistral-7B-v0.1 instead of the zephyr-7b-sft model? · ChenDRAG · opened 7 months ago · 2 comments
#38 · DPO loss · JiuhaiChen · opened 7 months ago · 7 comments
#37 · Question about "ModuleNotFoundError: No module named 'alignment'" · liugangdao · opened 7 months ago · 2 comments
#36 · Training finishes prematurely after max length increases · ujjawalmadan · opened 7 months ago · 2 comments
#35 · [README.md] Update installation instructions · Girrajjangid · closed 7 months ago · 0 comments
#34 · Misalignment between config_lora.yaml and the model card · ChenDRAG · opened 7 months ago · 0 comments
#33 · Hardware used for reproducing · nathan-az · closed 7 months ago · 1 comment
#32 · Max sequence length · ujjawalmadan · opened 7 months ago · 0 comments
#31 · Missing config params on SFT · tcapelle · closed 7 months ago · 0 comments
#30 · Fix `apply_chat_template` function for `dpo` and unknown `task` · alvarobartt · closed 7 months ago · 0 comments
#29 · Release dSFT data preparation (self-instruct) code? · nlpcat · opened 7 months ago · 0 comments