huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0 · 4.2k stars · 357 forks
Issues
#78 · Weird conversation with zephyr-7b-dpo-lora · njupopsicle · opened 6 months ago · 2 comments
#77 · DPO fine-tuning errors out on Yi 34B (Assertion `srcIndex < srcSelectDimSize` failed) · cvetanovskaa · opened 6 months ago · 0 comments
#76 · Can we run inference with the LoRA adapter after running SFT? · Tejaswi-kashyap-006 · closed 6 months ago · 2 comments
#75 · Add instructions to evaluate on academic datasets · Randl · opened 7 months ago · 0 comments
#74 · A question about the SFTTrainer (also a theoretical question about SFT in general) · PradeepKadubandi · opened 7 months ago · 3 comments
#73 · Training with LoRA on multiple GPUs gives constant loss · sids07 · opened 7 months ago · 5 comments
#72 · SFT LoRA ends with higher loss · Randl · opened 7 months ago · 1 comment
#71 · Add warmup to config · Randl · closed 6 months ago · 0 comments
#70 · Chat template is not loaded when evaluating on MT-Bench · ChenDRAG · closed 7 months ago · 1 comment
#69 · Help with SFT on multiple machines, e.g. 8 nodes (1 A100 per node) · Atlantic8 · opened 7 months ago · 1 comment
#68 · DPO alignment doesn't work on LoRA models as suggested · Abe13 · opened 7 months ago · 1 comment
#67 · Role of `prompt` field in SFT · shabie · opened 7 months ago · 0 comments
#66 · How to specify another GPU to run on rather than cuda:0? · njupopsicle · closed 7 months ago · 1 comment
#65 · Warning about max sequence length · ChenDRAG · opened 7 months ago · 0 comments
#64 · Update doc CI · lewtun · closed 7 months ago · 0 comments
#63 · [process exited with code 1 (0x00000001)] · patchie · opened 7 months ago · 1 comment
#62 · Integrate feedback from the community · lewtun · opened 7 months ago · 0 comments
#61 · SFT training doesn't fully go through all samples · hanxiaotian · opened 7 months ago · 3 comments
#60 · Update docstring for `data.py` to reflect true behavior of `shuffle` parameter · scottfleming · closed 7 months ago · 3 comments
#59 · Error in run_sft.py when calling "trainer.push_to_hub": [Rank 0] Watchdog caught collective operation timeout · ohmeow · opened 7 months ago · 7 comments
#58 · High VRAM usage with ZeRO-3 · nathan-az · closed 7 months ago · 2 comments
#57 · "RuntimeError: The size of tensor a (0) must match the size of tensor b (4096) at non-singleton dimension 1" (DPO + LoRA) · ohmeow · opened 7 months ago · 7 comments
#56 · Why does the alignment-handbook account for user & system inputs in the loss calculation? · xffxff · opened 7 months ago · 3 comments
#55 · Running on a single GPU (16 GB) · patchie · opened 7 months ago · 1 comment
#54 · Impossible to load local pretrained model for DPO alignment · mathis-lambert · closed 7 months ago · 2 comments
#53 · Allow loading datasets from disk using `load_from_disk` method · dmilcevski · closed 7 months ago · 1 comment
#52 · What about the system prompt? · timothylimyl · opened 7 months ago · 0 comments
#51 · Add check that parameters are not intended to be offloaded · nathan-az · closed 7 months ago · 2 comments
#50 · What is the expected "global batch size"? · ohmeow · closed 7 months ago · 1 comment
#49 · Allow running DPO from a local model · dmilcevski · closed 7 months ago · 0 comments
#48 · Training with a custom prompt format and custom dataset · BakingBrains · closed 7 months ago · 0 comments
#47 · Tokenizer model_max_length · binarycrayon · opened 7 months ago · 6 comments
#46 · Weird DPO loss · ChenDRAG · opened 7 months ago · 1 comment
#45 · Reproducing the LoRA model result on MT-Bench · wlhgtc · opened 7 months ago · 27 comments
#44 · Global batch size question · liutianlin0121 · opened 7 months ago · 7 comments
#43 · Did you use RMSprop or AdamW as the optimizer? · alvarobartt · closed 7 months ago · 3 comments
#42 · How to do QLoRA training with ZeRO-3 on two or more GPUs? · Di-Zayn · opened 7 months ago · 4 comments
#41 · Windows installation · NicolasMejiaPetit · opened 7 months ago · 0 comments
#40 · How do I get the training scripts to utilize all my GPUs? · ohmeow · closed 7 months ago · 1 comment
#39 · Why is zephyr-7b-dpo-lora fine-tuned from mistralai/Mistral-7B-v0.1 instead of the zephyr-7b-sft model? · ChenDRAG · opened 7 months ago · 2 comments
#38 · DPO loss · JiuhaiChen · opened 7 months ago · 7 comments
#37 · Question about "ModuleNotFoundError: No module named 'alignment'" · liugangdao · opened 7 months ago · 2 comments
#36 · Training finishes prematurely after max length increases · ujjawalmadan · opened 7 months ago · 2 comments
#35 · [README.md] Update installation instructions · Girrajjangid · closed 7 months ago · 0 comments
#34 · Misalignment between config_lora.yaml and the model card · ChenDRAG · opened 7 months ago · 0 comments
#33 · Hardware used for reproducing · nathan-az · closed 7 months ago · 1 comment
#32 · Max sequence length · ujjawalmadan · opened 7 months ago · 0 comments
#31 · Missing config params on SFT · tcapelle · closed 7 months ago · 0 comments
#30 · Fix `apply_chat_template` function for `dpo` and unknown `task` · alvarobartt · closed 7 months ago · 0 comments
#29 · Release dSFT data preparation (self-instruct) code? · nlpcat · opened 7 months ago · 0 comments