Is it possible to add a joint training feature?

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

https://arxiv.org/abs/2403.13372

Apache License 2.0

32.59k stars, 3.99k forks

Is it possible to add a joint training feature? #4921

Closed zhengjie-zhou closed 2 months ago

zhengjie-zhou commented 2 months ago

Reminder

[X] I have read the README and searched the existing issues.

System Info

Reproduction

Expected behavior

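The project currently integrates training methods such as DPO, PPO, KTO, and SFT. Would it be possible to add support for combining them, for example $L = \alpha L_{SFT} + \beta L_{DPO}$, where $\alpha$ and $\beta$ are hyperparameters?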
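To make the requested objective concrete, here is a minimal sketch of $L = \alpha L_{SFT} + \beta L_{DPO}$ for a single preference pair, written in plain Python. It is not taken from LLaMA-Factory. The function names, the per-token normalization of the SFT term, and the `temperature` argument (the DPO scaling factor, renamed here so it is not confused with the weight $\beta$) are all assumptions made for illustration.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float,
             temperature: float = 0.1) -> float:
    """Standard sigmoid DPO loss from summed sequence log-probabilities."""
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    return -log_sigmoid(temperature * margin)


def sft_loss(policy_chosen: float, num_chosen_tokens: int) -> float:
    """Negative log-likelihood of the chosen answer, averaged per token (assumed)."""
    return -policy_chosen / num_chosen_tokens


def joint_loss(policy_chosen: float, policy_rejected: float,
               ref_chosen: float, ref_rejected: float,
               num_chosen_tokens: int,
               alpha: float, beta: float,
               temperature: float = 0.1) -> float:
    """Weighted sum alpha * L_SFT + beta * L_DPO (hypothetical helper)."""
    l_sft = sft_loss(policy_chosen, num_chosen_tokens)
    l_dpo = dpo_loss(policy_chosen, policy_rejected,
                     ref_chosen, ref_rejected, temperature)
    return alpha * l_sft + beta * l_dpo
```

Setting `alpha=0` recovers plain DPO and setting `beta=0` recovers plain SFT on the chosen answers, so the two hyperparameters interpolate between the existing training modes.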

Others

No response

hiyouga commented 2 months ago

Please see the pref_ftx parameter.

zhengjie-zhou commented 2 months ago

> Please see the pref_ftx parameter.

pref_ftx: float = field(
    default=0.0,
    metadata={"help": "The supervised fine-tuning loss coefficient in DPO training."},
)
In that case, if I want to train with DPO and KTO jointly, how should I adjust this? @hiyouga
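The thread does not say how DPO and KTO could be combined, so the sketch below covers only the case the quoted help text describes: an SFT term added to the DPO loss. It reads pref_ftx as the SFT weight (the $\alpha$ of the original request) with the DPO weight fixed at 1. The `PrefArgsSketch` class and the `add_sft_term` function are hypothetical names used only for illustration, and the additive form is an assumption inferred from the help string, not a copy of the trainer code.

```python
from dataclasses import dataclass, field


@dataclass
class PrefArgsSketch:
    """Hypothetical stand-in that holds only the field quoted in the thread."""

    pref_ftx: float = field(
        default=0.0,
        metadata={"help": "The supervised fine-tuning loss coefficient in DPO training."},
    )


def add_sft_term(dpo_loss: float, sft_loss: float, args: PrefArgsSketch) -> float:
    """Assumed combination: L = L_DPO + pref_ftx * L_SFT."""
    return dpo_loss + args.pref_ftx * sft_loss
```

With the default `pref_ftx=0.0` the SFT term vanishes and the result is the plain DPO loss. A positive value mixes in the supervised loss on the chosen responses.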