QwenLM / Qwen2.5-Coder

Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by the Qwen team at Alibaba Cloud.

How should bos/eos be added for incremental pretraining? #84

Closed: tangbo-sh closed this issue 2 months ago

tangbo-sh commented 3 months ago

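I would like to do continued pretraining of CodeQwen on domain-specific data. How should the data be formatted: as [bos]content[eos], or is adding only eos enough? Also, what value do you recommend for the Adam optimizer's epsilon parameter? Thanks.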

cyente commented 2 months ago

FYI https://qwen.readthedocs.io/en/latest/training/SFT/llama_factory.html
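
For readers comparing the two layouts named in the question, here is a minimal sketch of what each one looks like once documents are packed into a single training stream. It does not say which layout CodeQwen expects. The marker strings `[bos]` and `[eos]` are the placeholders used in the question, not real tokenizer tokens, and the helper names `format_documents` and `pack` are hypothetical.

```python
from typing import Iterable, List

# Placeholder markers copied from the question; the actual special tokens
# depend on the tokenizer and are NOT assumed here.
BOS = "[bos]"
EOS = "[eos]"


def format_documents(docs: Iterable[str], add_bos: bool) -> List[str]:
    """Append EOS to every document, and optionally prepend BOS."""
    prefix = BOS if add_bos else ""
    return [f"{prefix}{doc}{EOS}" for doc in docs]


def pack(formatted: Iterable[str]) -> str:
    """Concatenate the formatted documents into one pretraining stream."""
    return "".join(formatted)


docs = ["def f(): pass", "print('hi')"]

# Layout 1: [bos]content[eos] for every document.
with_bos_and_eos = pack(format_documents(docs, add_bos=True))

# Layout 2: content[eos] only, so EOS alone marks document boundaries.
eos_only = pack(format_documents(docs, add_bos=False))

print(with_bos_and_eos)
print(eos_only)
```

In both layouts the end-of-document marker separates documents in the packed stream. The only difference is whether each document also starts with an explicit marker. Which one matches the base model's original pretraining format should be checked against the tokenizer configuration and the linked documentation.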