Why does finetune_chat.py restrict MICRO_BATCH_SIZE and GRADIENT_ACCUMULATION_STEPS?

Facico / Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model (a low-resource Chinese llama+lora approach, with a structure modeled on alpaca)

https://github.com/Facico/Chinese-Vicuna

Apache License 2.0

4.14k stars, 422 forks

Closed: grantchenhuarong closed this issue 1 year ago

grantchenhuarong commented 1 year ago

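Could you explain why the following restriction is needed? With it in place, I can't switch to a different GPU and experiment with these parameters.

Is it because using different values would affect the generation quality of the final model?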

Facico commented 1 year ago

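You can also just remove it. The restriction is there because a different batch size yields a different number of steps, and things like the learning rate depend on the step.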
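To make the dependence on step count concrete, here is a minimal sketch. It is not the code from finetune_chat.py: the function names, the dataset size, the epoch count, and the linear warmup/decay schedule are all assumptions chosen for illustration. It shows that the product MICRO_BATCH_SIZE × GRADIENT_ACCUMULATION_STEPS determines the number of optimizer steps, and that the learning rate at a given step changes when that total changes.

```python
import math


def total_optimizer_steps(num_samples, micro_batch_size, grad_accum_steps, epochs):
    """Number of optimizer updates over the whole run (hypothetical helper)."""
    effective_batch = micro_batch_size * grad_accum_steps
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    return steps_per_epoch * epochs


def linear_warmup_decay_lr(step, total_steps, base_lr, warmup_steps):
    """Assumed schedule: linear warmup to base_lr, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Illustrative numbers only (not taken from the repository).
NUM_SAMPLES, EPOCHS, BASE_LR, WARMUP = 10_000, 3, 3e-4, 100

steps_a = total_optimizer_steps(NUM_SAMPLES, 4, 32, EPOCHS)  # effective batch 128
steps_b = total_optimizer_steps(NUM_SAMPLES, 8, 32, EPOCHS)  # effective batch 256
steps_c = total_optimizer_steps(NUM_SAMPLES, 8, 16, EPOCHS)  # effective batch 128 again

# Same step index, different total steps -> different learning rate.
lr_a = linear_warmup_decay_lr(110, steps_a, BASE_LR, WARMUP)
lr_b = linear_warmup_decay_lr(110, steps_b, BASE_LR, WARMUP)

print(steps_a, steps_b, steps_c)  # 237 120 237
print(lr_a, lr_b)
```

Under these assumptions, doubling the micro batch while halving the accumulation steps (8 × 16 instead of 4 × 32) keeps the step count unchanged, so the schedule is unaffected. Changing the product shifts the schedule, and step-based settings such as warmup steps or save intervals would need to be adjusted to match.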