Changing max_seq_length doesn't seem to take effect?

ssbuild / chatglm_finetuning

chatglm 6b finetuning and alpaca finetuning


Closed (tjulh closed this issue 1 year ago)

tjulh commented 1 year ago

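While fine-tuning with LoRA, I wanted to shorten the input length to reduce GPU memory usage, so I lowered max_seq_length in sft_config_lora.py at the data preprocessing stage. It doesn't seem to have taken effect, though. Is there somewhere else I need to change as well?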

ssbuild commented 1 year ago

After changing max_seq_length, the vocabulary, or similar settings, you need to delete the record cache files under output.
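The cleanup step described above could be scripted roughly as follows. This is only a sketch: the helper name `clear_record_cache` is hypothetical, and the `*.record` glob pattern is an assumption about how the cache files are named, so check the actual contents of output before running it.

```python
from pathlib import Path


def clear_record_cache(output_dir="output", pattern="*.record"):
    """Delete files matching `pattern` under `output_dir` and return the removed paths.

    Both defaults are assumptions: adjust them to the real cache location and names.
    """
    root = Path(output_dir)
    removed = []
    if not root.is_dir():
        # Nothing to clean if the directory does not exist.
        return removed
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed


if __name__ == "__main__":
    for removed_path in clear_record_cache():
        print(f"removed {removed_path}")
```

Run it before python data_util.py so the records are regenerated with the new max_seq_length.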

tjulh commented 1 year ago

Quoting ssbuild: "After changing max_seq_length, the vocabulary, or similar settings, you need to delete the record cache files under output."

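I just tried again and made sure that every file under output was deleted. With max_seq_len set to 1024 and then to 256, GPU memory usage looks about the same... and the trainable params are identical in both runs as well...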

My whole training workflow is: first change max_seq_len in sft_config_lora, then run python data_util.py, and finally run python train.py. Is there anything wrong with this process?

tjulh commented 1 year ago

After lowering max_seq_len to 128, I can see GPU memory usage drop. It's strange that lowering it to 256 made no difference...

ssbuild commented 1 year ago

Check the actual length of your data.
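One way to run that check is to compare the tokenized length of each sample against the candidate limits. The sketch below is illustrative only: the function names and the sample lengths are made up, and it assumes batches are padded to their longest sample rather than always to max_seq_length, which has not been verified against this repository.

```python
def effective_lengths(token_lengths, max_seq_length):
    """Return each sample's length after truncation to max_seq_length."""
    return [min(n, max_seq_length) for n in token_lengths]


def summarize(token_lengths, limits=(1024, 256, 128)):
    """For each limit, report the longest and mean truncated length and how many samples were cut."""
    if not token_lengths:
        raise ValueError("token_lengths must not be empty")
    summary = {}
    for limit in limits:
        eff = effective_lengths(token_lengths, limit)
        summary[limit] = {
            "max": max(eff),
            "mean": sum(eff) / len(eff),
            "truncated": sum(1 for n in token_lengths if n > limit),
        }
    return summary


if __name__ == "__main__":
    # Hypothetical token counts per sample; replace with lengths from your tokenizer.
    lengths = [90, 150, 200, 180, 240]
    for limit, stats in summarize(lengths).items():
        print(limit, stats)
```

If no sample is longer than 256 tokens, limits of 1024 and 256 truncate nothing and produce the same inputs, so memory use would only change once the limit falls below the real lengths, as at 128. The trainable params count would stay the same either way, because it depends on the LoRA configuration and not on the sequence length.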