Loss decreases in a staircase pattern with obvious "cliff drops". Any idea why?

Thunderltx commented 5 months ago

Describe the Question


[Image: Annotation 2024-01-19 184523]

The dataset is small, only 6K+ medical dialogues. These are the parameters:

CUDA_VISIBLE_DEVICES=0 python supervised_finetuning.py \
    --model_type chatglm \
    --model_name_or_path ../THUDM/chatglm2-6b-32k \
    --train_file_dir ./data/sft \
    --validation_file_dir ./data/sft \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --do_train \
    --do_eval \
    --use_peft True \
    --max_train_samples 1000 \
    --max_eval_samples 10 \
    --num_train_epochs 30 \
    --learning_rate 2e-5 \
    --warmup_ratio 0.05 \
    --weight_decay 0.05 \
    --logging_strategy steps \
    --logging_steps 10 \
    --eval_steps 50 \
    --evaluation_strategy steps \
    --save_steps 500 \
    --save_strategy steps \
    --save_total_limit 13 \
    --gradient_accumulation_steps 1 \
    --preprocessing_num_workers 4 \
    --output_dir ./saves/0119 \
    --overwrite_output_dir \
    --ddp_timeout 30000 \
    --logging_first_step True \
    --target_modules all \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --torch_dtype float16 \
    --fp16 \
    --device_map auto \
    --report_to tensorboard \
    --ddp_find_unused_parameters False \
    --gradient_checkpointing True \
    --cache_dir ./cache
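For reference, the step counts implied by these flags can be worked out with a small sketch. It assumes a single GPU (as `CUDA_VISIBLE_DEVICES=0` suggests), that all 1000 samples from `--max_train_samples` survive preprocessing, and that the last partial batch is kept; the helper names `steps_per_epoch` and `epoch_boundaries` are hypothetical and not part of `supervised_finetuning.py`. The exact rounding in the real training loop may differ.

```python
import math


def steps_per_epoch(num_samples, per_device_batch_size,
                    grad_accum_steps=1, num_devices=1):
    """Approximate optimizer steps for one pass over the training data."""
    # Batches per epoch, keeping the last partial batch (an assumption).
    batches = math.ceil(num_samples / (per_device_batch_size * num_devices))
    # One optimizer step every `grad_accum_steps` batches.
    return max(math.ceil(batches / grad_accum_steps), 1)


def epoch_boundaries(num_samples, per_device_batch_size, num_epochs,
                     grad_accum_steps=1, num_devices=1):
    """Global step at which each epoch ends."""
    per_epoch = steps_per_epoch(num_samples, per_device_batch_size,
                                grad_accum_steps, num_devices)
    return [per_epoch * epoch for epoch in range(1, num_epochs + 1)]


# Values taken from the command above.
per_epoch = steps_per_epoch(num_samples=1000, per_device_batch_size=4,
                            grad_accum_steps=1)
bounds = epoch_boundaries(num_samples=1000, per_device_batch_size=4,
                          num_epochs=30, grad_accum_steps=1)

print(per_epoch)    # 250
print(bounds[:4])   # [250, 500, 750, 1000]
print(bounds[-1])   # 7500
```

If the cliffs in the TensorBoard curve sit near multiples of 250 steps, they line up with epoch boundaries, that is, with the points where the model starts seeing the same 1000 samples again. Whether that explains this particular run is not confirmed anywhere in the thread.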

shibing624 commented 5 months ago

This is normal.

Thunderltx commented 5 months ago

"This is normal." Huh? Is it really normal (○´･д･)ﾉ?

Thunderltx commented 5 months ago

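"This is normal." Training has finished and I chatted with the model. Changing its self-cognition (how it identifies itself) failed, so it does not feel normal to me.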

shibing624 / MedicalGPT

Loss decreases in a staircase pattern with obvious "cliff drops". Any idea why? #313
