Open TanZhili opened 4 months ago
I believe you only need to update clean.py. Did you observe overfitting during training? If not, you probably need to train for more steps.
How many steps would be enough? Could you give a rough estimate?
Ideally 100k, but I am not sure what will happen with a 200-hour dataset... We haven't explored pretraining on a small dataset yet. I would suggest setting a lower learning rate, such as 4e-5 or 2e-5, and enabling weight decay and dropout to counter overfitting.
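For illustration only, here is a minimal PyTorch sketch of those three knobs (lower learning rate, weight decay, dropout). The tiny model, the dropout rate of 0.1, and the weight decay of 0.01 are assumptions and not values from this project. In practice these would be set through the project's own training config, not in hand-written code like this.

```python
import torch
from torch import nn

# Hypothetical stand-in model; the real model comes from the project's own code.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.GELU(),
    nn.Dropout(p=0.1),  # assumed dropout rate; the thread only says to enable dropout
    nn.Linear(32, 16),
)

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2e-5,            # one of the lower rates suggested above (4e-5 or 2e-5)
    weight_decay=0.01,  # assumed value; the thread only says to enable weight decay
)
```

Dropout is only active in training mode (`model.train()`). It is disabled under `model.eval()`, so validation metrics are computed without it.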
@TanZhili Please let us know how many steps and which settings you used if you successfully train your Thai model.
@TanZhili Did you successfully train your Thai model?
This issue is stale because it has been open for 30 days with no activity.
I am training the model for a new language, Thai.
Finetune logs: Epoch 0: | | 5267/? [16:46:54<00:00, 0.09it/s, v_num=2, train/loss=7.090, train/top_5_accuracy=0.209, val/loss=7.640, val/top_5_accuracy=0.172]
Lora logs: Epoch 0: | | 1242/? [16:41:25<00:00, 0.02it/s, v_num=5, train/loss=9.500, train/top_5_accuracy=0.154, val/loss=10.60, val/top_5_accuracy=0.115]
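To put the 100k-step suggestion from the thread in perspective, the sketch below parses a progress-bar line like the ones above and estimates the wall-clock time at the observed throughput. The function names are hypothetical helpers, not part of the project. The estimate assumes throughput stays constant and that the progress-bar counter corresponds to the steps the 100k figure refers to, which may not hold with gradient accumulation.

```python
import re

FINETUNE_LINE = (
    "Epoch 0: | | 5267/? [16:46:54<00:00, 0.09it/s, v_num=2, "
    "train/loss=7.090, train/top_5_accuracy=0.209, "
    "val/loss=7.640, val/top_5_accuracy=0.172]"
)


def parse_progress_line(line: str) -> dict:
    """Extract step count, elapsed seconds, and metrics from a progress-bar line."""
    # Step counter and elapsed time, e.g. "5267/? [16:46:54<"
    head = re.search(r"(\d+)/\S+\s+\[(\d+):(\d+):(\d+)<", line)
    if head is None:
        raise ValueError("unrecognized progress line")
    steps, hours, minutes, seconds = (int(g) for g in head.groups())
    # key=value pairs such as "train/loss=7.090"
    metrics = {k: float(v) for k, v in re.findall(r"([\w/]+)=([-\d.]+)", line)}
    return {
        "steps": steps,
        "elapsed_s": hours * 3600 + minutes * 60 + seconds,
        "metrics": metrics,
    }


def hours_to_reach(target_steps: int, steps: int, elapsed_s: int) -> float:
    """Hours needed for target_steps, counted from step 0, at the observed rate."""
    return target_steps * elapsed_s / steps / 3600


result = parse_progress_line(FINETUNE_LINE)
eta_hours = hours_to_reach(100_000, result["steps"], result["elapsed_s"])
gap = result["metrics"]["val/loss"] - result["metrics"]["train/loss"]
print(f"steps={result['steps']}, hours to 100k steps={eta_hours:.0f}, val-train gap={gap:.2f}")
```

At the finetune run's rate this comes to roughly 13 days for 100k steps. A single train/validation gap does not show overfitting by itself. The signal to watch for across successive log lines is validation loss rising while training loss keeps falling.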