Training with unfreezed RoBERTa - Githubissues

macabdul9 / CASA-Dialogue-Act-Classifier

PyTorch implementation of the paper "Dialogue Act Classification with Context-Aware Self-Attention" for dialogue act classification with a generic dataset class and PyTorch-Lightning trainer

MIT License

44 stars, 13 forks

Training with unfreezed RoBERTa #8

Open macabdul9 opened 3 years ago

macabdul9 commented 3 years ago

Can someone train it with unfrozen RoBERTa and upload the checkpoint?

glicerico commented 3 years ago

I can try, could you please specify exactly what knobs to adjust for your purpose?

macabdul9 commented 3 years ago

Comment out lines 15 and 16 in UtteranceRNN.py and retrain it.

P.S. Training will now take longer, since RoBERTa's weights will also be updated.
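
For readers without the repository open: freezing an encoder in PyTorch usually means setting `requires_grad = False` on its parameters, and the two lines mentioned above presumably do something of that kind. That is an assumption, not a quote from UtteranceRNN.py. The sketch below uses a small `nn.Linear` as a stand-in for RoBERTa so it runs without downloading a model; `set_trainable` and `count_trainable` are hypothetical helpers, not part of the repository.

```python
import torch.nn as nn


def set_trainable(module: nn.Module, trainable: bool) -> None:
    """Enable or disable gradient updates for every parameter of `module`."""
    for param in module.parameters():
        param.requires_grad = trainable


def count_trainable(module: nn.Module) -> int:
    """Number of scalar parameters that the optimizer would update."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Stand-in for the RoBERTa encoder (assumption: the real model is loaded elsewhere).
encoder = nn.Linear(4, 3)  # 3*4 weights + 3 biases = 15 parameters

set_trainable(encoder, False)          # frozen: no parameter receives gradients
frozen_count = count_trainable(encoder)

set_trainable(encoder, True)           # unfrozen: every parameter is trained
unfrozen_count = count_trainable(encoder)
```

With the encoder unfrozen, every one of its parameters needs gradients and optimizer state, which is why training gets slower and uses more GPU memory.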

glicerico commented 3 years ago

Seems like I don't have enough RAM for this :(

macabdul9 commented 3 years ago

You were training it on CPU?

glicerico commented 3 years ago

No, on GPU... I should have said memory, not RAM :)

macabdul9 commented 3 years ago

In config.py the batch size is 64, which will cause a CUDA out-of-memory error with RoBERTa unfrozen. Try a smaller batch size if possible (8 should be fine, or you can go down to 4).
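
If the smaller batch turns out to affect training, one common workaround (not something suggested in this thread) is gradient accumulation: run several small batches before each optimizer step so that the effective batch size stays at 64. The helper below is a hypothetical sketch that only does the arithmetic.

```python
def accumulation_steps(target_batch: int, per_step_batch: int) -> int:
    """How many small batches to accumulate to match `target_batch`."""
    if per_step_batch <= 0 or target_batch % per_step_batch != 0:
        raise ValueError("target_batch must be a positive multiple of per_step_batch")
    return target_batch // per_step_batch


# Effective batch size of 64, as in the original config.py value.
steps_for_8 = accumulation_steps(64, 8)  # 64 / 8 = 8 batches per optimizer step
steps_for_4 = accumulation_steps(64, 4)  # 64 / 4 = 16 batches per optimizer step
```

In PyTorch Lightning, a value like this could be passed to the `Trainer` as `accumulate_grad_batches`. Whether the repository's trainer script exposes that option is not confirmed here.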

glicerico commented 3 years ago

I've started a run with batch size 16 :) Each epoch will take around 2 hours; I'll report back in a couple of days.

glicerico commented 3 years ago

The checkpoint with unfrozen RoBERTa achieved 77.6% validation accuracy! Here's a link to download it. I will probably delete it in a few days because it takes up a lot of storage in my Dropbox, so download it soon if you want it. Also, if someone can host the checkpoint in permanent storage, please share the link here :) https://www.dropbox.com/s/fkr9n4vgtkwj4j2/epoch%3D11-val_accuracy%3D0.776028.ckpt?dl=0