artitw / text2text

Text2Text: Crosslingual NLP/G toolkit
https://discord.gg/eHaaUuWpTc

Fine-Tuning process #9

Open ghost opened 4 years ago

ghost commented 4 years ago

Hi! I would like to know the process of fine-tuning UniLM with inverted SQuAD (hardware, training time, number of steps, parameters, etc.). Would that be possible? Thanks in advance!
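
For context, "inverted SQuAD" here presumably means flipping SQuAD's (context, question) → answer pairs into (context, answer) → question pairs for question generation. A minimal preprocessing sketch under that assumption follows; the file names, the `[SEP]` separator, and the output layout are illustrative, not taken from this repo:

```python
# Rough sketch of building "inverted SQuAD" pairs: SQuAD normally maps
# (context, question) -> answer; inverting it gives (context, answer) -> question
# pairs for question generation. File names, the [SEP] separator, and the
# output format below are assumptions for illustration only.
import json

with open("train-v1.1.json") as f:  # standard SQuAD v1.1 layout
    squad = json.load(f)

with open("inverted_squad.src", "w") as src, open("inverted_squad.tgt", "w") as tgt:
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"].replace("\n", " ")
            for qa in paragraph["qas"]:
                if not qa["answers"]:
                    continue
                answer = qa["answers"][0]["text"]
                # source: the context plus its answer span; target: the original question
                src.write(f"{context} [SEP] {answer}\n")
                tgt.write(qa["question"].strip() + "\n")
```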

thusithaC commented 4 years ago

Yes, I have the same question. The repo is extremely useful: it provides good-quality results and is easy to set up and use compared to some purely research-oriented GitHub repos.

However, this might be a naive question, but does this repo even include the code needed to train the .bin file? I would love to recreate this in other languages, so it would be extremely helpful if a retraining guide could be included in the README, with links to the source datasets.

artitw commented 3 years ago

@ugmSorcero please see fine-tuning parameters below

--max_seq_length 512 \
--max_position_embeddings 512 \
--mask_prob 0.7 \
--max_pred 48 \
--train_batch_size 32 \
--gradient_accumulation_steps 2 \
--learning_rate 0.00002 \
--warmup_proportion 0.1 \
--label_smoothing 0.1 \
--num_train_epochs 10
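
For anyone trying to reproduce this, the flags above could be plugged into the UniLM seq2seq fine-tuning script; a rough sketch of such a launch is below. The script path (run_seq2seq.py from the microsoft/unilm repo), the --bert_model choice, and the data/output paths are assumptions not confirmed in this thread — only the hyperparameter values come from the list above.

```python
# Rough sketch of launching a UniLM fine-tuning run with the flags quoted above.
# Script path, --bert_model, and data/output paths are assumptions; only the
# hyperparameter values are taken from this thread.
import subprocess

cmd = [
    "python", "biunilm/run_seq2seq.py",
    "--do_train",
    "--bert_model", "bert-large-cased",
    "--data_dir", "data/inverted_squad",
    "--src_file", "inverted_squad.src",
    "--tgt_file", "inverted_squad.tgt",
    "--output_dir", "output/unilm_inverted_squad",
    "--max_seq_length", "512",
    "--max_position_embeddings", "512",
    "--mask_prob", "0.7",
    "--max_pred", "48",
    "--train_batch_size", "32",
    "--gradient_accumulation_steps", "2",
    "--learning_rate", "0.00002",
    "--warmup_proportion", "0.1",
    "--label_smoothing", "0.1",
    "--num_train_epochs", "10",
]
subprocess.run(cmd, check=True)
```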
artitw commented 3 years ago

@thusithaC I will have to experiment and think about how best to incorporate training the model from scratch, and I'll get back to you on this. If you have any ideas about that, feel free to let us know.