neulab / awesome-align

A neural word aligner based on multilingual BERT
https://arxiv.org/abs/2101.08231
BSD 3-Clause "New" or "Revised" License

Training details #57

Closed CBHhD closed 1 year ago

CBHhD commented 1 year ago

Hello, I'm interested in your work and would like to know the training parameters for the deen dataset. The dataset seems too large for max_step 40000 to cover all of the deen training data: it has about 1,900,000 examples, which equals 237,550 steps if the total batch size is 8, as the scripts show. Could you share the exact parameters used for deen training, such as the batch size and the actual number of training steps? Thanks!

zdou0830 commented 1 year ago

Hi, I found that training the models for 40k steps is enough in most settings, and training longer can sometimes degrade performance. I think you can start with 40k steps and see whether the performance is good enough.
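
To put the 40k-step budget in perspective, here is a minimal sketch of the arithmetic using the figures quoted in the question. The dataset size is the approximate 1,900,000 mentioned above, not an exact count, and the helper names `steps_per_epoch` and `epoch_fraction` are made up for illustration; they are not part of awesome-align.

```python
import math


def steps_per_epoch(num_examples: int, total_batch_size: int) -> int:
    """Optimizer steps needed to see every example once (last partial batch included)."""
    return math.ceil(num_examples / total_batch_size)


def epoch_fraction(max_steps: int, num_examples: int, total_batch_size: int) -> float:
    """Fraction of one pass over the data covered by max_steps."""
    return max_steps * total_batch_size / num_examples


# Figures taken from the thread; the example count is approximate.
num_examples = 1_900_000
total_batch_size = 8
max_steps = 40_000

print(steps_per_epoch(num_examples, total_batch_size))  # 237500
print(f"{epoch_fraction(max_steps, num_examples, total_batch_size):.1%}")  # 16.8%
```

With these rounded inputs, one full pass takes 237,500 steps, close to the 237,550 quoted in the question, so 40k steps covers roughly a sixth of the data.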

CBHhD commented 1 year ago

Thank you very much!