zplovekq closed this issue 4 years ago
Checkpoint 20 was obtained from this training run and checkpoint 24 from this one. Both were obtained from parallel training on two GTX 1080 GPUs.
Sincere thanks for your reply! I have tried the hyper-parameters you suggested. I trained on 3 RTX 2080 Ti GPUs and got BLEU 33.9 with the score.sh in your repository. Could the gap come from using different GPUs? Thanks!
BLEU-4 33.9 after CE loss training or after SCST training?
For the CE loss :D
I trained for about 100 epochs, and fairseq automatically saved the checkpoint that was best on the valid subset. I used the generate.py in your repository to generate test-prediction.json and passed it to score.sh. Thanks!
You additionally need to fine-tune with SCST to get BLEU-4 scores up to 39.
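For readers unfamiliar with SCST (self-critical sequence training), here is a minimal sketch of the per-caption loss in plain Python. It is not the repository's implementation: the function name `scst_loss`, its arguments, and the numbers are hypothetical and only illustrate the idea that a sampled caption is rewarded relative to the greedy-decoded caption, using a sentence-level reward such as CIDEr.

```python
def scst_loss(sample_log_probs, sample_reward, greedy_reward):
    """Self-critical policy-gradient loss for one sampled caption.

    sample_log_probs: log-probabilities of the sampled tokens (hypothetical input)
    sample_reward:    reward of the sampled caption, e.g. its CIDEr score
    greedy_reward:    reward of the greedy-decoded caption, used as the baseline
    """
    # Advantage: how much better the sample is than the greedy baseline.
    advantage = sample_reward - greedy_reward
    # Minimizing this raises the log-probability of samples that beat the baseline.
    return -advantage * sum(sample_log_probs)


# Hypothetical numbers, for illustration only.
log_probs = [-0.5, -1.2, -0.3]
loss_better = scst_loss(log_probs, sample_reward=1.1, greedy_reward=0.9)
loss_equal = scst_loss(log_probs, sample_reward=0.9, greedy_reward=0.9)
print(loss_better, loss_equal)
```

In this sketch, a sampled caption that scores exactly the same as the greedy caption has zero advantage, so its loss is exactly zero.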
Hi. I ran SCST training with the train command from your README.md. I have tried a few times, and every time the training loop gets stuck at 41% and does not go any further. I don't know why. Am I doing something wrong? Thanks!
Hi, my loss is always -0 or +0. Could you please tell me why? Thank you!
I would sincerely like to know the hyper-parameters used for checkpoint 20 and checkpoint 24. Thanks!