Open nuts-kun opened 9 hours ago
I would be happy to share the curves with you, but since the model is NAR, the edges of the ground truth and the prediction do not match that closely: the training loss drops below 1 quickly and shows nothing special afterwards, so settling around 0.6 is pretty normal.
The most effective way to check whether training is progressing normally is to run inference on a sample, e.g. hearing something intelligible at 200K updates (maybe longer for E2; a bit more patience is all you need).
If you find anything helpful for monitoring training better, feel free to share it (e.g. adding a validation loss, which we haven't tried yet).
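As a minimal sketch of the validation-loss idea mentioned above: hold out a small set of samples and average the loss over them at each checkpoint. The `model_fn` callable and the MSE criterion here are stand-ins chosen for illustration, not the repo's actual training API or loss.

```python
def mse(pred, target):
    """Mean squared error over two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def validation_loss(model_fn, val_set):
    """Average loss over a held-out set.

    model_fn is a hypothetical stand-in for the model's forward
    pass; val_set is a list of (input, target) pairs.
    """
    losses = [mse(model_fn(x), y) for x, y in val_set]
    return sum(losses) / len(losses)
```

Logging this number alongside the training loss at each checkpoint would give a second signal beyond listening to inferred samples.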
Thank you very much for the quick response! Comparing the shared loss curve with the one from my model, it seems that the training is progressing smoothly :)
Thank you again!
Oh, one thing might matter. If your multilingual training set includes both ZH and JA, I'm not sure whether the model would get confused by identically shaped Hanzi and Kanji (since we just use raw characters and do not distinguish them by language). Just JA or just ZH is fine~
I took that into consideration when writing the preprocessing and dataset class, so I think it should be fine ;)
Hi, thank you for sharing this great work! We have started training this model on a multilingual dataset that includes Japanese, and would be happy to compare our loss curve with that of the model trained for the paper to see how learning progresses. Would it be possible for you to share the wandb report? Best regards.