josaphjosta opened 3 years ago
Hi, did you already try using common crawl embedding instead of wikipedia?
The Japanese Wikipedia embedding representation is not really meaningful; see https://github.com/facebookresearch/fastText/issues/710
Also try decreasing the epoch size to 250k/500k. If none of the above works, please check this paper; the authors improve EN-JP alignment precision by 30%.
Hope this helps; please correct me if I'm wrong.
I am using the provided wiki.ja.vec and wiki.en.vec, along with the provided dictionaries, but the word translation precision looks strange:
INFO - 05/11/21 17:49:31 - 0:07:19 - 1451 source words - nn - Precision at k = 1: 0.000000
INFO - 05/11/21 17:49:31 - 0:07:19 - 1451 source words - nn - Precision at k = 5: 0.000000
INFO - 05/11/21 17:49:31 - 0:07:19 - 1451 source words - nn - Precision at k = 10: 0.137836
More details are in train.log.
Please help.
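For context on what these numbers mean: "Precision at k = N" counts a source word as correct when its gold dictionary translation appears among the N nearest target-space neighbors of its mapped embedding. A minimal sketch of that metric with toy embeddings (names and the cosine nearest-neighbor retrieval here are my own illustration, not MUSE's actual implementation):

```python
import numpy as np

def precision_at_k(src_emb, tgt_emb, gold_pairs, k):
    """Fraction of source words whose gold translation appears
    among the k nearest target neighbors by cosine similarity."""
    # Normalize rows so a dot product equals cosine similarity.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sims = src @ tgt.T                      # (n_src, n_tgt) similarity matrix
    hits = 0
    for s, t in gold_pairs:
        top_k = np.argsort(-sims[s])[:k]    # indices of the k nearest targets
        hits += int(t in top_k)
    return hits / len(gold_pairs)

# Toy example: 3 source words whose embeddings are near-copies of
# their gold targets, so a well-aligned space scores 1.0 at k=1.
rng = np.random.default_rng(0)
tgt = rng.normal(size=(3, 8))
src = tgt + 0.01 * rng.normal(size=(3, 8))
pairs = [(0, 0), (1, 1), (2, 2)]
print(precision_at_k(src, tgt, pairs, k=1))
```

A precision of exactly 0.000000 at k = 1 and k = 5, as in the log above, means essentially no source word retrieves its gold translation nearby, which usually points to poorly aligned (or low-quality) source embeddings rather than a small tuning issue.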