Hi! Thanks for your interest in our work! There may be a slight misunderstanding here:
> Both directions are specified in the current parallel_ft script, but does this mean that both will be learned in a single training run? I read your paper, and I believe training is not done in both directions at once.
Actually, it is done in all directions (10 in total) at once! Please see this claim in the paper:

> We train the model in a many-to-many multilingual translation manner, ...
So you can reproduce the results by just running the script as it is!
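
For concreteness, here is a minimal sketch of what passing all ten directions to a single run could look like. The `pairs` variable name and the comma-separated xx-en / en-xx format are illustrative assumptions, not verbatim contents of the linked scripts.

```bash
# Hypothetical sketch: all ten directions (five language pairs, both ways)
# trained together in one many-to-many run.
# "pairs" and the pair naming below are assumptions for illustration.
pairs="de-en,cs-en,is-en,zh-en,ru-en,en-de,en-cs,en-is,en-zh,en-ru"
```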
Thank you for your quick reply! Okay, I will read the paper again.
Thank you for sharing your great work! I have a small question about the training script.
Both directions are specified in the current parallel_ft script, but does this mean that both will be learned in a single training run? I read your paper, and I believe training is not done in both directions at once.
https://github.com/fe1ixxu/ALMA/blob/a3cc7877752779346312bb07798172eadc83d692/runs/parallel_ft.sh#L2
https://github.com/fe1ixxu/ALMA/blob/a3cc7877752779346312bb07798172eadc83d692/runs/parallel_ft_lora.sh#L2
So, should we reproduce your results (en -> xx and xx -> en) by rewriting the following part?
direction: en -> xx
direction: xx -> en
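
(For illustration only, under the same assumed `pairs` format as in the sketch above, this proposed rewrite would amount to two separate runs; per the reply above, this split is unnecessary.)

```bash
# Hypothetical two-run split (not needed; one many-to-many run suffices):
# Run 1: en -> xx only
pairs="en-de,en-cs,en-is,en-zh,en-ru"
# Run 2: xx -> en only
pairs="de-en,cs-en,is-en,zh-en,ru-en"
```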