fe1ixxu / ALMA

State-of-the-art LLM-based translation models.

About how to specify pairs in `parallel_ft.sh` #12

Closed kyoto7250 closed 1 year ago

kyoto7250 commented 1 year ago

Thank you for sharing your great work! I have a small question about the training script.

Both directions are specified in the current `parallel_ft.sh` script; does this mean both directions are learned in a single training run? From my reading of the paper, I understood that training is not done in both directions at once.

https://github.com/fe1ixxu/ALMA/blob/a3cc7877752779346312bb07798172eadc83d692/runs/parallel_ft.sh#L2 https://github.com/fe1ixxu/ALMA/blob/a3cc7877752779346312bb07798172eadc83d692/runs/parallel_ft_lora.sh#L2

So, should we reproduce your results (en -> xx, xx -> en) by rewriting the following part?

direction: en -> xx

```bash
pairs=${2:-"en-de,en-cs,en-is,en-zh,en-ru"}
```

direction: xx -> en

```bash
pairs=${2:-"de-en,cs-en,is-en,zh-en,ru-en"}
```
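For context, `pairs=${2:-"..."}` is Bash default-parameter expansion: the pairs list comes from the script's second positional argument, falling back to the quoted default when no second argument is given. So rather than editing the script, a single direction set could in principle be passed at invocation time. A minimal sketch, assuming the script is run from the repo root and that the first positional argument is an output directory (the actual meaning of `$1` should be checked in the script itself):

```bash
# Override the default pairs via the 2nd positional argument.
# NOTE: "$1" (shown here as ./out) is assumed to be an output
# directory -- check the script for the real meaning of $1.

# en -> xx only:
bash runs/parallel_ft.sh ./out "en-de,en-cs,en-is,en-zh,en-ru"

# xx -> en only:
bash runs/parallel_ft.sh ./out "de-en,cs-en,is-en,zh-en,ru-en"
```
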
fe1ixxu commented 1 year ago

Hi! Thanks for your interest in our work! There might be a slight misunderstanding here:

> Both directions are specified in the current `parallel_ft.sh` script; does this mean both directions are learned in a single training run? From my reading of the paper, I understood that training is not done in both directions at once.

Actually, training is done in all directions (10 in total) at once! Please see the claim in the paper:

> We train the model in a many-to-many multilingual translation manner, .....

So you can reproduce the results by just running the script as it is!
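Concretely, since the default pairs list in the script already covers all 10 directions via the `${2:-...}` fallback, invoking it with no pairs override trains many-to-many over all of them. A sketch, assuming invocation from the repo root and that the script's other positional arguments also have defaults:

```bash
# No 2nd argument given, so ${2:-...} falls back to the default
# pairs list covering all 10 directions.
bash runs/parallel_ft.sh
```
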

kyoto7250 commented 1 year ago

Thank you for your quick reply! Okay, I will read the paper again.