0nutation / DUB

Code and pretrained models for "DUB: Discrete Unit Back-translation for Speech Translation" (ACL 2023 Findings)
https://export.arxiv.org/abs/2305.11411
MIT License

Hello, I just want to mention a tiny thing or two #2

Open RzanRaed opened 8 months ago

RzanRaed commented 8 months ago

In back_translate.sh, the line `bash ${SCRIPTS_ROOT}/prepare_de_monolingual.sh` does not work: the bash files are not in ${SCRIPTS_ROOT}, but in src.
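A minimal sketch of one possible fix, assuming the helper scripts live under ${ROOT}/src (ROOT follows the repository's other commands; adjust to your layout):

```bash
# Hypothetical fix: point SCRIPTS_ROOT at the directory that actually
# holds the helper scripts (src), so the existing call resolves correctly.
SCRIPTS_ROOT=${ROOT}/src
bash ${SCRIPTS_ROOT}/prepare_de_monolingual.sh
```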

Also in back_translate.sh:

```bash
for shard in $(seq -f "%2g" 0 ${shards_num}); do
```

`seq` is inclusive of its end value, so if shards_num is 3 (meaning shards 0, 1, 2), the loop will try to fetch shard 3, which doesn't exist. I replaced it with:

```bash
for shard in $(seq -f "%2g" 0 $((shards_num - 1))); do
```
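To see the off-by-one concretely, here is a self-contained snippet (shards_num=3 is only for illustration):

```bash
#!/usr/bin/env bash
# Demonstrates the off-by-one: seq's end value is inclusive, so with
# shards_num=3 the original loop also visits shard 3, even though only
# shards 0, 1, and 2 exist.
shards_num=3

echo "original loop:"
for shard in $(seq -f "%2g" 0 ${shards_num}); do
    echo "  shard ${shard}"   # prints 0, 1, 2, 3 (one shard too many)
done

echo "fixed loop:"
for shard in $(seq -f "%2g" 0 $((shards_num - 1))); do
    echo "  shard ${shard}"   # prints 0, 1, 2
done
```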

wu-wen-zhou commented 8 months ago

Have you been able to reproduce the results yet?

RzanRaed commented 8 months ago

Hello,

Yes, I have, though without the bimodal BART. One more thing I want to mention: in the README, you give the command below for forward translation with pseudo units:

```bash
bash ${ROOT}/src/back_translate.sh --task translate --src_lang en_units --tgt_lang ${LANGUAGE} --bt_strategy ${BT_STRATEGY}
```

From reading run_translate.sh, it looks like that script is the one I should use after generating the synthetic data, for U2TT training with pseudo data, not back_translate.sh. Am I missing something? (I have already done forward training with the synthetic data using run_translate.sh; I just want to double-check.)
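A hedged sketch of the two-step workflow as the comment above describes it; the back_translate.sh flags are copied verbatim from the quoted README command, while the run_translate.sh arguments are not shown anywhere in this thread and are deliberately omitted:

```bash
# Step 1: generate synthetic (pseudo-unit) training data via back-translation.
# Flags taken from the README command quoted above.
bash ${ROOT}/src/back_translate.sh --task translate --src_lang en_units \
    --tgt_lang ${LANGUAGE} --bt_strategy ${BT_STRATEGY}

# Step 2 (the commenter's reading): train the U2TT model on the synthetic
# data with run_translate.sh. Its arguments are not specified in this
# thread, so they are left out; check the script for the expected options.
bash ${ROOT}/src/run_translate.sh
```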

wu-wen-zhou commented 8 months ago

I have encountered this problem and would like to ask how to solve it.