Closed LeeSureman closed 4 years ago
@alantian Can you answer our colleague's questions regarding the joint training part?
@alantian And I noticed I should run run_joint_prep.sh first. But I don't know where en_wiki_text_lower.txt, europarl-v7.fr-en.en.tknzd.lower, and fr_wiki_text_lower.txt are.
And I wonder why there is a Merge in your code. From https://stackoverflow.com/questions/56315726/cannot-import-name-merge-from-keras-layers, I see that Keras 2.x no longer supports Merge.
@LeeSureman Indeed Merge can be removed. Wait a bit for @alantian to upload the files though. Thanks.
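For anyone else hitting this import error: in Keras 2.x the old `Merge` layer was dropped in favor of functional-API layers such as `Concatenate` and `Add`. A minimal sketch of the replacement pattern (the variable names `lang0_emb`/`lang1_emb` are hypothetical, not from this repo; the numpy part just illustrates that the underlying operation is plain tensor concatenation):

```python
import numpy as np

# Old Keras 1.x style (no longer importable in Keras 2.x):
#   merged = Merge([lang0_emb, lang1_emb], mode='concat')
# Keras 2.x functional-API equivalent:
#   merged = keras.layers.concatenate([lang0_emb, lang1_emb], axis=-1)

# The merge itself is just concatenation along the feature axis:
lang0_emb = np.ones((2, 3))   # stand-in for a batch of lang0 embeddings
lang1_emb = np.zeros((2, 3))  # stand-in for a batch of lang1 embeddings
merged = np.concatenate([lang0_emb, lang1_emb], axis=-1)
print(merged.shape)  # (2, 6)
```

So removing `Merge` usually means rewriting each call site with the corresponding functional layer rather than deleting the merge step itself.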
Has he finished uploading the files? How can I get them?
He will update the readme after uploading the files, though he is a bit busy these days. Are you pressed by any deadlines?
Yes. For now I drop the mono and multi word-embedding losses from your paper and use only the MT loss, but I can't find the following four files:
--lang0_emb_file withctx.en-es.en.50.1.txt \
--lang1_emb_file withctx.en-es.es.50.1.txt \
--lang0_ctxemb_file withctx.en-es.en.50.1.txt.ctx \
--lang1_ctxemb_file withctx.en-es.es.50.1.txt.ctx \
We want to follow your work.
Hey, @LeeSureman @muhaochen
I would like to let you know that the necessary files have been uploaded to our Google Drive folder. Note that all gzipped files need to be decompressed after downloading --- for example, on a Linux machine this can be done by running gzip -d *.gz.
Furthermore, you are expected to execute run_joint_prep.sh before run_joint_train.sh. The files you've mentioned (like en_mono, fr_mono, en_multi, fr_multi) are the products of the first step.
README.md has also been updated accordingly to reflect these changes.
@alantian Thanks man!
We need the code for the experiments in Section 4.2, please --- the code for training on a monolingual dataset and testing on a cross-lingual dataset by aligning words.
Nice work, and thank you for your contribution to NLP. However, I ran into some problems when trying to reproduce your work. I assume joint/scripts/run_joint_prep.sh is the script to train your model, and I wonder:
Where can I get the withctx files? I only found the withctx files for en and fr in https://github.com/swj0419/bilingual_dict_embeddings, not the others.
What are the en_mono, fr_mono, en_multi, and fr_multi files in run_joint_train.sh? I cannot find files with those names in https://drive.google.com/drive/u/0/folders/1Lm6Q5BxeU0ByR6DZcNfbWpntumiIKhYN