Closed: ZhishenYang closed this issue 9 months ago.
Hi Devaansh,
Thank you for open-sourcing the code. I am using the provided pre-trained model to replicate the experimental results on the DE-ES pair of the WIT dataset.
However, the model outputs only language-code IDs. Could you point out what I might be doing wrong? Thank you.
Command used:
python3 src/main.py \
  --num_gpus 1 \
  --mn wit_inference \
  --ds wit \
  --src_lang de \
  --tgt_lang es \
  --prefix_length 10 \
  --bs 1 \
  --test_ds test \
  --stage translate \
  --test \
  --lm model_best_test.pth
Source sentence: Joaquín Sabina (2007)
After tokenization: {'input_ids': tensor([[250003, 2177, 74688, 19, 35477, 76, 97666, 2]], device='cuda:0'), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1]], device='cuda:0'), 'labels': tensor([[250005, 2177, 74688, 19, 35477, 76, 22, 21, 8002, 399, 146, 78374, 8, 8884, 22, 25499, 2]], device='cuda:0'), 'mask_decoder_input_ids': tensor([[True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True]], device='cuda:0')}
Model outputs tensor([[ 2, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 250005, 2]], device='cuda:0')
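For reference, the repeated ID 250005 appears to be the mBART-50 target-language code token es_XX (250003 being the source code de_DE), so the decoder seems to emit only the forced target-language token and nothing else. Below is a minimal sketch, assuming the underlying model is an mBART-50 checkpoint loaded through Hugging Face transformers (the checkpoint name and language codes here are assumptions for illustration, not taken from this repo), of how tokenization and generation with a forced BOS token normally look:

```python
# Minimal sketch of plain mBART-50 DE->ES translation with Hugging Face transformers.
# The checkpoint name and language codes are assumptions for illustration only;
# the repo's own model, tokenizer, and config may differ.
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_name = "facebook/mbart-large-50-many-to-many-mmt"  # assumed checkpoint
tokenizer = MBart50TokenizerFast.from_pretrained(model_name)
model = MBartForConditionalGeneration.from_pretrained(model_name)

tokenizer.src_lang = "de_DE"                       # German source
inputs = tokenizer("Joaquín Sabina (2007)", return_tensors="pt")
print(inputs["input_ids"])                         # starts with the de_DE code id (250003)

# Force the decoder to start with the Spanish language code, then decode.
es_id = tokenizer.convert_tokens_to_ids("es_XX")   # 250005 in the mBART-50 vocab
generated = model.generate(**inputs, forced_bos_token_id=es_id, max_length=64)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```

If generation works in isolation like this but the repo's inference script still produces only the language-code token, the problem is more likely in the environment or checkpoint loading than in the data pipeline.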
Resolved: it turned out I had installed the wrong version of the transformers library.
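In case anyone hits the same symptom, a quick way to check which transformers version is installed is shown below; compare it against the version pinned in this repo's requirements (not reproduced here, so the expected version is not stated in this sketch):

```python
# Print the installed transformers version; compare against the repo's pinned requirement.
import transformers
print(transformers.__version__)
```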