I'm not able to reproduce this issue. I just reran the inference colab notebook and tested with the latest version of fairseq to check whether that is breaking something, but everything seems to be working fine.
Can you help us by either providing the inputs that you think are causing the error in en-hi translation or sharing a colab notebook that reproduces the error?
I am not sure why the code needs 16 GB of GPU memory to translate 18k sentences and crashes due to GPU OOM; it looks super inefficient for some reason. NB: I didn't call model.eval() explicitly because I assumed that is already being done.
It almost hits GPU capacity even for 4k sentences. Also, a lot of the translations come out as just "\n".
Ref -> https://www.kaggle.com/adityaecdrid/translate-them-to-tamil-language-external-data. (If you remove the sample of 2**10 that I am passing while reading the csv, you should see the same.)
Secondly, it would be nice if the results were written as they are computed, rather than in bulk as a one-shot activity.
If your code crashes due to OOM, the output file will be empty (look into the log file).
@AdityaSoni19031997 It looks like you are initializing the model multiple times (you are using both the command-line interface and the Python interface to load the models).
# load the translation model from that directory
from indicTrans.inference.engine import Model  # because of this import, we have to do cd...
en2indic_model = Model(expdir='/kaggle/working/en-indic')
en2indic_model
^ This is where you are using the Python interface. With the Python interface, you can load the model onto the GPU once and do batch translation or paragraph translation (see the attached picture below or the colab notebook here).
Secondly, it would be nice if the results were written as they are computed, rather than in bulk as a one-shot activity.
The batch_translate or paragraph_translate methods can help with this, since you translate one batch/paragraph at a time and store the results as you go.
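As a rough sketch of that chunked workflow (the Model constructor and the batch_translate call follow the colab notebook's usage; the chunking loop, file names, and chunk size below are illustrative assumptions, not part of the repo):

# Sketch: translate in chunks and append results to disk as each chunk finishes,
# so a crash late in the run does not lose the earlier translations.
from indicTrans.inference.engine import Model

en2indic_model = Model(expdir='/kaggle/working/en-indic')

sentences = open('en_sentences.txt', encoding='utf-8').read().splitlines()
chunk_size = 64  # tune down if you hit GPU OOM

with open('ta_outputs.txt', 'w', encoding='utf-8') as out:
    for i in range(0, len(sentences), chunk_size):
        chunk = sentences[i:i + chunk_size]
        translations = en2indic_model.batch_translate(chunk, 'en', 'ta')
        out.write('\n'.join(translations) + '\n')
        out.flush()  # earlier results stay on disk even if a later chunk fails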
! ./joint_translate.sh en_paragraphs.txt ta_paragraphs.txt "en" "ta" '../en-indic'
^ Here, you are using the command-line interface, which loads the model again onto the GPU, translates the text file in bulk, and then offloads the model.
NB: I didn't call model.eval() explicitly because I assumed that is already being done.
Yes, this is automatically handled in fairseq-interactive's prepare_model_for_inference method (this function calls make_generation_fast_, which sets the model to eval mode).
In both our interfaces, we internally use fairseq-interactive (our command-line interface calls fairseq-interactive directly, and the Python interface provides a wrapper around it).
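If you want to confirm this yourself, eval mode on any PyTorch module is visible through its training flag; a minimal generic sketch (how, or whether, the IndicTrans Model wrapper exposes its underlying fairseq models is an assumption, so this only illustrates the check itself):

import torch

# A module prepared for inference reports training == False.
def assert_eval_mode(module: torch.nn.Module) -> None:
    if module.training:
        raise RuntimeError("module is still in train mode; call module.eval()")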
Even when we don't init the model twice, the GPU consumption is quite high. Feel free to fork and take a look! Thanks for the snips. Maybe I will lower the batch size, enable fp16, and see how it goes.
Will debug a bit more and get back! Thanks for the pointers.
FYR @gowtham1997 (below are the stats when we use the joint_translate shell snippet as-is).
Even when we don't init the model twice, the GPU consumption is quite high. Feel free to fork and take a look! Thanks for the snips. Maybe I will lower the batch size, enable fp16, and see how it goes.
The model is a 434M-parameter model (4 times the size of the base transformer model), so I think the high GPU consumption is expected if you are running it on a 16 GB GPU in non-fp16 mode with batch sizes >= 64.
Please tune both batch_size and buffer_size in this line before running joint_translate to see if that helps.
Do let us know if you find something else causing high GPU consumption that we missed optimizing.
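As a rough sense of scale for the 434M-parameter figure above, the weights alone already take on the order of 1-2 GB, and beam-search activations come on top of that and grow with the batch size; a quick back-of-the-envelope check:

# Weights-only memory for a 434M-parameter model; decoding activations scale further
# with batch_size * beam * sequence length, which is why large batches push a 16 GB GPU.
params = 434e6
print(f"fp32 weights: {params * 4 / 1e9:.2f} GB")  # ~1.74 GB
print(f"fp16 weights: {params * 2 / 1e9:.2f} GB")  # ~0.87 GB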
@gowtham1997 - it was an issue with our terminal language setting. After a lot of debugging, we figured out that this simple command helped us: export PYTHONIOENCODING=UTF-8
since it was just a printing issue. Thanks for your help. I am not sure about the other discussion going on, but I think it's not relevant to our issue. If it's okay with you, I will close this issue?
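For anyone hitting the same symptom, a quick way to check whether the interpreter's I/O encoding is the culprit before rerunning the scripts (a minimal sketch; the link between a non-UTF-8 value here and the empty output comes from this thread, not from any documented guarantee):

import sys

# PYTHONIOENCODING overrides these values; if they are not UTF-8, printing Devanagari
# text (as fairseq_cli/interactive.py does) can raise UnicodeEncodeError and leave the
# output file empty.
print("stdout encoding:", sys.stdout.encoding)
print("stderr encoding:", sys.stderr.encoding)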
Sure. Thanks for the update. You can close this issue.
@AdityaSoni19031997, please open a separate issue if you find something w.r.t. GPU utilization that we are missing (I still think the high GPU utilization is due to the model size and the high batch size).
Hi, I am trying to follow indictrans_fairseq_inference.ipynb for inference using the pretrained models for English to Hindi, but the generated output file is empty. On running the command bash joint_translate.sh en_sentences.txt hi_outputs.txt 'en' 'hi' '../en-indic', the following logs show up. However, when I look at hi_outputs.txt.log,
I get this error, and if I comment out the print statements on line 283 and line 285 in fairseq/fairseq_cli/interactive.py, it does not show any error, but the output file still comes out empty.
Logs if I comment out the print statements:
In this case, consolidated_testoutput in postprocess_translate.py is: ['', '', '', '']
I am unable to understand why the output is an empty file or how to use the model for inference.