How to make the model load only once?

ikergarcia1996 / Easy-Translate

Easy-Translate is a script for translating large text files with a SINGLE COMMAND. Easy-Translate is designed to be as easy as possible for beginners and as seamless and customizable as possible for advanced users.

Apache License 2.0

189 stars 306 forks

How to make the model load only once? #12

Open lattemj opened 12 months ago

lattemj commented 12 months ago

Can the model be loaded only once instead of waiting for the load to complete each time?

ikergarcia1996 commented 12 months ago

Hi @lattemj. If you want to translate all the files in a directory, use the --sentences_dir flag instead of --sentences_path. You will need to download the most recent version of the code, as I implemented this argument today.

# We use --files_extension txt to translate only files with this extension.
# Use an empty string to translate all files in the directory.

python3 translate.py \
--sentences_dir sample_text/ \
--output_path sample_text/translations \
--files_extension txt \
--source_lang en \
--target_lang es \
--model_name facebook/m2m100_1.2B

Is this what you are trying to do?

twicer-is-coder commented 6 months ago

Any update on this? He is asking to keep the model loaded in memory so that it does not have to be reloaded for every inference, as loading is time-consuming.

ikergarcia1996 commented 6 months ago

@twicer-is-coder the only solution is to put all your data in one or more files and make a single call to the code. If you want to run the code as an API, you can use libraries built for that purpose, such as vLLM (https://github.com/vllm-project/vllm) or TGI (https://huggingface.co/docs/text-generation-inference/index).
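
For readers wondering what "keeping the model loaded" looks like inside a long-lived Python process, here is a minimal sketch of the general load-once-and-reuse pattern. It is not part of Easy-Translate and does not use translate.py. The names `build_m2m100_translator`, `TranslatorCache`, and `fake_builder` are hypothetical and introduced only for illustration. The real loader assumes the Hugging Face transformers M2M100 classes (plus sentencepiece and PyTorch) are installed; the demo at the bottom uses a stand-in builder, so it runs without downloading any model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type alias: a function that translates a batch of sentences.
TranslateFn = Callable[[List[str]], List[str]]


def build_m2m100_translator(model_name: str, source_lang: str, target_lang: str) -> TranslateFn:
    """Load an M2M100 model once and return a closure that reuses it.

    Assumption: transformers, sentencepiece and PyTorch are installed.
    """
    # Imported lazily so the rest of this sketch runs without transformers.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_name)
    model = M2M100ForConditionalGeneration.from_pretrained(model_name)
    tokenizer.src_lang = source_lang

    def translate(sentences: List[str]) -> List[str]:
        encoded = tokenizer(sentences, return_tensors="pt", padding=True)
        generated = model.generate(
            **encoded, forced_bos_token_id=tokenizer.get_lang_id(target_lang)
        )
        return tokenizer.batch_decode(generated, skip_special_tokens=True)

    return translate


class TranslatorCache:
    """Builds each (model, source, target) translator at most once per process."""

    def __init__(self, builder: Callable[[str, str, str], TranslateFn]) -> None:
        self._builder = builder
        self._cache: Dict[Tuple[str, str, str], TranslateFn] = {}
        self.load_count = 0  # how many times the (expensive) builder actually ran

    def get(self, model_name: str, source_lang: str, target_lang: str) -> TranslateFn:
        key = (model_name, source_lang, target_lang)
        if key not in self._cache:
            self._cache[key] = self._builder(*key)
            self.load_count += 1
        return self._cache[key]


def fake_builder(model_name: str, source_lang: str, target_lang: str) -> TranslateFn:
    """Stand-in for the real loader so the demo needs no model download."""

    def translate(sentences: List[str]) -> List[str]:
        return [f"[{target_lang}] {s}" for s in sentences]

    return translate


# Demo: two requests for the same model and language pair trigger one load.
demo_cache = TranslatorCache(fake_builder)
first = demo_cache.get("facebook/m2m100_1.2B", "en", "es")(["Hello"])
second = demo_cache.get("facebook/m2m100_1.2B", "en", "es")(["Goodbye"])
print(demo_cache.load_count, first, second)
```

To try it with a real model, pass `build_m2m100_translator` to `TranslatorCache` instead of `fake_builder`. The cache only lives as long as the Python process, which is why a long-running server is needed if requests arrive over time. Note that this simple sketch keys the cache on the language pair as well, so a different pair would load the same model again.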