UKPLab / EasyNMT


GPU memory does not get released with `max_loaded_models` #92

Open Quesstor opened 1 year ago

Quesstor commented 1 year ago

When I run the example code below and monitor the GPU with `watch -n .3 nvidia-smi`, memory usage keeps increasing and is never released.

Did I miss something here?

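import gc
import time

import torch
from easynmt import EasyNMT

# Keep at most one translation model loaded at a time
model = EasyNMT("opus-mt", max_loaded_models=1)

# Each language pair uses a different opus-mt model
model.translate("Hallo, das ist ein Satz.", target_lang="en", source_lang="de")
model.translate("Hallo, das ist ein Satz.", target_lang="fr", source_lang="de")

# Try to force the release of unused GPU memory
time.sleep(3)
gc.collect()
torch.cuda.empty_cache()
time.sleep(3)

model.translate("Hallo, das ist ein Satz.", target_lang="nl", source_lang="de")
model.translate("Hallo, das ist ein Satz.", target_lang="it", source_lang="de")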
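
To put numbers on the growth instead of eyeballing the `watch` output, something like the following could be called before and after each `translate` call. It is only a rough sketch: the helper names `parse_memory_used` and `gpu_memory_used_mib` are made up for illustration and are not part of EasyNMT, and it assumes `nvidia-smi` is on the `PATH` (it returns an empty list otherwise).

```python
import shutil
import subprocess
from typing import List


def parse_memory_used(output: str) -> List[int]:
    """Parse nvidia-smi CSV output: one used-memory value (MiB) per line.

    Lines that are not plain integers (for example "[N/A]") are skipped.
    """
    values = []
    for line in output.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values


def gpu_memory_used_mib() -> List[int]:
    """Return used memory in MiB for each GPU reported by nvidia-smi.

    Returns an empty list if nvidia-smi is missing or the query fails.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.used",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return []
    return parse_memory_used(result.stdout)
```

Printing `gpu_memory_used_mib()` after each of the four `translate` calls, and again after `torch.cuda.empty_cache()`, would show whether usage grows with every newly loaded language pair or drops back once the previous model should have been unloaded.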