Open ollayf opened 1 year ago
Hi,
See here: you may try passing a backend to your `phonemize_word` function. When creating a new espeak phonemization instance, the code actually copies the espeak shared library into a temp directory (that's the 5 MB). Normally the directory is deleted at exit or when the instance is garbage collected (see here; this is a bit complex to handle across Linux/Mac/Windows).
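To illustrate the "deleted at exit or when garbage collected" part, here is a minimal sketch of that general pattern using only the standard library. It is not phonemizer's actual code: `LibraryCopy` and the `libcopy_` prefix are hypothetical names made up for this example.

```python
import os
import shutil
import tempfile
import weakref


class LibraryCopy:
    """Hypothetical stand-in for an object that owns a private temp copy of a file."""

    def __init__(self, source_path):
        # Each instance gets its own temp directory holding its own copy.
        self.tempdir = tempfile.mkdtemp(prefix="libcopy_")
        self.path = shutil.copy(source_path, self.tempdir)
        # Remove the directory when this object is garbage collected,
        # or at interpreter exit if it is still alive by then.
        self._finalizer = weakref.finalize(
            self, shutil.rmtree, self.tempdir, ignore_errors=True
        )
```

With a pattern like this, every new instance costs one more copy until it is collected, which is why creating an instance per call can add up.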
Same problem here, it leaks memory on every call:
import phonemizer
phonemes = phonemizer.phonemize({orig_text_wo_stress}, language="en")
Don't instantiate one phonemizer backend per call: https://github.com/bootphon/phonemizer/blob/d9f9ed266aa5cc2dd9e5eaea2c9571ab5024893c/phonemizer/phonemize.py#L206
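A rough sketch of what reusing a single backend could look like, rather than building a new one on every call. `get_backend` is a helper name made up here, the `"en-us"` language code is an assumption, and the body of `phonemize_word` is a guess, since the original function is only visible in a screenshot.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_backend(language="en-us"):
    # Imported lazily so the module loads even without phonemizer installed.
    from phonemizer.backend import EspeakBackend

    # Built once per language, then served from the cache.
    return EspeakBackend(language)


def phonemize_word(word, backend=None):
    # Fall back to the shared cached backend when none is passed in.
    if backend is None:
        backend = get_backend()
    # The backend takes a list of strings and returns a list of strings.
    return backend.phonemize([word], strip=True)[0]
```

Calling `phonemize_word("hello")` repeatedly should then go through the same backend instance each time.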
Describe the bug
There is a memory leak: each call to the phonemize function takes up at least 5 MB for me. I have tried many things, but to no avail. This is how I use it in my Python code:
![Screenshot from 2023-07-03 00-28-53](https://github.com/bootphon/phonemizer/assets/62301945/7f59fb69-f03a-44c5-8208-63069b09f28f)
Phonemizer version
![Screenshot from 2023-07-03 00-29-38](https://github.com/bootphon/phonemizer/assets/62301945/2a367ebe-5840-41ae-8144-f4f932c8dff3)
System
Ubuntu 20.04 LTS, Python 3.8
To reproduce
![Screenshot from 2023-07-03 00-29-38](https://github.com/bootphon/phonemizer/assets/62301945/026caa0b-e69a-404e-8aff-a84fb9c8ea41)
Expected behavior
Every time the function ends, the memory should be garbage collected and released back to the OS. Instead, every run permanently takes up 5 MB. I see this 5 MB both in `htop` and when I use `psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2` in the Python code.
Additional context
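For anyone trying to reproduce the measurement, here is a small sketch that samples the resident set size around repeated calls. It uses the same `psutil` expression as above; `rss_mb` and `rss_growth` are helper names made up for this example.

```python
import os


def rss_mb():
    # psutil is a third-party package, imported lazily here.
    import psutil

    return psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2


def rss_growth(fn, calls=20, sample=rss_mb):
    """Call fn repeatedly and return the change in memory after each call, in MB."""
    readings = [sample()]
    for _ in range(calls):
        fn()
        readings.append(sample())
    # Difference between each pair of consecutive readings.
    return [after - before for before, after in zip(readings, readings[1:])]
```

Something like `rss_growth(lambda: phonemizer.phonemize("hello", language="en"))` would then show whether each call adds roughly the same amount, as described above.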