Closed: Barney241 closed this 3 months ago.
I think an infinite loop like that would produce these results: it won't keep running indefinitely without issues, because of gradual state build-up.
Though I'm no expert in the performance details.
@Anush008 this is just simplified code to reproduce the issue. I modified it so it's scoped, which in theory should replicate API behavior: you receive a request, process it, and then another one comes in. (See the sketch below for what I mean by scoped.)
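For clarity, "scoped" here presumably means something like the following sketch, where each simulated request is handled in its own function scope so its allocations are dropped before the next request arrives. `handle_request` is a hypothetical name for illustration, not taken from the original code:

```rust
use fastembed::TextEmbedding;

// Hypothetical per-request handler: the embeddings and any per-batch
// temporaries go out of scope when this function returns, so in theory
// memory should not accumulate across requests.
fn handle_request(model: &TextEmbedding, texts: Vec<String>) -> usize {
    let embeddings = model.embed(texts, None).expect("embed failed");
    embeddings.len()
}
```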
Got it. Maybe @decahedron1 has suggestions.
Closing since stale.
Hi, I am experiencing high memory usage, which caused my pod to be killed for exceeding its limits. After some experimenting, I found it is related to the length of the texts I am trying to embed.
I know this is probably not a problem in fastembed-rs itself but rather in the ort or tokenizers libraries; I just can't precisely pinpoint the problematic part.
When using the example provided below, the program starts at about 1.5 GB of RAM usage, which is expected for this model, but after some iterations the process uses 13 GB of virtual memory and 6 GB of RAM, which is a lot.
I tried other, smaller models as well, and it happens with them too, just at a smaller scale. My intuition was that my texts were simply too long, so I truncated them all to at most 100 characters. That slowed the growth in RAM usage, but unless I embedded a constant batch of texts, it still kept growing until the program was eventually killed.
For more context: I am building a vector API that returns embeddings for texts, so the model is loaded once at startup.
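The original repro snippet isn't included in this excerpt, but based on the description, it amounts to a long-running loop that embeds batches of varying text length with a model loaded once up front. Here is a minimal sketch under those assumptions; the model variant and the builder-style `InitOptions` are stand-ins from recent fastembed versions, not the exact code from the report:

```rust
use fastembed::{EmbeddingModel, InitOptions, TextEmbedding};

fn main() {
    // Load the model once at startup, as the vector API does.
    // AllMiniLML6V2 is a stand-in; the report used a larger model.
    let model = TextEmbedding::try_new(InitOptions::new(EmbeddingModel::AllMiniLML6V2))
        .expect("failed to load model");

    // Simulate a stream of requests whose texts vary in length between
    // iterations; per the report, non-constant batches are what make
    // resident memory keep climbing.
    for i in 0usize.. {
        let texts: Vec<String> = (0..32)
            .map(|j| "lorem ipsum dolor sit amet ".repeat(1 + (i + j) % 50))
            .collect();
        let embeddings = model.embed(texts, None).expect("embed failed");
        println!("iteration {i}: embedded {} texts", embeddings.len());
    }
}
```

With a loop like this, watching RSS and virtual memory (e.g. via `top`) should show the growth pattern described above, whereas replacing the varying batch with a fixed one should keep usage roughly flat.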