mihainiculai opened this issue 1 month ago
Hey @mihainiculai, thanks for submitting this bug. I will check it.
From my use of LLM Guard's Anonymize scanner, I can confirm the behavior described above. For longer inputs, very large amounts of memory are allocated. I checked the source code, but there seems to be no way to influence this behavior via a parameter to the Anonymize constructor.

I do wonder, though, whether the dict key chunk_size in the model config in ner_mapping.py is respected when the models are run. I would expect a fixed chunk size of 600, as set in the config, not to lead to such extensive memory usage. But maybe that assumption is wrong and memory still scales with the original input's size.
@asofter Can we assist in any way?
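In the meantime, here is a rough workaround sketch we are considering on our side. It assumes memory scales with the length of the text handed to the NER pipeline, and it simply splits the input into fixed-size slices before scanning. The helper name and the chunk size of 600 are our own choices, not anything provided by the library, and naive splitting can cut an entity in half at a chunk boundary, so this is only a stopgap, not a fix for the underlying memory behavior:

```python
from llm_guard import scan_prompt
from llm_guard.input_scanners import Anonymize
from llm_guard.vault import Vault

CHUNK_SIZE = 600  # mirrors the chunk_size value we saw in ner_mapping.py

vault = Vault()
scanner = Anonymize(vault)


def anonymize_in_chunks(text: str, chunk_size: int = CHUNK_SIZE) -> str:
    """Scan the input in fixed-size slices and stitch the sanitized pieces back together.

    Entities that straddle a chunk boundary may be missed, so treat this as a
    workaround only.
    """
    sanitized_parts = []
    for start in range(0, len(text), chunk_size):
        chunk = text[start:start + chunk_size]
        sanitized, _, _ = scan_prompt([scanner], chunk)
        sanitized_parts.append(sanitized)
    return "".join(sanitized_parts)
```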
I am using only the Anonymize scanner with the LLM Guard API, and I noticed a significant increase in memory (RAM) usage when processing larger inputs, with the memory usage never decreasing afterward. For example, when processing small inputs, memory consumption is around 2 GB. However, when I pass inputs of around 4-5k characters, memory usage increases to 4-5 GB and stays at that level even after processing is complete. If I input something excessive, like 15k characters, memory usage spikes to 240 GB (likely starting to swap to disk at that point).
I experience this behavior with all default settings, except for removing some scanners from the scanners.yaml file. Is this expected behavior, or is there an issue with memory management when using the Anonymize scanner?
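For reference, a minimal sketch of how I observe this outside the API, calling the library directly. This assumes the standard scan_prompt entry point and uses psutil to read the process RSS purely for illustration; the prompt text and sizes below are just placeholders:

```python
import os

import psutil  # assumed available; any RSS-measuring tool would do

from llm_guard import scan_prompt
from llm_guard.input_scanners import Anonymize
from llm_guard.vault import Vault


def rss_gb() -> float:
    """Resident set size of the current process, in GB."""
    return psutil.Process(os.getpid()).memory_info().rss / 1024**3


vault = Vault()
scanner = Anonymize(vault)

# Grow the input and watch resident memory before/after each scan.
for n_chars in (500, 5_000, 15_000):
    prompt = ("John Doe lives in Berlin and his email is john@example.com. " * 300)[:n_chars]
    before = rss_gb()
    sanitized, valid, score = scan_prompt([scanner], prompt)
    print(f"{n_chars:>6} chars: RSS {before:.2f} GB -> {rss_gb():.2f} GB")
```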