System Info
Information
Reproduction
Description
I've chunked my document so that each chunk is no more than 500 words, for a total of ~300 chunks. I then loop over the chunks and embed each one with the default Embed4All() embedder. However, about 30 chunks in, I get a Python segmentation fault (which I understand is likely due to running out of memory). I can see my RAM usage increase throughout this process. It seems to me that the C/C++ code running under the hood is holding onto memory even after each embedding completes. I have run the exact same process using OpenAI without error, so I don't think the resulting embeddings themselves are taking up the memory. I also know it's not a specific chunk, as I've started the embedding run from different points.
I've tried using device="gpu" with no difference, and I've also tried a different embedder model.
Snippet
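A minimal sketch of the loop (the chunk contents here are placeholders, and the RSS logging via the standard-library resource module is only there to make the memory growth visible per iteration):

```python
import resource

from gpt4all import Embed4All

# Placeholder chunks: in the real run, each chunk is <= 500 words
# of the source document, ~300 chunks in total.
chunks = ["some chunk text"] * 300

embedder = Embed4All()  # default embedder model

embeddings = []
for i, chunk in enumerate(chunks):
    embeddings.append(embedder.embed(chunk))
    # Max resident set size grows steadily with each iteration.
    # (ru_maxrss is bytes on macOS, kilobytes on Linux.)
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print(f"chunk {i}: max RSS {rss}")
# Crashes with a segmentation fault around chunk 30.
```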
Error
18392 segmentation fault
Apple error report
Expected behavior
The embedder should be able to run many times within a single program without crashing.