Segmentation fault when calling embeddings.index

neuml / txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

https://neuml.github.io/txtai

Apache License 2.0

Segmentation fault when calling embeddings.index #744

Open mehabakshii opened 2 weeks ago

mehabakshii commented 2 weeks ago

Hi there! I'm on an M1 Mac and am trying to get up and running with txtai, starting with a simple script:

import txtai

embeddings = txtai.Embeddings()
embeddings.index(["Test", "Text"])
print(embeddings.search("positive", 1))

The code appears to hit a seg fault at the index step, failing with: segmentation fault python3 script.py. Any idea what is going on here? I took a look at the common issues here: https://neuml.github.io/txtai/faq/#common-issues and followed the steps under "Segmentation faults and similar errors on macOS", but exporting those environment variables didn't work.

davidmezzetti commented 2 weeks ago

Hello, thank you for writing in.

This is a nagging issue on Macs that is usually solved by the FAQ solution you mentioned.

If you wouldn't mind, what happens if you change your code to this?

import faiss
faiss.omp_set_num_threads(1)

import txtai

embeddings = txtai.Embeddings()
embeddings.index(["Test", "Text"])
print(embeddings.search("positive", 1))

Do any of the options on this page work? https://github.com/facebookresearch/faiss/wiki/Threads-and-asynchronous-calls

mehabakshii commented 1 week ago

Thanks @davidmezzetti for the suggestion! Unfortunately, it didn't work.

(env) ➜  bond-concept python3 script.py
OMP: Error #15: Initializing libomp.dylib, but found libomp.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://openmp.llvm.org/
[1]    96204 abort      python3 script.py

The suggested workaround also did not work and still resulted in a segmentation fault. Are there other workarounds you could suggest?
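For reference, the workaround named in the OMP hint can also be applied from inside the script rather than the shell. The sketch below is a minimal illustration, not the FAQ's exact recipe: KMP_DUPLICATE_LIB_OK comes from the log above, OMP_NUM_THREADS is the standard OpenMP thread-count variable and is included as an assumption, and the helper names are hypothetical. The variables need to be set before any library that loads OpenMP is imported.

```python
import os

# Assumption: KMP_DUPLICATE_LIB_OK is taken from the OMP hint in the log;
# OMP_NUM_THREADS is the standard OpenMP variable, added here as a guess
# at what the FAQ workaround covers.
OPENMP_WORKAROUNDS = {
    "KMP_DUPLICATE_LIB_OK": "TRUE",
    "OMP_NUM_THREADS": "1",
}


def apply_openmp_workarounds(env=os.environ):
    """Set the workaround variables and return what was applied."""
    for name, value in OPENMP_WORKAROUNDS.items():
        env[name] = value
    return {name: env[name] for name in OPENMP_WORKAROUNDS}


def run_example():
    """Original script from this thread; txtai is imported only after the variables are set."""
    import txtai

    embeddings = txtai.Embeddings()
    embeddings.index(["Test", "Text"])
    print(embeddings.search("positive", 1))


# Apply the variables first; call run_example() afterwards.
applied = apply_openmp_workarounds()
print(applied)
```

Setting the variables in the same process before the imports rules out the case where the shell exports never reached the Python process, for example when the script is launched from an IDE or a different terminal session.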

davidmezzetti commented 1 week ago

Setting that parameter has always worked. This looks like the same error message as always, which is normally solved by that environment variable.

How about if you run this code?

import faiss
import numpy as np

data = np.random.rand(100, 100)

index = faiss.index_factory(data.shape[1], "IDMap,Flat")
index.add_with_ids(data, np.arange(data.shape[0], dtype=np.int64))
print("Count =", index.ntotal)

You can also try a different backend, such as embeddings = txtai.Embeddings(backend="hnsw"), which requires pip install hnswlib.

There are a number of upstream open issues related to this: https://github.com/kyamagu/faiss-wheels/issues

mehabakshii commented 1 week ago

Hmm, I'm getting a seg fault when running index = faiss.index_factory(data.shape[1], "IDMap,Flat"), and the same thing when using the hnsw backend. Do you recommend posting an issue on the faiss-wheels repo instead, since this doesn't seem to be specific to txtai?

davidmezzetti commented 1 week ago

If you're able to isolate that code with a simple install of faiss-cpu in a new virtual environment, then yes. But it looks like there are already a couple of issues that have been open for a while.
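One way to isolate the failing step is to run each candidate statement in its own interpreter and look at how the process exits. The sketch below is only an illustration under stated assumptions: the step list is adapted from the snippets in this thread, the helper name run_isolated is hypothetical, and it relies on the POSIX convention that a negative return code from subprocess means the child was killed by that signal. The -X faulthandler option asks Python to dump a traceback to stderr on a fatal signal.

```python
import signal
import subprocess
import sys

# Steps adapted from the snippets in this thread; each runs in a fresh interpreter.
STEPS = {
    "import faiss": "import faiss",
    "index_factory": "import faiss; faiss.index_factory(100, 'IDMap,Flat')",
    "import hnswlib": "import hnswlib",
}


def run_isolated(code, timeout=120):
    """Run code in a child interpreter and describe how it exited."""
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode == 0:
        status = "ok"
    elif result.returncode < 0:
        # On POSIX, a negative return code is the number of the terminating signal.
        try:
            status = signal.Signals(-result.returncode).name
        except ValueError:
            status = f"signal {-result.returncode}"
    else:
        # Ordinary failure, e.g. ImportError when the package is missing.
        status = f"exit code {result.returncode}"
    return status, result.stderr


for label, code in STEPS.items():
    status, stderr = run_isolated(code)
    print(f"{label}: {status}")
```

A step reported as SIGSEGV or SIGABRT in a clean virtual environment that contains only faiss-cpu would be a compact reproduction to attach to an upstream report, along with the captured stderr.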

I'm not sure why the known workaround that's worked in all previous cases doesn't work for your setup. It's the same issue that has been reported many times in the past.

mehabakshii commented 6 days ago

I also tried uninstalling faiss-cpu via pip uninstall faiss-cpu and reinstalling with conda install -c pytorch faiss-cpu=1.8.0, which is shown as the supported way to install, and I'm still getting the seg fault, sadly :(

Is there anything I can run (and please feel free to point me to the right repo/forum to ask if this isn't it!) on my machine to print details about my environment/installation etc that would be helpful?
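As a starting point for that kind of report, a small script using only the standard library can print the interpreter, platform, and installed package versions. This is a sketch, not an official diagnostic for txtai or faiss: the package list is an assumption based on the libraries mentioned in this thread, and the helper names are hypothetical.

```python
import platform
import sys
from importlib import metadata

# Assumption: distribution names for the libraries mentioned in this thread.
PACKAGES = ["txtai", "faiss-cpu", "hnswlib", "numpy", "torch"]


def package_version(name):
    """Return the installed version of a distribution, or a marker if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=PACKAGES):
    """Collect interpreter, platform, and package details into a dict."""
    report = {
        "python": sys.version.split()[0],
        "executable": sys.executable,
        "platform": platform.platform(),
        # Architecture the interpreter is running as, e.g. arm64 or x86_64.
        "machine": platform.machine(),
    }
    for name in packages:
        report[name] = package_version(name)
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

On Apple silicon the machine field is worth checking: an interpreter running under Rosetta reports x86_64 rather than arm64, which may matter when native wheels are involved. Packages installed through conda may or may not be visible to importlib.metadata, so conda list is a useful complement.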

davidmezzetti commented 5 days ago

If you're able to reproduce this with just the code I provided with the conda install, I'd report it to the faiss project.