Closed sridhar-rv closed 1 year ago
Hard to say with these details. When you say basic example, you mean it's saving ~10 records?
Yes the example given in the documentation.
I would try running it as a Python script and see if that is any different. This isn't normal; perhaps it's something related to your environment. Do you have something like SELinux or another security policy that could block writes?
Alternatively, you can run the code on Colab or another cloud notebook system like Kaggle to rule out your environment.
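One way to test the "something is blocking writes" theory before moving to another machine is a small write probe. The sketch below uses only the standard library and assumes that content storage ends up as a SQLite database on disk. The function name `can_write_sqlite` and the probe table are hypothetical, chosen for illustration, and are not part of txtai.

```python
import os
import sqlite3
import tempfile


def can_write_sqlite(directory):
    """Return True if a SQLite database can be created, written and closed in directory."""
    path = None
    try:
        os.makedirs(directory, exist_ok=True)

        # Temporary probe database, removed again in the finally block
        fd, path = tempfile.mkstemp(suffix=".db", dir=directory)
        os.close(fd)

        connection = sqlite3.connect(path, timeout=5)
        try:
            connection.execute("CREATE TABLE probe (id INTEGER PRIMARY KEY, text TEXT)")
            connection.execute("INSERT INTO probe (text) VALUES (?)", ("ok",))
            connection.commit()
            rows = connection.execute("SELECT text FROM probe").fetchall()
        finally:
            connection.close()

        return rows == [("ok",)]
    except (OSError, sqlite3.Error) as error:
        # A security policy or read-only mount would typically surface here
        print(f"Write check failed: {error}")
        return False
    finally:
        if path is not None and os.path.exists(path):
            os.remove(path)
```

Calling `can_write_sqlite(".")` from the directory where the script runs covers the location that `embeddings.save("index")` would write to. A `False` result, or a call that itself hangs, would point at the environment rather than at txtai.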
Ok Sure I will try this.
What I see is that the index is getting created, but the execution never completes.
I tried executing the basic example as a python script.
from txtai.embeddings import Embeddings

data = ["US tops 5 million confirmed virus cases",
        "Canada's last fully intact ice shelf has suddenly collapsed, forming a Manhattan-sized iceberg",
        "Beijing mobilises invasion craft along coast as Taiwan tensions escalate",
        "The National Park Service warns against sacrificing slower friends in a bear attack",
        "Maine man wins $1M from $25 lottery ticket",
        "Make huge profits without work, earn up to $100,000 a day"]

embeddings.index([(uid, text, None) for uid, text in enumerate(data)])
embeddings.save("index2")
Scenario 1:
embeddings = Embeddings({"path": "sentence-transformers/nli-mpnet-base-v2"})
The script ran the documentation example fine, the index was saved, and the script completed successfully.
Scenario 2:
embeddings = Embeddings({"path": "sentence-transformers/nli-mpnet-base-v2", "content": True, "objects": True})
The index was saved, but the script never finishes executing.
So setting "content": True and "objects": True is what causes the problem.
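To see where a script like this is stuck instead of killing it blind, the standard library's `faulthandler` module can dump every thread's stack after a timeout. This is a minimal sketch; the helper name `run_with_watchdog` and the 60 second timeout in the usage note are assumptions for illustration.

```python
import faulthandler
import sys


def run_with_watchdog(task, timeout, file=None):
    """Run task(); if it is still running after timeout seconds, dump all thread stacks to file."""
    if file is None:
        file = sys.stderr

    # Arms a timer that writes the stack of every thread once timeout elapses
    faulthandler.dump_traceback_later(timeout, file=file)
    try:
        return task()
    finally:
        # Disarm the timer once the task returns or raises
        faulthandler.cancel_dump_traceback_later()
```

In Scenario 2 this could be used as `run_with_watchdog(lambda: embeddings.save("index2"), timeout=60)`. If the save call returns and the script still does not exit, calling `faulthandler.dump_traceback_later(60)` once at the top of the script without cancelling it may show what the interpreter is waiting on.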
What OS are you running on? Objects shouldn't make a difference since there isn't binary content. There could be an issue with content and the version of SQLite available.
OS: Ubuntu 20.04.4 LTS (Focal Fossa)
SQLite3 version: 3.38.5
Python: 3.8.5
txtai: 5.2.0
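The SQLite version reported by the `sqlite3` command-line tool is not necessarily the version of the library that Python itself links against, so it may be worth collecting the versions from inside the interpreter that runs txtai. A small sketch using only the standard library (the `environment_report` name is hypothetical):

```python
import importlib.metadata
import platform
import sqlite3
import sys


def environment_report():
    """Collect the versions as seen by the running Python interpreter."""
    try:
        txtai_version = importlib.metadata.version("txtai")
    except importlib.metadata.PackageNotFoundError:
        txtai_version = "not installed"

    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        # SQLite library linked into this interpreter, which can differ
        # from the version of the sqlite3 command-line tool
        "sqlite_library": sqlite3.sqlite_version,
        "txtai": txtai_version,
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for name, value in environment_report().items():
        print(f"{name}: {value}")
```

Running this in the same Jupyter kernel or virtual environment as the failing example makes sure the reported versions match what the example actually uses.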
Not sure on this. Nothing seems out of the ordinary, lots of instances running on that platform without issues, including the automated GitHub Actions builds.
The latest version of Python 3.8 is 3.8.16. Did you run apt-get update and apt-get install to update to the latest version of files for your OS? Best guess would be that something is blocking the write but it's hard to say.
Closing due to inactivity. Re-open or open a new issue if this still persists.
I have the following environment.
Python 3.8
txtai
Jupyter notebook
Index backend is the default, FAISS
The basic example given was tried. Everything works except the step where I want to save the index,
embeddings.save("index").
It runs for more than an hour and never completes. I have to kill kernel every time.
Is there a fix for this issue?
Thanks Sridhar