Open andrewshvv opened 1 year ago
I have been able to reproduce it. For some reason it doesn't happen right after the delete, but only after a restart. For the record, in my actual code I call stop() when the program stops.
import chromadb

client = chromadb.PersistentClient(path="test")
try:
    client.delete_collection(name="test_collection")
except ValueError:
    # The collection did not exist yet; nothing to clean up.
    pass

collection = client.get_or_create_collection(
    "test_collection",
    metadata={"hnsw:space": "cosine"}
)
collection.add(
    embeddings=[[1, 2, 3]],
    ids=["1"]
)
# None of these ids exist in the collection.
collection.delete(ids=["3", "4", "5"])
client.stop()  # <=== improvised restart

client = chromadb.PersistentClient(path="test")
collection = client.get_or_create_collection(
    "test_collection",
    metadata={"hnsw:space": "cosine"}
)
print("peek")
collection.peek()["ids"]
While trying to replicate the bug I encountered another interesting behavior. Let me know if I should create a separate issue for it.
import chromadb

client = chromadb.PersistentClient(path="test")
try:
    client.delete_collection(name="test_collection")
except ValueError:
    # The collection did not exist yet; nothing to clean up.
    pass

collection = client.get_or_create_collection(
    "test_collection",
    metadata={"hnsw:space": "cosine"}
)
# Delete on a freshly created, empty collection.
collection.delete(ids=["3", "4", "5"])
Delete of nonexisting embedding ID: 3
--- Logging error ---
Traceback (most recent call last):
  File "/Users/andrey/Library/Caches/pypoetry/virtualenvs/jobsearch-itTcmVTs-py3.9/lib/python3.9/site-packages/chromadb/db/mixins/embeddings_queue.py", line 263, in _notify_one
    sub.callback([embedding])
  File "/Users/andrey/Library/Caches/pypoetry/virtualenvs/jobsearch-itTcmVTs-py3.9/lib/python3.9/site-packages/chromadb/segment/impl/vector/local_persistent_hnsw.py", line 219, in _write_records
    ) is not None or self._brute_force_index.has_id(id)
AttributeError: 'NoneType' object has no attribute 'has_id'
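For anyone reading the traceback: the failing line calls has_id on self._brute_force_index while that attribute is still None. Below is a minimal sketch of that failure shape and one possible guard. The classes and method names (BruteForceIndex, Segment, exists_unguarded, exists_guarded) are hypothetical stand-ins written for illustration, not Chroma's actual implementation or its eventual patch.

```python
from typing import Dict, Optional, Set


class BruteForceIndex:
    """Hypothetical stand-in for an in-memory buffer of recently written ids."""

    def __init__(self) -> None:
        self._ids: Set[str] = set()

    def add(self, embedding_id: str) -> None:
        self._ids.add(embedding_id)

    def has_id(self, embedding_id: str) -> bool:
        return embedding_id in self._ids


class Segment:
    """Hypothetical segment; illustrative only, not Chroma's real class."""

    def __init__(self) -> None:
        self._labels: Dict[str, int] = {}
        # Assumption: the buffer is created lazily, so it is None on a
        # collection that has never received a write.
        self._brute_force_index: Optional[BruteForceIndex] = None

    def exists_unguarded(self, embedding_id: str) -> bool:
        # Same shape as the failing line: if the id has no label, the right
        # operand is evaluated and raises AttributeError on None.
        return (
            self._labels.get(embedding_id) is not None
            or self._brute_force_index.has_id(embedding_id)
        )

    def exists_guarded(self, embedding_id: str) -> bool:
        # Check for None before touching the buffer.
        if self._labels.get(embedding_id) is not None:
            return True
        return (
            self._brute_force_index is not None
            and self._brute_force_index.has_id(embedding_id)
        )
```

With an empty Segment, exists_unguarded("3") raises the same AttributeError as in the log, while exists_guarded("3") simply reports that the id is absent.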
Yeah, can you please file another bug with just that minimal repro? Thanks. Will patch.
I had the same problem after terminating the embedding generation while debugging. The "solution" was to delete the index once, and the problem has never appeared again...
@ttww: How did you perform the index deletion?
What happened?
I tried to remove ids from the index that do not exist. After that, every peek() operation causes the warning "Delete of nonexisting embedding ID". @HammadB mentioned that the warnings can be ignored, but nevertheless peek() shouldn't cause them. Related discussion on Discord. Here is chroma.zip for reproduction.
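Until this is fixed, a possible workaround on the caller side is to never pass unknown ids to delete() in the first place. The sketch below is a hypothetical helper (delete_existing is not part of chromadb) that assumes collection.get(ids=...) returns a dict whose "ids" entry contains only the ids that are actually stored. It does not repair a collection that already emits the warning.

```python
from typing import Any, List


def delete_existing(collection: Any, ids: List[str]) -> List[str]:
    """Delete only the ids the collection actually holds.

    Returns the list of ids that were passed on to collection.delete().
    """
    # Assumption: get(ids=...) silently skips ids that are not stored.
    present = collection.get(ids=ids)["ids"]
    if present:
        # Skip the delete call entirely when nothing matched.
        collection.delete(ids=present)
    return present
```

Usage would look like delete_existing(collection, ["3", "4", "5"]), which on the repro collection above should end up not calling delete() at all.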
Versions
Initially I used chromadb==0.4.2, but before creating the issue I switched to chromadb==0.4.5 to check whether the warnings persist. Same result: I still see the warnings.
python = "^3.9.17"
Relevant log output