chroma-core / chroma

the AI-native open-source embedding database
https://www.trychroma.com/
Apache License 2.0
15.43k stars, 1.3k forks

[Bug]: Disk I/O error on Google Drive #1319

Open Tylersuard opened 1 year ago

Tylersuard commented 1 year ago

What happened?

I am storing my persistent ChromaDB in Google Drive. Everything worked fine until the database file reached around 16 GB. Now I can't add any more documents, and I get an error: disk I/O error.

Versions

Chroma 0.4.15 in Colab

Relevant log output

OperationalError                          Traceback (most recent call last)

<ipython-input-15-98b03a86c398> in <cell line: 40>()
     50         start_time = time.time()
     51 
---> 52         collection.add(
     53             ids=[str(uuid.uuid4()) for i in range(0, batch_size)],  # IDs are just strings
     54             documents=[recordo.full_case_text for recordo in cases_batch_list],

4 frames

/usr/local/lib/python3.10/dist-packages/chromadb/db/mixins/embeddings_queue.py in submit_embeddings(self, topic_name, embeddings)
    170             # the results. https://www.sqlite.org/lang_returning.html
    171             sql = f"{sql} RETURNING seq_id, id"  # Pypika doesn't support RETURNING
--> 172             results = cur.execute(sql, params).fetchall()
    173             # Reorder the results
    174             seq_ids = [cast(SeqId, None)] * len(

OperationalError: disk I/O error

tazarov commented 1 year ago

@Tylersuard, that is interesting. Have you checked whether you've reached your disk usage quota?
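One rough way to check this from the notebook is sketched below. It is only a sketch under assumptions: the `report_storage` helper and the `/content/drive/MyDrive/chroma_db` path are hypothetical, and it assumes the persist directory contains a `chroma.sqlite3` file, as Chroma 0.4.x persistent storage normally does. The numbers reported for a mounted Drive folder may not match the Drive account quota exactly, so it is worth comparing them against the storage page of the Drive account as well.

```python
import os
import shutil


def report_storage(persist_dir: str, db_filename: str = "chroma.sqlite3") -> dict:
    """Return filesystem usage for persist_dir plus the size of the DB file.

    All values are in bytes. db_size is None if the file is not found.
    """
    # Usage of the filesystem that holds the persist directory.
    usage = shutil.disk_usage(persist_dir)

    # Size of the SQLite file (assumed name; adjust if yours differs).
    db_path = os.path.join(persist_dir, db_filename)
    db_size = os.path.getsize(db_path) if os.path.exists(db_path) else None

    return {
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
        "db_size": db_size,
    }


if __name__ == "__main__":
    # Hypothetical location of the persistent Chroma directory on the mounted Drive.
    persist_dir = "/content/drive/MyDrive/chroma_db"
    if os.path.isdir(persist_dir):
        gib = 1024 ** 3
        info = report_storage(persist_dir)
        print(f"free:  {info['free'] / gib:.2f} GiB")
        print(f"total: {info['total'] / gib:.2f} GiB")
        if info["db_size"] is not None:
            print(f"db:    {info['db_size'] / gib:.2f} GiB")
    else:
        print(f"{persist_dir} not found; is Drive mounted?")
```

If the free space is close to zero, or the database file is close to the remaining quota, the write that fails in `submit_embeddings` is probably running out of room rather than hitting a Chroma bug.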
