amikos-tech / chromadb-data-pipes

ChromaDB Data Pipes 🖇️ - The easiest way to get data into and out of ChromaDB
https://datapipes.chromadb.dev/
MIT License

Throttle import input #123

Open tazarov opened 7 months ago

tazarov commented 7 months ago

The problem:

When feeding in a large dataset, the memory consumption of the import node is effectively unbounded unless we cap the amount of data consumed, e.g. via a blocking queue in front of the import. The same applies to other nodes.

To fix this, we'll need a common mechanism for reading data that throttles the consumption of new input once a certain threshold is reached. The threshold should have a default value and be configurable via an environment variable.
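
A minimal sketch of what such a mechanism could look like, assuming a bounded `queue.Queue` between the reader and the consuming node. The names `throttled`, `max_buffered`, the `CDP_MAX_BUFFERED_ITEMS` env var, and the default of 1000 are all hypothetical and only here for illustration; none of them exist in the project today.

```python
import os
import queue
import threading
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")

# Hypothetical env var name and default threshold (assumptions, not project API).
ENV_VAR = "CDP_MAX_BUFFERED_ITEMS"
DEFAULT_MAX_BUFFERED = 1000

_SENTINEL = object()  # marks the end of the input stream


def max_buffered() -> int:
    """Read the threshold from the environment, falling back to the default."""
    return int(os.environ.get(ENV_VAR, DEFAULT_MAX_BUFFERED))


def throttled(source: Iterable[T], maxsize: Optional[int] = None) -> Iterator[T]:
    """Yield items from `source` through a bounded buffer.

    A background thread reads ahead, but `put()` blocks once `maxsize`
    items are waiting, so reading pauses until the consumer catches up.
    """
    limit = maxsize if maxsize is not None else max_buffered()
    buf: "queue.Queue[object]" = queue.Queue(maxsize=limit)

    def reader() -> None:
        try:
            for item in source:
                buf.put(item)  # blocks while the buffer is full
        finally:
            # Always signal the end so the consumer does not hang.
            # Propagating reader errors to the consumer is left out of this sketch.
            buf.put(_SENTINEL)

    # Daemon thread: it will not keep the process alive if the consumer stops early.
    threading.Thread(target=reader, daemon=True).start()

    while True:
        item = buf.get()
        if item is _SENTINEL:
            return
        yield item  # type: ignore[misc]
```

A node would then wrap its input, e.g. `for doc in throttled(read_input()): ...`, where `read_input` stands in for whatever iterator the node already uses. With this shape, at most `maxsize` items sit in the buffer (plus one held by the blocked reader), regardless of how large the dataset is.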