superduper-io / superduper

Superduper: Integrate AI models and machine learning workflows with your database to implement custom AI applications, without moving your data. Including streaming inference, scalable model hosting, training and vector search.

[DistEnv] All outputs loaded in memory before bulk write to database #1626

Open kartik4949 opened 10 months ago

kartik4949 commented 10 months ago

When the model predicts for, say, 1 million data points, it stores the outputs for all 1 million data points in a single list, `outputs`, which will OOM once the list exceeds available memory.

Refer to `superduper/components/model.py`: `predict` method.

The same thing happens with the model inputs: all inputs are loaded into memory before being passed to the model, e.g. packed into a `DataLoader` (refer to `ext/torch/model.py`: `_predict` method).

We need to chunk the model inputs in the database, iterate over the chunks, and pass each chunk to the model for prediction.
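
A minimal sketch of that chunked flow, assuming hypothetical helpers `load_inputs` and `bulk_write_outputs` on the datalayer (these names are illustrative, not existing superduper APIs):

```python
from typing import Any, Iterator, Sequence


def iter_chunks(ids: Sequence[Any], chunk_size: int) -> Iterator[Sequence[Any]]:
    """Yield ids in fixed-size chunks instead of materialising one giant list."""
    for start in range(0, len(ids), chunk_size):
        yield ids[start:start + chunk_size]


def predict_in_chunks(model, db, ids: Sequence[Any], chunk_size: int = 10_000) -> None:
    """Predict chunk by chunk so neither the inputs nor the outputs
    are ever fully resident in memory."""
    for chunk_ids in iter_chunks(ids, chunk_size):
        inputs = db.load_inputs(chunk_ids)          # hypothetical: fetch only this chunk
        outputs = model.predict(inputs)             # predict on the chunk alone
        db.bulk_write_outputs(chunk_ids, outputs)   # flush to the database immediately
        del inputs, outputs                         # release references before the next chunk
```

With this shape, peak memory is bounded by `chunk_size` rather than by the dataset size, and one bulk write per chunk replaces the single write at the end.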

fnikolai commented 10 months ago

As a first step, we can expose the batch size to the user.

Then we can optimize further and see whether we can estimate the ideal batch size automatically based on the available memory.

We also need to account for parallel processing by multiple workers (are they going to process the same chunk or different ones?).
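
A rough sketch of that estimation step, assuming the per-item memory footprint can be sampled or supplied by the user (`bytes_per_item` below is such an assumption; `psutil` is a real library, but the helper itself is illustrative):

```python
import psutil


def estimate_batch_size(bytes_per_item: int, memory_fraction: float = 0.25) -> int:
    """Choose a batch size so one batch consumes at most `memory_fraction`
    of the RAM currently available on this node."""
    available = psutil.virtual_memory().available   # bytes free right now
    budget = int(available * memory_fraction)       # leave headroom for everything else
    return max(1, budget // bytes_per_item)
```

If several workers share a node, `memory_fraction` would also need to be divided by the worker count so they don't collectively exceed the budget.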