prrao87 / db-hub-fastapi

Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients
MIT License

Improve indexing performance #12

Closed sanders41 closed 1 year ago

sanders41 commented 1 year ago

I did some testing and was able to reduce the time for sending documents from ~11.2 seconds to ~9.5 seconds by sending the documents concurrently and doing the settings update concurrently. Admittedly, my testing was not proper benchmarking: I ran time pytest build_index.py and took an average.

Additionally, updating the settings before sending the documents should be faster on the Meilisearch side.
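
As a rough illustration of the idea (not the actual change in this PR), the pattern is: apply the settings first, then send the document batches concurrently with `asyncio.gather`. `FakeIndex`, `chunk`, and `build_index` are hypothetical names invented for this sketch, and the `asyncio.sleep` calls only simulate network latency; a real async Meilisearch client would take the place of the stub.

```python
import asyncio
from typing import Any


class FakeIndex:
    """Hypothetical stand-in for an async index client (not a real library class)."""

    def __init__(self) -> None:
        self.settings: dict[str, Any] = {}
        self.documents: list[dict[str, Any]] = []

    async def update_settings(self, settings: dict[str, Any]) -> None:
        await asyncio.sleep(0.01)  # simulated network round trip
        self.settings.update(settings)

    async def add_documents(self, docs: list[dict[str, Any]]) -> int:
        await asyncio.sleep(0.01)  # simulated network round trip
        self.documents.extend(docs)
        return len(docs)


def chunk(items: list[Any], size: int) -> list[list[Any]]:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i : i + size] for i in range(0, len(items), size)]


async def build_index(
    index: FakeIndex,
    docs: list[dict[str, Any]],
    settings: dict[str, Any],
    batch_size: int = 100,
) -> int:
    # Apply settings before any documents are sent.
    await index.update_settings(settings)
    # Send all batches concurrently instead of awaiting them one by one.
    counts = await asyncio.gather(
        *(index.add_documents(batch) for batch in chunk(docs, batch_size))
    )
    return sum(counts)


if __name__ == "__main__":
    demo_docs = [{"id": i} for i in range(250)]
    total = asyncio.run(
        build_index(FakeIndex(), demo_docs, {"searchableAttributes": ["title"]})
    )
    print(f"sent {total} documents")
```

With sequential awaits the simulated latency adds up per batch; with `gather` the batches overlap, which is where the saving would come from.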

prrao87 commented 1 year ago

This looks great and makes a lot of sense! I'll look into adding further testing, but for now I'm doing much the same thing: averaging timings across runs to get an idea of performance. Thanks for writing such a great async package!
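
For reference, averaging timings across runs can be sketched with the standard library alone. `average_runtime` is a hypothetical helper made up for this example, and the lambda is a placeholder workload rather than the actual indexing script.

```python
import statistics
import time
from typing import Callable


def average_runtime(func: Callable[[], object], runs: int = 5) -> tuple[float, list[float]]:
    """Call func `runs` times; return the mean duration and each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), durations


if __name__ == "__main__":
    # Placeholder workload; swap in the function under test.
    mean, samples = average_runtime(lambda: sum(range(1_000_000)), runs=5)
    print(f"mean: {mean:.4f}s over {len(samples)} runs")
```

This is still not proper benchmarking (no warm-up, no control over background load), but it gives a repeatable rough number.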