AndreasKarasenko closed this 5 months ago
Just FYI: if we solve the AttributeError one way or another I can make some headway on using the scikit-learn internals to parallelize the predict function to gain some speedup.
@AndreasKarasenko, regarding the parallelization: implementing it would be a very nice improvement, as OpenAI is more generous with the rate limits and parallel requests won't immediately consume the whole quota. However, I don't think there is a strong need to use any scikit-learn internals for that. `predict()` is basically just a for-loop, which can be parallelized with a `ThreadPoolExecutor` (the GIL is also released during the API call).
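To illustrate the suggestion, here is a minimal sketch of replacing a sequential predict loop with a thread pool. The `_predict_single` function is a hypothetical stand-in for the per-sample API call; the names are not from the actual codebase.

```python
from concurrent.futures import ThreadPoolExecutor

def _predict_single(x):
    # Hypothetical stand-in for the per-sample work in predict();
    # the real method would issue an OpenAI completion request here.
    return f"label-for-{x}"

def predict_parallel(X, num_workers=4):
    # Threads suffice here: the GIL is released while waiting on the
    # network, so API calls overlap without spawning processes.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() preserves input order, matching the sequential loop.
        return list(pool.map(_predict_single, X))

print(predict_parallel(["a", "b", "c"]))
# ['label-for-a', 'label-for-b', 'label-for-c']
```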
I have made the minor updates requested and will work on a separate PR for the parallelization, though I'm not sure how quick I will be. I agree that threads are a better choice than spawning whole processes. I'm also not sure yet how I will handle the KNN part of the Dynamic FSL, but let's see.
For now I have a rudimentary parallel predict that I'll use for a paper over the weekend. `kneighbors` on the memory index should be thread-safe, so calling the retrieval from multiple threads should not be an issue. In that sense, maybe I will add a PR for this?
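To make the thread-safety point concrete, here is a sketch of querying a fitted `sklearn.neighbors.NearestNeighbors` index from multiple worker threads; the toy data is illustrative and not the actual memory index used by the Dynamic FSL estimator.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from sklearn.neighbors import NearestNeighbors

# Toy memory index standing in for the few-shot example store.
rng = np.random.default_rng(0)
index = NearestNeighbors(n_neighbors=3).fit(rng.random((100, 8)))

def retrieve(query):
    # kneighbors() only reads the fitted index, so concurrent calls
    # from worker threads should be safe.
    _, idx = index.kneighbors(query.reshape(1, -1))
    return idx[0]

queries = rng.random((10, 8))
with ThreadPoolExecutor(max_workers=4) as pool:
    neighbor_ids = list(pool.map(retrieve, queries))
```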
Throttling when too many calls are made would also be nice. The issue remains that the RPM and TPM limits depend on the model and are subject to change.
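Since the limits are model-dependent, one option is a small thread-safe limiter that takes the RPM cap as a parameter rather than hard-coding it. This is only a sketch of the idea, not code from the repository:

```python
import threading
import time

class RateLimiter:
    """Spaces out requests across threads to stay under an RPM cap.

    The cap is a constructor argument because RPM varies by model
    and may change over time.
    """

    def __init__(self, rpm):
        self.interval = 60.0 / rpm
        self._lock = threading.Lock()
        self._next_slot = time.monotonic()

    def acquire(self):
        # Reserve the next send slot under the lock, then sleep
        # outside it so other threads can queue up concurrently.
        with self._lock:
            now = time.monotonic()
            wait = self._next_slot - now
            self._next_slot = max(now, self._next_slot) + self.interval
        if wait > 0:
            time.sleep(wait)

limiter = RateLimiter(rpm=600)  # e.g. at most 10 requests per second
for _ in range(3):
    limiter.acquire()  # blocks just long enough to respect the cap
```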
Hi @AndreasKarasenko
Thanks a lot for the changes. I think we can merge now, but I will have a closer look a bit later.
Regarding the parallelism: `num_workers` as an init argument. So maybe you can immediately add a deprecation warning or something similar (when `num_workers > 1`) to indicate that the API is temporary.
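A minimal sketch of what such a warning could look like, assuming a hypothetical estimator class (the class name and structure here are illustrative, not from the repository):

```python
import warnings

class ParallelEstimator:
    # Hypothetical sketch: num_workers is accepted but flagged as a
    # temporary API surface, as suggested in the review.
    def __init__(self, num_workers=1):
        if num_workers > 1:
            warnings.warn(
                "The num_workers init argument is temporary and may be "
                "replaced in a future release.",
                DeprecationWarning,
                stacklevel=2,
            )
        self.num_workers = num_workers
```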
Fixes #93. Also propagates `key` and `org` to the vectorizer at an earlier stage if they are set through the model. However, `key` and `org` might be exposed more easily. At the same time, it is currently possible to set `key` and `org` as below, which still exposes them through `__dict__`.
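To illustrate the exposure concern with a toy stand-in (this is not the actual estimator class): any value assigned to `self` in `__init__` lands in the instance `__dict__`, so credentials set this way remain readable.

```python
class Classifier:
    # Minimal stand-in estimator: attributes assigned in __init__
    # are stored in the instance __dict__ and are trivially readable.
    def __init__(self, key=None, org=None):
        self.key = key
        self.org = org

clf = Classifier(key="sk-secret", org="org-123")
print(clf.__dict__)  # {'key': 'sk-secret', 'org': 'org-123'}
```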