Open DrorMarkus opened 2 months ago
Hi @DrorMarkus, thanks for raising this! Sorry that this got in the way of your workflow. I'll look into this and see if we can gracefully handle rate limits for the scoring. As a workaround in the meantime, it might be helpful to use the `batch_size` argument in the call to the `score()` function (code here), like this:
```python
# Current default batch_size=1
score_df = await l.score(batch_size=5)
```
This will batch multiple text examples together into a single prompt, which can reduce the number of independent calls to the OpenAI API. The downside is that, in my testing, scoring accuracy can sometimes suffer when multiple examples are batched together. I plan to add more info on additional arguments like this in the documentation.
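To make the trade-off concrete, here is a small back-of-envelope sketch (not part of the package) of how `batch_size` changes the number of scoring requests sent to the API:

```python
import math

def n_api_calls(n_texts: int, batch_size: int) -> int:
    """Rough number of scoring requests when examples are batched per prompt."""
    return math.ceil(n_texts / batch_size)

# e.g. for 2000 texts: 2000 requests at the default batch_size=1,
# but only 400 requests at batch_size=5
print(n_api_calls(2000, 1))  # 2000
print(n_api_calls(2000, 5))  # 400
```

Note that batching reduces the request count (RPM pressure) but each request carries more tokens, so it helps less against a tokens-per-minute limit.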
As another update re: your first line, last night I updated the package to depend on Python 3.10 and to use an updated OpenAI version (>=1.23.1)! Let me know if that's helpful for your setup. It should be available as version 0.7.1 of `text_lloom`.
Thank you @michelle123lam! Regarding your issue @DrorMarkus, I exceeded gpt-3.5-turbo limits even prior to scoring (i.e., in the first, distill step). And I should specify that it's the TPM (tokens-per-minute) limit, not RPM (requests-per-minute), that's the current blocker. My number of rows isn't huge (N < 2000), each containing a few sentences.
Looking for a solution now (other than switching to gpt-4, which has more generous limits, though that partly depends on things like the tier of your account).
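For anyone hitting the same wall, a quick way to sanity-check whether a corpus will exceed a TPM limit is to estimate its total token count. This is a rough standalone sketch (not part of `text_lloom`); the ~4-characters-per-token ratio is a common rule of thumb for English text, and a library like tiktoken would give exact counts:

```python
# Back-of-envelope estimate of processing time under a tokens-per-minute cap.

def approx_tokens(text: str) -> int:
    # ~4 characters per token is a rough heuristic for English text
    return max(1, len(text) // 4)

def minutes_to_process(texts, tpm_limit: int) -> float:
    total = sum(approx_tokens(t) for t in texts)
    return total / tpm_limit

# e.g. 2000 short documents against a hypothetical 60k TPM limit
docs = ["A few sentences of article text, repeated for illustration." * 3] * 2000
print(f"~{minutes_to_process(docs, 60_000):.1f} min minimum at 60k TPM")
```

If the estimate is well above a minute or two, the calls need to be spread out over time rather than just batched.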
@DrorMarkus @zilinskyjan
As an update, I've added functionality to the lloom instance creation so that users can (1) specify which models are used for the operators and (2) specify custom rate-limit parameters! These changes are available in `text_lloom` version 0.7.2.
Specifying which models are used:
```python
l = wb.lloom(
    df=df,
    text_col="text",
    # Model specification
    distill_model_name="gpt-3.5-turbo",
    embed_model_name="text-embedding-3-large",
    synth_model_name="gpt-4-turbo",
    score_model_name="gpt-3.5-turbo",
)
```
Specifying custom rate-limit parameters:
```python
l = wb.lloom(
    df=df,
    text_col="text",
    # Rate limit parameters dict
    # "model-name": (n_requests, wait_time_secs)
    rate_limits={
        # Specify any custom parameters; otherwise they default to the settings in llm.py
        "gpt-4-turbo": (40, 10),
    },
)
```
- `n_requests`: number of requests allowed in one batch
- `wait_time_secs`: time period (in seconds) to wait before making more requests

This works out to an effective rate of `n_requests * (60 / wait_time_secs)` requests per minute.

I'll be adding information about this in the documentation as well.
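The `(n_requests, wait_time_secs)` pattern can be sketched roughly as follows. This is an illustrative standalone example, not the actual `llm.py` implementation: fire up to `n_requests` concurrent calls, then pause before the next batch. `call_api` is a hypothetical stand-in for the real OpenAI request:

```python
import asyncio

async def call_api(prompt: str) -> str:
    """Placeholder for a real async API call."""
    await asyncio.sleep(0)  # stand-in for network latency
    return f"response to {prompt!r}"

async def rate_limited_run(prompts, n_requests=40, wait_time_secs=10):
    """Send prompts in batches of n_requests, waiting between batches."""
    results = []
    for i in range(0, len(prompts), n_requests):
        batch = prompts[i:i + n_requests]
        results += await asyncio.gather(*(call_api(p) for p in batch))
        if i + n_requests < len(prompts):  # no need to wait after the last batch
            await asyncio.sleep(wait_time_secs)
    return results

out = asyncio.run(
    rate_limited_run([f"p{i}" for i in range(5)], n_requests=2, wait_time_secs=0)
)
print(len(out))  # 5
```

With `n_requests=40` and `wait_time_secs=10`, this caps throughput at roughly 240 requests per minute, which you can tune per model to stay under your account's limits.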
I attempted to run Lloom on a sample corpus of news articles (I am running Python 3.9 and downgraded the OpenAI version as stated in the instructions).
When I first tried to run, I received the following error when attempting distillation:
Since it appeared there were token-limit issues, I tried cutting my texts down to only the article headlines. The procedure then worked, running through the distillation, clustering, and synthesis stages. I received these 5 concepts: ![image](https://github.com/michelle123lam/lloom/assets/167772113/7e6b6d19-25af-45b0-8883-e35ab599dab3)
However, upon proceeding to the scoring, the procedure gets stuck: ![image](https://github.com/michelle123lam/lloom/assets/167772113/43af54b0-872a-485b-8c64-99085642e344)
Again there are token-limit errors. It appears that `multi_query_gpt_wrapper` works for the distillation, but not for the scoring.