Sinaptik-AI / pandas-ai

Chat with your database (SQL, CSV, pandas, polars, MongoDB, NoSQL, etc.). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
https://pandas-ai.com

OpenAI response is Truncated. Seems to be an issue with the max tokens set. #164

Closed: manonfire86 closed this issue 1 year ago

manonfire86 commented 1 year ago

I have a dataset of financial factor data that I am having the LLM analyze. The response is truncated at around 450 words; it appears this may be an issue with the max tokens setting. Is there a way to increase max tokens so that I receive the full response?

The code I used is below:

pandas_ai.run(df, prompt='Analyze this timeseries dataset. The index is a timeseries index over the last 15 years. Please only use the dataframe index and the Factor Group, Factor, and Factor Return (%) columns to analyze the Factor performance history. You need to calculate RSI, Volatility, Correlation, Momentum, Skew, and Kurtosis to complete your analysis. Please include a detailed write up on these statistical calculations and their relation to the Factor columns historical performance. You must include at least 1000 words in your essay. Your essay needs to elaborate on the top 20 and bottom 20 Factors from the Factor column that you selected from your analysis. Explain their statistical relationships to the factor performance.')

gventuri commented 1 year ago

Hey @manonfire86, by default the maximum number of tokens is 512. From what I see, your prompt is pretty detailed, so I'd recommend increasing the limit when you instantiate OpenAI.

You can do it by passing a custom max_tokens param, like this:

from pandasai.llm.openai import OpenAI
llm = OpenAI(max_tokens=2000)
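
For context, a minimal end-to-end sketch of wiring the custom LLM back into the run call above might look like the following (assuming the PandasAI API from around the time of this issue, where a PandasAI object wraps the LLM; the api_token value and "factors.csv" file name are placeholders):

import pandas as pd
from pandasai import PandasAI
from pandasai.llm.openai import OpenAI

# Load the financial factor data described above (hypothetical file name)
df = pd.read_csv("factors.csv")

# Raise the completion limit from the default 512 tokens
llm = OpenAI(api_token="YOUR_OPENAI_API_KEY", max_tokens=2000)

# Pass the custom LLM to PandasAI and re-run the same prompt
pandas_ai = PandasAI(llm)
response = pandas_ai.run(df, prompt="Analyze this timeseries dataset ...")
print(response)

Note that max_tokens only caps the length of the model's completion; a longer write-up still has to fit within the model's overall context window alongside the prompt.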

Let me know if it fixes the issue!