vanna-ai / vanna

🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.
https://vanna.ai/docs/
MIT License

Set up a token (or words) limit to be sent to the LLM #577

Open Gaket opened 4 months ago

Gaket commented 4 months ago

Is your feature request related to a problem? Please describe.
I just ran into a problem where Vanna.ai tried to make a very large request to answer a single question:

Extracted SQL: SELECT BARBER_ID, COUNT(*) AS schedule_count
FROM SCHEDULES
GROUP BY BARBER_ID
Using model gpt-4o for **1,349,982.75** tokens (approx)

That's more than a million tokens!

Luckily, OpenAI sent me a 429 error: openai.RateLimitError: Error code: 429 - {'error': {'message': 'Request too large for gpt-4o in organization org-2.... on tokens per min (TPM): Limit 30000, Requested 1349985.

Describe the solution you'd like
I'd like to be able to pass a parameter specifying the maximum request size, in tokens, that I'm comfortable with.
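A minimal sketch of what such a parameter could look like: a pre-flight guard that estimates the prompt's token count and refuses to send it once a user-set budget is exceeded. The names (`check_token_budget`, `TokenBudgetExceeded`) are hypothetical, not part of Vanna's API, and the ~4 characters-per-token ratio is only a rough heuristic for English text; a tokenizer such as tiktoken would give an exact count for OpenAI models.

```python
# Hypothetical pre-flight token guard -- not part of Vanna's API.

class TokenBudgetExceeded(Exception):
    """Raised when a prompt would exceed the user-configured token budget."""


def estimate_tokens(text: str) -> int:
    # Rough approximation: OpenAI models average ~4 characters per token.
    # Swap in tiktoken's encoding for an exact per-model count.
    return len(text) // 4


def check_token_budget(prompt: str, max_tokens: int = 30_000) -> str:
    # Fail fast locally instead of letting the API return a 429.
    used = estimate_tokens(prompt)
    if used > max_tokens:
        raise TokenBudgetExceeded(
            f"Prompt is ~{used:,} tokens, over the {max_tokens:,}-token budget"
        )
    return prompt
```

With a guard like this, the 1.3M-token request above would be rejected client-side before any call to OpenAI is made.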

Describe alternatives you've considered
Setting up a limit on the OpenAI side, but I'm not an admin there.

Additional context
I tried to connect Vanna AI to our production database with 100+ tables and a lot of data. I believe the error could have been caused by the "let Vanna.ai send your data to the LLM" flag.

zainhoda commented 3 months ago

This is likely due to the vn.generate_summary call. That's the only method that will send the entire dataframe to the LLM.

If you'd like to disable this functionality you can set summarization=False for the web app: https://vanna.ai/docs/web-app/#vanna.flask.VannaFlaskApp
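For reference, a usage fragment showing where that flag goes (assuming a `vn` object already configured per the linked docs; this snippet is illustrative, not runnable on its own):

```python
from vanna.flask import VannaFlaskApp

# summarization=False disables vn.generate_summary, so the full
# result dataframe is never sent to the LLM.
app = VannaFlaskApp(vn, summarization=False)
app.run()
```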

There probably does need to be some kind of option to limit the number of rows sent to the LLM for summarization.
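One possible shape for that option, as a sketch: cap the number of result rows before they reach the summarization step. The function name and `max_rows` parameter are hypothetical, not an existing Vanna setting; `rows` stands in for the dataframe records that `vn.generate_summary` would otherwise send in full.

```python
# Hypothetical row cap applied before summarization -- not part of Vanna's API.

def cap_rows_for_summary(rows: list, max_rows: int = 100) -> list:
    """Return at most max_rows rows to include in the summarization prompt."""
    if len(rows) <= max_rows:
        return rows
    # Keep only the head of the result set; the LLM summarizes a sample
    # instead of the entire table, bounding the prompt size.
    return rows[:max_rows]
```

A cap like this, combined with a token-based budget, would keep summarization usable on large tables without risking million-token prompts.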