unclecode / crawl4ai

🔥🕷️ Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper
Apache License 2.0
1.27k stars 121 forks

Error code 400, GROQ context limit exceeded. #44

Closed haddadnidal closed 1 day ago

haddadnidal commented 2 days ago

I was trying the LLM extraction strategy to extract data from a website using the GROQ API as the LLM backend with groq/llama3-70b-8192, and I am getting this error:

[ERROR] 🚫 Failed to crawl https://www.cen-change.com/achat-vente-devise/#Cours-et-Taux-de-Change, error: litellm.BadRequestError: BadRequestError: GroqException - Error code: 400 - {'error': {'message': 'Please reduce the length of the messages or completion.', 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}} None

I changed the instruction and shortened it, but I am still getting this error. How can I debug it? Thank you.

unclecode commented 2 days ago

That's true. Unfortunately, the version of Llama3 that Groq currently serves has a short context window of 8K tokens. When they release the paid plan, you'll hopefully get a longer context window. In the meantime, you can extract the markdown and use chunking strategies to keep each request under the limit. We may add content summarization for models with smaller context windows to our backlog. For now, you can also use models that support longer windows, such as OpenAI or Claude, or open-source models like ollama/Llama3 or Microsoft's Phi, which offer much longer windows (up to 128K). These should be sufficient for most tasks.
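To illustrate the chunking idea above, here is a minimal sketch. It is not part of crawl4ai's API: `chunk_markdown` is a hypothetical helper that splits extracted markdown on paragraph boundaries so each piece stays under a character budget, which you can then feed to the LLM one chunk at a time. The 4-characters-per-token ratio is a rough rule of thumb, not an exact tokenizer count.

```python
def chunk_markdown(markdown: str, max_chars: int = 24_000) -> list[str]:
    """Split markdown on paragraph boundaries into pieces under max_chars.

    ~24K characters roughly approximates 6K tokens (assuming ~4 chars
    per token), leaving headroom inside an 8K-token window for the
    extraction instruction and the model's reply.
    """
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for para in markdown.split("\n\n"):
        para_len = len(para) + 2  # account for the "\n\n" separator
        if size + para_len > max_chars and current:
            # Current chunk is full: flush it and start a new one.
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(para)
        size += para_len
    if current:
        chunks.append("\n\n".join(current))
    return chunks


# Usage sketch: run the extraction prompt once per chunk, then merge
# the partial results (e.g. concatenate the extracted JSON records).
# for chunk in chunk_markdown(page_markdown):
#     result = call_llm(instruction, chunk)  # your Groq/litellm call
```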