Dicklesworthstone / llm_aided_ocr

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

Negative max_tokens value #11

Open chew-z opened 1 month ago

chew-z commented 1 month ago

I am getting a negative max_tokens value in the API call. Probably a small bug...

2024-08-12 09:54:03,183 - ERROR - An error occurred while processing a chunk: Error code: 400 - {'error': {'message': "Invalid 'max_tokens': integer below minimum value. Expected a value >= 1, but got -1971 instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'integer_below_min_value'}}
2024-08-12 09:54:03,420 - ERROR - An error occurred while processing a chunk: Error code: 400 - {'error': {'message': "Invalid 'max_tokens': integer below minimum value. Expected a value >= 1, but got -1971 instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'integer_below_min_value'}}
GuoMonth commented 3 weeks ago

I encountered the same error.

I found that the problem occurs when execution reaches this branch:

if adjusted_max_tokens <= 0:

I think the error is caused by the logic of this code: min(max_tokens, 4096 - prompt_tokens - TOKEN_BUFFER) goes negative whenever the prompt plus the buffer exceeds the 4096-token budget, and that negative value is then passed straight to the API.

async def generate_completion_from_openai(
    prompt: str, max_tokens: int = 5000
) -> Optional[str]:
    ...
    # Goes negative when prompt_tokens + TOKEN_BUFFER exceeds 4096.
    adjusted_max_tokens = min(
        max_tokens, 4096 - prompt_tokens - TOKEN_BUFFER
    )
    ...
    response = await openai_client.chat.completions.create(
        model=OPENAI_COMPLETION_MODEL,
        messages=[{"role": "user", "content": chunk}],
        max_tokens=adjusted_max_tokens,  # the negative value ends up here
        temperature=0.7,
    )
    ...
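
A minimal sketch of one possible fix, reusing the names from the snippet above (prompt_tokens, TOKEN_BUFFER, chunk, openai_client, OPENAI_COMPLETION_MODEL) and assuming logging is imported; the clamp-and-skip strategy is a suggestion, not the project's confirmed approach:

# Hypothetical patch: never pass a non-positive max_tokens to the API.
adjusted_max_tokens = min(max_tokens, 4096 - prompt_tokens - TOKEN_BUFFER)
if adjusted_max_tokens <= 0:
    # The prompt alone fills the model's context budget, so there is no
    # room left for a completion. Skip (or split) this chunk instead of
    # sending a request the API will reject with a 400 error.
    logging.warning(
        "Prompt (%d tokens) leaves no room for a completion; skipping chunk.",
        prompt_tokens,
    )
    return None
response = await openai_client.chat.completions.create(
    model=OPENAI_COMPLETION_MODEL,
    messages=[{"role": "user", "content": chunk}],
    max_tokens=adjusted_max_tokens,
    temperature=0.7,
)

Alternatively, oversized chunks could be split and re-submitted rather than skipped, but the key point is that adjusted_max_tokens must be validated before the API call, not after.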