mistralai / mistral-inference

Official inference library for Mistral models
https://mistral.ai/
Apache License 2.0

Incomplete Output even with max_new_tokens #92

Open pradeepdev-1995 opened 7 months ago

pradeepdev-1995 commented 7 months ago

The output of my fine-tuned Mistral model ends abruptly, and I would ideally like it to finish the paragraph/sentences/code it was in the middle of. I have set max_new_tokens = 300 and also instructed the model in the prompt to limit the response to 300 words.

The response is always long and ends abruptly. Is there any way to get a complete output within the desired number of output tokens?
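One library-agnostic workaround is to over-generate slightly and then trim a truncated completion back to the last complete sentence. A minimal sketch in plain Python; the helper name and the trimming heuristic (cutting at the last sentence-ending punctuation) are my own illustration, not part of mistral-inference or transformers:

```python
def trim_to_last_sentence(text: str) -> str:
    """Trim a possibly-truncated completion back to the last full sentence.

    If no sentence-ending punctuation is found, return the text unchanged
    rather than dropping everything.
    """
    # Find the last sentence terminator (., !, or ?).
    cut = max(text.rfind("."), text.rfind("!"), text.rfind("?"))
    if cut == -1:
        return text.strip()
    return text[: cut + 1].strip()


truncated = "The model explains the concept. Then it starts another idea but it is cut"
print(trim_to_last_sentence(truncated))  # → The model explains the concept.
```

This does not make the model's answer fit the budget, but it hides the mid-sentence cutoff when generation stops at the max_new_tokens cap.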

Here is the generation config I am using:

```python
from transformers import GenerationConfig

generation_config = GenerationConfig(
    do_sample=True,
    top_k=10,
    temperature=0.01,
    pad_token_id=tokenizer.eos_token_id,
    early_stopping=True,
    max_new_tokens=300,
    return_full_text=False
)
```
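For reference, a common adjustment with HuggingFace transformers (which this GenerationConfig appears to come from) is to make sure eos_token_id is set so generation can stop naturally before the cap, and to give max_new_tokens enough headroom that the model can reach a natural stopping point. A hedged sketch, not a guaranteed fix; the budget of 512 is an arbitrary example, and `tokenizer` is assumed to be the model's tokenizer as in the original snippet:

```python
from transformers import GenerationConfig

# Sketch only: max_new_tokens is a hard cap, so output that reaches it is cut
# off mid-sentence. Setting eos_token_id lets generation end at the model's
# own end-of-sequence token before the cap is hit.
generation_config = GenerationConfig(
    do_sample=True,
    top_k=10,
    temperature=0.01,
    eos_token_id=tokenizer.eos_token_id,  # allow a natural stop at EOS
    pad_token_id=tokenizer.eos_token_id,
    max_new_tokens=512,                   # headroom beyond the desired ~300 words
)
```

Note that the prompt-level "limit to 300 words" instruction is only a soft hint to the model; the actual truncation is done by max_new_tokens.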