explosion / spacy-llm

🦙 Integrating LLMs into structured NLP pipelines
https://spacy.io/usage/large-language-models
MIT License

[Warning] the current text generation call will exceed the model's predefined maximum length (4096). #423


yileitu commented 5 months ago

When using an LLM for an NER task, there is a warning saying: "This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (4096). Depending on the model, you may observe exceptions, performance degradation, or nothing at all."

How to change the maximum length of the LLM output?

```ini
[components.llm.task]
@llm_tasks = "spacy.NER.v3"

[components.llm.model]
@llm_models = "spacy.Llama2.v1"
name = "Llama-2-7b-hf"
```

rmitsch commented 5 months ago

Hi @yileitu, all model parameters are forwarded to transformers, which handles the model. In most cases there is a `max_length` or `max_new_tokens` parameter you can set:

```ini
[components.llm.model]
@llm_models = "spacy.Llama2.v1"
name = "Llama-2-7b-hf"
max_length = 8192  # or any other value you want to set
```

yileitu commented 5 months ago

Hi @rmitsch. Thanks for your reply. However, this does not work:

```
Config validation error
llm.model -> max_new_tokens   extra fields not permitted
{'@llm_models': 'spacy.Llama2.v1', 'name': 'Llama-2-7b-hf', 'max_new_tokens': 10000}
```

Neither does `max_length`.
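
The validation error suggests the model registry does not accept arbitrary top-level keys. One possible workaround is a sketch only, assuming a spacy-llm version whose HuggingFace-backed models accept a nested `config_run` (and `config_init`) section that is forwarded to the underlying transformers calls; check the spacy-llm docs for your installed version to confirm these section names exist:

```ini
[components.llm.model]
@llm_models = "spacy.Llama2.v1"
name = "Llama-2-7b-hf"

# Assumption: this nested section is forwarded to the transformers
# generation call in your spacy-llm version; verify against its docs.
[components.llm.model.config_run]
max_new_tokens = 256
```

If the nested section is also rejected, upgrading spacy-llm may help, since the set of accepted model parameters has changed between releases.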