Open · satkg42 opened this issue 9 months ago
Hi @satkg42, `no_repeat_ngram_size` is not a valid parameter on a TGI server. See the docs for a full list of available parameters. The link you've shared above is for the `transformers` pipeline. For the record, the TGI server and the `transformers` library do not share the same codebase. We are in the process of unifying the API (cc @SBrandeis), but parameters with low usage (like `no_repeat_ngram_size`) will most likely not be supported. As a consequence, it is not supported in Python's `InferenceClient.text_generation` either.
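For reference, here is a minimal sketch of a `text_generation` call that sticks to parameters TGI does accept. The endpoint URL and parameter values are placeholders, and `repetition_penalty` is shown only as an example of a supported repetition-related knob, not as an equivalent of `no_repeat_ngram_size`:

```python
from huggingface_hub import InferenceClient

# Placeholder URL: point this at your own TGI endpoint or a Hub model id.
client = InferenceClient(model="https://my-tgi-endpoint")

# Only parameters exposed by text_generation() are accepted here,
# e.g. max_new_tokens, temperature, repetition_penalty.
output = client.text_generation(
    "this is my test",
    max_new_tokens=100,
    temperature=0.7,
    repetition_penalty=1.2,
)
print(output)
```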
Thanks @Wauplin for the quick response. I agree that it might not be one of the most used parameters, but in my experience it definitely helps with the model "rambling" problem, i.e. the model going on and on about something. It would be really helpful if this parameter were supported in the API. Let me know if I can contribute to doing so.
Thanks for the details @satkg42. For now, I would prefer to delay the decision. Adding support for new parameters is not the hardest part. Maintaining them and adding backward compatibility when we want to update/remove them is much harder. That's why I'd rather wait until the "API unification" step is done on our part before adding this. In the meantime, I encourage any user landing on this page to post a comment to show their interest in such an addition.
@satkg42 a possible workaround for you in the meantime is to use the `.post` method. This is more manual work but would do the same for you:
```python
import json

from huggingface_hub import InferenceClient

client = InferenceClient(...)
# The raw payload is forwarded as-is, so the parameter is not rejected client-side.
response = client.post(json={"inputs": "this is my test", "parameters": {"no_repeat_ngram_size": 42}})
data = json.loads(response.decode())
...
```
However, this is not compatible with TGI-served models.
Describe the bug
I am using `InferenceClient` to generate text with a TGI endpoint. To control repetitive generations, we had to use the `no_repeat_ngram_size` parameter, but I am getting the error below (a minimal illustration of the failing call is sketched after this description):

`TypeError: InferenceClient.text_generation() got an unexpected keyword argument 'no_repeat_ngram_size'`

However, it is supported in `GenerationConfig`.
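For illustration, a minimal sketch of the call that triggers this error; the endpoint URL and prompt are placeholders:

```python
from huggingface_hub import InferenceClient

# Placeholder URL: any TGI endpoint will do.
client = InferenceClient(model="https://my-tgi-endpoint")

# Raises: TypeError: InferenceClient.text_generation() got an unexpected
# keyword argument 'no_repeat_ngram_size'
client.text_generation("this is my test", no_repeat_ngram_size=3)
```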
Reproduction
No response
Logs
System info