What does this PR do?
The logits processing is relaxed so that it allows top_k=1 and top_p=1 even when do_sample=True. These kinds of requests were received when testing instruct models such as mistralai/Mistral-7B-Instruct-v0.3 with the Inference Endpoints web UI. This should fix #76 and #77. The version is bumped so that a release containing the fix can be used in Inference Endpoints.
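A minimal sketch of the relaxed check, for illustration only: the function name and exact conditions are assumptions, not this repository's actual code. The point is that top_k=1 (greedy-equivalent) and top_p=1.0 (no nucleus filtering) are degenerate but valid sampling settings, so they should pass validation even when do_sample=True:

```python
def validate_sampling_params(do_sample: bool, top_k: int, top_p: float) -> None:
    """Hypothetical validator for generation parameters.

    Previously, a stricter check might have rejected top_k == 1 or
    top_p == 1.0 together with do_sample=True. These edge values are
    harmless (sampling simply degenerates toward greedy / unfiltered
    sampling), so they are now accepted.
    """
    if top_k < 1:
        raise ValueError("top_k must be >= 1")
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    # Note: no special-case rejection of top_k == 1 or top_p == 1.0
    # when do_sample is True.


# Requests like the ones sent by the Inference Endpoints web UI
# should now validate cleanly:
validate_sampling_params(do_sample=True, top_k=1, top_p=1.0)
```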