Setting the temperature within a particular range causes vLLM to generate whitespace-only outputs, while values above or below that range work correctly. I have seen this with facebook/opt-125m, fine-tuned Mistral-7B models, CodeLlama-13B, and several other models, so it appears to be an issue with vLLM itself rather than with any particular model.
To reproduce:
python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m
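Then send a completion request to the server. A minimal sketch is below; the prompt, max_tokens, and port are illustrative assumptions, and only the temperature value matters for reproducing the issue:

curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "facebook/opt-125m",
        "prompt": "San Francisco is a",
        "max_tokens": 7,
        "temperature": 1e-4
    }'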
With temperature:
1e-3: Generates " great place to live. I"
1e-4: Generates "\\\\\\\"
1e-5: Generates "\\\\\\\"
1e-6: Generates " great place to live. I"