codelion / optillm

Optimizing inference proxy for LLMs
Apache License 2.0
1.6k stars 128 forks

deepseek can't support n > 1 #83

Open femto opened 2 weeks ago

femto commented 2 weeks ago

Hello, while trying bon, deepseek reports an error because deepseek doesn't support n > 1, so I added:

    except Exception:
        # Fallback: deepseek rejects n > 1, so issue n separate
        # single-completion requests instead
        for _ in range(n):
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                max_tokens=4096,
                n=1,
                temperature=1
            )
            completions.extend([choice.message.content for choice in response.choices])
            bon_completion_tokens += response.usage.completion_tokens
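For reference, the whole try/fallback pattern can be wrapped in a helper along these lines. This is a sketch, not the actual optillm code; the function name `sample_n_completions` and its signature are illustrative, and it assumes an OpenAI-style client object:

```python
def sample_n_completions(client, model, messages, n, max_tokens=4096, temperature=1):
    """Request n completions in one call, falling back to n sequential
    single-completion requests for providers (e.g. deepseek) that reject n > 1."""
    completions = []
    completion_tokens = 0
    try:
        # Fast path: one request returning n choices
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            max_tokens=max_tokens,
            n=n,
            temperature=temperature,
        )
        completions = [choice.message.content for choice in response.choices]
        completion_tokens = response.usage.completion_tokens
    except Exception:
        # Fallback path: n requests with n=1 each
        for _ in range(n):
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                max_tokens=max_tokens,
                n=1,
                temperature=temperature,
            )
            completions.extend(choice.message.content for choice in response.choices)
            completion_tokens += response.usage.completion_tokens
    return completions, completion_tokens
```

A shared helper like this would let bon, moa, mcts, and pvg all reuse the same fallback instead of duplicating the try/except in each approach.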
codelion commented 2 weeks ago

This is good. Can you also update it for the moa, mcts, and pvg approaches? It should then fix #67.

codelion commented 3 days ago

Another request for something like this, with some discussion, is here: #99