Open merplumander opened 3 weeks ago
Based on the information above, we can run an evaluation on data starting from August 1st:
After a first test, we decided to exclude gpt-4-turbo (too expensive), claude-3-5-haiku (worse than claude-3-5-sonnet), and llama3.2-11b (as we already have the 90b version).
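To make the cut-off reasoning concrete, here is a minimal sketch of how models could be sorted by whether their knowledge cut-off precedes the evaluation start date. Everything in it is an assumption for illustration: the `ModelInfo` class, the `split_by_cutoff` helper, the placeholder model names and cut-off dates, and the year of the start date (only "August 1st" is stated here). Models without a published cut-off are kept in a separate bucket rather than guessed.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class ModelInfo:
    name: str
    knowledge_cutoff: Optional[date]  # None when the provider does not publish it


def split_by_cutoff(
    models: List[ModelInfo], eval_start: date
) -> Tuple[List[str], List[str], List[str]]:
    """Partition model names into (usable, leaking, unknown) relative to eval_start."""
    usable, leaking, unknown = [], [], []
    for m in models:
        if m.knowledge_cutoff is None:
            unknown.append(m.name)   # cut-off not documented, needs a manual decision
        elif m.knowledge_cutoff < eval_start:
            usable.append(m.name)    # evaluation data postdates the training data
        else:
            leaking.append(m.name)   # model may have seen the evaluation data
    return usable, leaking, unknown


# Placeholder entries only: these names and dates are NOT real values.
models = [
    ModelInfo("model-a", date(2023, 10, 1)),
    ModelInfo("model-b", date(2024, 9, 1)),
    ModelInfo("model-c", None),
]
EVAL_START = date(2024, 8, 1)  # the year is an assumption

usable, leaking, unknown = split_by_cutoff(models, EVAL_START)
print("usable:", usable, "leaking:", leaking, "unknown:", unknown)
```

Keeping "unknown" as its own bucket means a model with undocumented cut-off information is neither silently included nor silently dropped.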
Language Model Overview
OpenAI
Logprobs: yes
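Where logprobs are available, one possible use is to turn the log-probabilities of a few candidate answers into a normalized distribution. The sketch below assumes the per-option logprobs have already been extracted from the API response (for OpenAI chat completions this would mean requesting `logprobs=True` together with `top_logprobs`). The `normalize_logprobs` helper and the example values are hypothetical, not real model output.

```python
import math
from typing import Dict


def normalize_logprobs(option_logprobs: Dict[str, float]) -> Dict[str, float]:
    """Convert per-option log-probabilities into probabilities that sum to 1."""
    # Subtract the maximum before exponentiating for numerical stability.
    max_lp = max(option_logprobs.values())
    weights = {k: math.exp(lp - max_lp) for k, lp in option_logprobs.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical logprobs for two answer options (illustrative values only).
example = {"Yes": -0.2, "No": -1.8}
probs = normalize_logprobs(example)
print(probs)
```

Renormalizing over only the options of interest discards probability mass the model put on other tokens, so the result is a relative preference between the options rather than the model's full output distribution.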
Anthropic
Logprobs: no
Gemini
Problem: Knowledge cut-off information not available
Logprobs: no
Llama
Grok
grok-beta
No further information available without X-premium.
Mistral
Logprobs: no (not available from the API)
Qwen
Very good documentation for the open-source downloadable models, but almost none for the commercial models available via the API. It would be great to run the open-source versions, but they are too large to run without additional resources.