fixie-ai / thefastest.ai

Website with current metrics on the fastest AI models.
MIT License

OVHcloud AI Endpoints benchmark #24

Closed Joffref closed 3 months ago

Joffref commented 3 months ago

OVHcloud recently released a new product called AI Endpoints.

This product offers off-the-shelf LLMs through an OpenAI-compatible API.

For example, you can call Mixtral-8x22b using the following code snippet:

curl -X 'POST' \
  'https://mixtral-8x22b-instruct-v01.endpoints.kepler.ai.cloud.ovh.net/api/openai_compat/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "max_tokens": 100,
  "messages": [
    {
      "content": "Hello, how are you today?",
      "name": "John",
      "role": "user"
    }
  ]
}'
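
For anyone who would rather script this than use curl, here is a rough Python sketch of the same request using only the standard library. The URL, headers, and JSON body are copied from the curl call above; the function names `build_chat_request` and `send_chat_request` are made up for illustration, and the shape of the response is not assumed beyond it being JSON.

```python
import json
import urllib.request

# Endpoint copied verbatim from the curl example above.
OVH_CHAT_URL = (
    "https://mixtral-8x22b-instruct-v01.endpoints.kepler.ai.cloud.ovh.net"
    "/api/openai_compat/v1/chat/completions"
)


def build_chat_request(prompt: str, max_tokens: int = 100) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl call."""
    payload = {
        "max_tokens": max_tokens,
        "messages": [{"content": prompt, "name": "John", "role": "user"}],
    }
    return urllib.request.Request(
        OVH_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `send_chat_request(build_chat_request("Hello, how are you today?"))` should be equivalent to the curl command, assuming the endpoint is still reachable.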

It'd be amazing to see it benchmarked! I'd love to contribute to make this happen. How can I help?

Joffref commented 3 months ago

For more context, there are the following models exposed through this service:

juberti commented 3 months ago

If it's OpenAI-compatible, then all that's needed is to add entries for the desired models (I'd suggest the Llama 3 variants to start) to llm_benchmark_suite.py and to generate an API key that we can use in our runner.
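
As a rough illustration of what a per-model entry needs to carry, here is a sketch that derives the OpenAI-compatible base URL from a model slug. It assumes every model follows the same host pattern as the Mixtral URL in the curl example, which is not confirmed for other models. The helper names and the dict layout are hypothetical and are not the actual structure used in llm_benchmark_suite.py.

```python
# Host suffix and API path taken from the Mixtral URL in the curl example.
# Assumption: other models are served under the same pattern.
OVH_HOST_SUFFIX = "endpoints.kepler.ai.cloud.ovh.net"
OVH_API_PATH = "/api/openai_compat/v1"


def ovh_base_url(model_slug: str) -> str:
    """Return the assumed OpenAI-compatible base URL for a model slug."""
    return f"https://{model_slug}.{OVH_HOST_SUFFIX}{OVH_API_PATH}"


def ovh_entry(model_slug: str) -> dict:
    """Hypothetical benchmark entry; the real suite's format may differ."""
    return {
        "provider": "ovh.net",
        "model": model_slug,
        "base_url": ovh_base_url(model_slug),
    }


# The only slug known from this thread is the Mixtral one.
ENTRIES = [ovh_entry("mixtral-8x22b-instruct-v01")]
```

The slugs for the Llama 3 variants would need to be looked up on the service itself rather than guessed.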

juberti commented 3 months ago

Hmm, looks like an API key isn't actually needed. Added in https://github.com/fixie-ai/ai-benchmarks/pull/78; it should start showing up in stats tomorrow (2024-06-12).

Joffref commented 2 months ago

Awesome work here! I've been watching those metrics daily for a few days now. Do you have any idea why clicking on ovh.net redirects me to a sort of speed test instead of the official website? :+1:

juberti commented 2 months ago

It's just sending you to https://ovh.net. If there's another link we should use instead, just LMK.

Joffref commented 2 months ago

I think you should use https://endpoints.ai.cloud.ovh.net/ instead.

juberti commented 2 months ago

Care to make a PR? Just change line 207 here: https://github.com/fixie-ai/ai-benchmarks/pull/78/files