OpenFn / apollo

GNU Lesser General Public License v2.1

Compare chat models #82

Closed josephjclark closed 2 months ago

josephjclark commented 4 months ago

Prepare a suite of questions, tune prompts, compare results

Allow us to pick the one that performs the best

Maybe consider ethical issues, scale up costs, token donation
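As a rough illustration of the "suite of questions, compare results" step, a comparison harness might look something like the sketch below. Everything here is an assumption for illustration: `compare_models`, the `expected_keywords` scoring rule, and the stub models are hypothetical, and a real run would swap the stubs for calls to each provider's client. Keyword matching is only a crude stand-in for judging answer quality.

```python
from typing import Callable, Dict, List

# Hypothetical question suite: each entry pairs a prompt with keywords
# we would expect a good answer to contain.
QUESTIONS: List[dict] = [
    {"prompt": "How do I fetch data over HTTP?", "expected_keywords": ["http", "get"]},
    {"prompt": "How do I filter an array?", "expected_keywords": ["filter"]},
]


def compare_models(
    models: Dict[str, Callable[[str], str]], questions: List[dict]
) -> Dict[str, float]:
    """Return the fraction of questions each model answers acceptably."""
    scores: Dict[str, float] = {}
    for name, ask in models.items():
        passed = 0
        for question in questions:
            answer = ask(question["prompt"]).lower()
            # An answer passes only if it mentions every expected keyword.
            if all(k.lower() in answer for k in question["expected_keywords"]):
                passed += 1
        scores[name] = passed / len(questions)
    return scores


# Stub models standing in for real API clients (assumption, not real calls).
def stub_model_a(prompt: str) -> str:
    return "Use an HTTP GET request, then filter the results."


def stub_model_b(prompt: str) -> str:
    return "Try calling filter on the array."


scores = compare_models({"model-a": stub_model_a, "model-b": stub_model_b}, QUESTIONS)
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

Keeping each model behind a plain `str -> str` callable means the same suite can be rerun unchanged whenever a prompt is tuned or a new model is added.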

josephjclark commented 3 months ago

I've been using Claude 3.5 Sonnet a fair bit over the last week and I find its performance to be really good. My testing is far from exhaustive. I have a feeling GPT-4 might be slightly better, but I also think that Anthropic's values of safety and openness align better with our values. So I want to recommend Claude - at least for a while.

I want to note here that Anthropic have no embeddings API, which might be significant for RAG. They recommend voyage.ai instead.

Pricing for GPT-4 and Claude is comparable: Claude is $3/million input tokens, GPT-4 is $5/million input tokens, and they are both $15/million output tokens.
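To make the price difference concrete, here is a minimal sketch that turns the per-million-token prices quoted above into an estimated bill. The model keys, the `estimate_cost` helper, and the monthly token volumes are hypothetical and chosen purely for illustration; they are not measured usage.

```python
# Prices in USD per million tokens, as quoted in this comment.
PRICES = {
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00},
    "gpt-4": {"input": 5.00, "output": 15.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD for a given number of input and output tokens."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical monthly workload (assumption): 10M input and 2M output tokens.
monthly_input = 10_000_000
monthly_output = 2_000_000

for model in PRICES:
    cost = estimate_cost(model, monthly_input, monthly_output)
    print(f"{model}: ${cost:.2f}")
```

Because output pricing is identical, the gap between the two depends only on input volume, so prompt-heavy workloads such as RAG would feel the difference most.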

I can't work out how to contact Anthropic to discuss free tokens, although I'd suggest we get signed up and start using it within the team first.