[datasets][use-case] using different LLMs

Arize-ai / phoenix

AI Observability & Evaluation

https://docs.arize.com/phoenix

Closed by axiomofjoy 17 hours ago

axiomofjoy commented 1 week ago

Comparing the performance of different LLMs on a fixed prompt, e.g., GPT-3.5 vs. GPT-4.
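
For illustration only, a minimal sketch of what such a comparison might look like. It does not use Phoenix's datasets API: `compare_models`, `exact_match`, and the two stub models are hypothetical names, and the stubs stand in for real GPT-3.5 and GPT-4 clients. The only idea taken from this issue is holding the prompt fixed while swapping the model.

```python
from typing import Callable, Dict, List

# Fixed prompt shared by every model under comparison.
PROMPT_TEMPLATE = "Answer in one word. {question}"

# Tiny hand-written dataset (hypothetical, for illustration only).
EXAMPLES: List[Dict[str, str]] = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "What is 2 + 2?", "answer": "4"},
]


def stub_model_a(prompt: str) -> str:
    """Hypothetical stand-in for a weaker model: always gives the same answer."""
    return "Paris"


def stub_model_b(prompt: str) -> str:
    """Hypothetical stand-in for a stronger model: answers both questions."""
    return "Paris" if "France" in prompt else "4"


def exact_match(output: str, expected: str) -> float:
    """Return 1.0 if output equals expected, ignoring case and outer whitespace."""
    return float(output.strip().lower() == expected.strip().lower())


def compare_models(
    models: Dict[str, Callable[[str], str]],
    prompt_template: str,
    examples: List[Dict[str, str]],
) -> Dict[str, float]:
    """Run every model on the same prompt and examples; return mean score per model."""
    if not examples:
        raise ValueError("examples must not be empty")
    scores: Dict[str, float] = {}
    for name, model in models.items():
        total = 0.0
        for example in examples:
            prompt = prompt_template.format(question=example["question"])
            total += exact_match(model(prompt), example["answer"])
        scores[name] = total / len(examples)
    return scores


MODELS = {"model-a": stub_model_a, "model-b": stub_model_b}
```

Calling `compare_models(MODELS, PROMPT_TEMPLATE, EXAMPLES)` returns one mean score per model name. In a real notebook the stubs would be replaced by actual model calls, and the exact-match scorer by whatever evaluator suits the task.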

mikeldking commented 5 days ago

This can be tacked on at the end of a different notebook, I think.

mikeldking commented 17 hours ago

It's in other notebooks.