junhwi / next-gen-ai

24/10/20 #46

Open junhwi opened 1 month ago

junhwi commented 1 month ago

JudgeBench: A Benchmark for Evaluating LLM-based Judges

https://arxiv.org/abs/2410.12784

shylee2021 commented 1 month ago

OpenAI Swarm https://github.com/openai/swarm

Thinking LLMs: General Instruction Following with Thought Generation https://arxiv.org/abs/2410.10630v1