-
Issue to track research/validation/testing around 'LLM as a Judge'
- Grouse
  - https://github.com/illuin-tech/grouse?tab=readme-ov-file
  - https://arxiv.org/abs/2409.06595
- Unsorted
  - https://came…
-
https://eugeneyan.com/writing/llm-evaluators/
-
### Feature request
Support an LLM-guided Self-Refinement MCTS inference method. It has the following features (a minimal sketch of the inner loop follows the list):
- LLM-as-Judge to provide a review
- Proposer LLM generates a rewrite of the answer, taki…
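
A minimal sketch of the review-then-rewrite inner loop, assuming hypothetical `judge_review` and `proposer_rewrite` helpers to be backed by real model calls; a full MCTS would branch over several candidate rewrites per node rather than follow a single path:

```python
def judge_review(question: str, answer: str) -> str:
    """LLM-as-Judge: return a critique of the current answer (placeholder)."""
    raise NotImplementedError  # call the judge model here

def proposer_rewrite(question: str, answer: str, review: str) -> str:
    """Proposer LLM: rewrite the answer, taking the review into account (placeholder)."""
    raise NotImplementedError  # call the proposer model here

def refine(question: str, answer: str, rounds: int = 3) -> str:
    # One rollout of the search: alternate judge review and proposer rewrite.
    # A full MCTS would expand several candidate rewrites per node and
    # back up judge scores to decide which branch to explore next.
    for _ in range(rounds):
        review = judge_review(question, answer)
        answer = proposer_rewrite(question, answer, review)
    return answer
```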
-
I wanted to document what I think a good MVP state would be for this repo:
1. Spin up a unique server for a single LLM
2. Verify the server is running correctly (see the sketch after this list)
3. Create a base for the LLM to build on
…
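
For step 2, a minimal health-check sketch, assuming the server exposes an OpenAI-compatible `/v1/models` route; the address and route are assumptions, not something this repo guarantees:

```python
import requests

BASE_URL = "http://localhost:8000"  # assumed address of the single-LLM server

def server_is_healthy(base_url: str = BASE_URL) -> bool:
    """Return True if the server answers its model-listing route."""
    try:
        resp = requests.get(f"{base_url}/v1/models", timeout=5)
        return resp.status_code == 200
    except requests.RequestException:
        return False

if __name__ == "__main__":
    print("server up:", server_is_healthy())
```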
-
## Issue encountered
It would be good to have a system for evaluating both the relevance of the retrieved RAG context and how the LLM uses it when producing the response. My first intuition would be a multi-stage system …
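
A minimal sketch of such a multi-stage system, with a hypothetical `judge` callable standing in for whatever LLM client is used; stage one checks retrieval relevance, stage two checks whether the response is grounded in the retrieved context:

```python
def judge(prompt: str) -> str:
    """Placeholder for a call to any LLM judge; swap in a real client."""
    raise NotImplementedError

def retrieval_relevance(question: str, context: str) -> str:
    # Stage 1: is the retrieved context relevant to the question at all?
    return judge(
        f"Question: {question}\nRetrieved context: {context}\n"
        "Is the context relevant to the question? Answer YES or NO."
    )

def groundedness(context: str, answer: str) -> str:
    # Stage 2: did the LLM actually use the context, i.e. is every claim
    # in the answer supported by the retrieved context?
    return judge(
        f"Retrieved context: {context}\nAnswer: {answer}\n"
        "Is every claim in the answer supported by the context? Answer YES or NO."
    )
```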
-
Hi @Psycoy
I need the option to evaluate the benchmark with an open-source model as the LLM judge.
~~How can I do that? If this is not possible, shall we work on a PR?~~
I have started a PR:…
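
A model-agnostic judge sketch that would work with any open-source model, assuming a hypothetical `generate(prompt) -> str` callable wraps the chosen model:

```python
import re

JUDGE_PROMPT = (
    "You are an impartial judge. Rate the answer to the question below on a "
    "1-5 scale for correctness and helpfulness. Reply with only the number.\n\n"
    "Question: {question}\nAnswer: {answer}"
)

def judge_score(generate, question: str, answer: str) -> int | None:
    """Score an answer with any judge model exposed as generate(prompt) -> str."""
    reply = generate(JUDGE_PROMPT.format(question=question, answer=answer))
    match = re.search(r"[1-5]", reply)
    return int(match.group()) if match else None
```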
-
## Summary
This template is intended to capture a few base requirements that need to be met before filing a PR that contains a new blog post submission.
Please fill out this form in its…
-
**Description**
Hi team,
I am exploring Evidently AI for LLM evaluation and came across the custom LLM-as-a-Judge Descriptor, which I am particularly interested in. The current API only allows o…
-
@haileyschoelkopf @lintangsutawika @baberabb
The following is a list of TODOs to implement LLM-as-a-Judge in Eval-Harness:
**TLDR**
* Splits the existing `evaluate` function into `classification_e…
-
### 🚀 Feature
In addition to the hardcoded models, add support for using local models as judges for evaluation. This can be simplified by requiring only an OpenAI-compatible API.
It should basically be an endpoint selection,…
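
A minimal sketch of that endpoint selection, assuming the judge is served behind an OpenAI-compatible API; the URL and model name are placeholders:

```python
from openai import OpenAI

# Any OpenAI-compatible server works here (vLLM, llama.cpp, Ollama, ...).
client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local judge endpoint
    api_key="not-needed-locally",         # most local servers ignore the key
)

response = client.chat.completions.create(
    model="local-judge-model",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are an evaluation judge."},
        {"role": "user", "content": "Rate this answer from 1 to 5: ..."},
    ],
)
print(response.choices[0].message.content)
```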