llm-as-judge Search Results

550 results for llm-as-judge

Qiskit/qiskit #13285

Add an efficient circuit synthesis for a special pattern `ex…

### What should we add? [1] proposed an efficient way to synthesize the following special pattern of two-body terms (`IZZ` and `ZZI`) and three-body term (`ZZZ`) (Fig. 2 [1]). It can halve the number…

t-imamichi updated 1 day ago
13
sail-sg/oat #10

Reproducing the results of SimPO and DPO

Hi, thanks for your great work. I tried to reproduce the results of offline DPO and offline SimPO, and I found that the reproduced results are better than the results in the paper. For example, for the resul…

lucasliunju updated 1 day ago
9
defenseunicorns/leapfrogai #723

EPIC: RAG Evaluations MVP

# RAG Evaluations MVP ## Description LFAI needs a framework for evaluations in order to: - Validate the efficacy of RAG at all stages - Make model recommendations for various scenarios - Establish a …

jalling97 updated 1 month ago
1
langchain-ai/langsmith-docs #342

Support my own judge model? --custom judge model

Hi there, I am wondering whether the llm-as-a-judge evaluation from LangSmith supports using my own custom model as a judge. I wish to develop custom prompts for my own judge model through LangSmith. …

aiyinyuedejustin updated 2 months ago
1
meta-introspector/tpUlysses #1

sample the runtime

An expert in TPU compiler writing can potentially introduce sampling techniques into programs for specific purposes. Here's a breakdown of the concept: **Sampling for TPU Programs:** * **Expert-…

jmikedupont2 updated 8 months ago
28
mdn/yari #9208

MDN can now automatically lie to people seeking technical in…

### Summary MDN's new "ai explain" button on code blocks generates human-like text that may be correct by happenstance, or may contain convincing falsehoods. this is a strange decision for a techn…

eevee updated 1 year ago
115
open-compass/opencompass #1379

How to set temperature when using an LLM as judge?

### Describe the feature Do I need to set temperature = 0 when I use an LLM as judge? Otherwise, the score is different every time. ### Will you implement it? - [ ] I would like to implement th…

may012345 updated 3 months ago
2
Kipok/NeMo-Skills #185

error while reproducing results (llm_math_judge)

hi, i've followed the steps indicated in `reproducing-results.md`. For generating the greedy results i did run only math and gsm8k with ```ns eval \ --cluster=local \ --model=/workspace/…

yld3 updated 1 month ago
6
freelawproject/foresight #22

Add sentiment analysis to oral argument transcripts

Now that we've got about 100k transcripts in our oral argument collection, perhaps a next step would be to add sentiment analysis. I think this is pretty easy stuff these days either through an AI cal…

mlissner updated 1 month ago
19
intel-analytics/ipex-llm #11424

vLLM freezes with gpu-memory-utilization > 0.55

Running vllm according to instructions. Docker segfaults at startup, so I'm running straight on the machine. Starting server with the following shell script. As you can see I've tried to turn max…

nathanodle updated 3 months ago
4
