llm-as-judge Search Results

548 results for llm-as-judge

Best match


AkihikoWatanabe/paper_notes #1304

Benchmarking Large Language Models for News Summarization, T…

# URL - https://arxiv.org/abs/2301.13848 # Affiliations - Tianyi Zhang, N/A - Faisal Ladhak, N/A - Esin Durmus, N/A - Percy Liang, N/A - Kathleen McKeown, N/A - Tatsunori B. Hashimoto, N/A…

AkihikoWatanabe updated 6 months ago
1
mlflow/mlflow #11384

ValueError: Argument `prompt` is expected to be a string. In…

### Issues Policy acknowledgement - [X] I have read and agree to submit bug reports in accordance with the [issues policy](https://www.github.com/mlflow/mlflow/blob/master/ISSUE_POLICY.md) ### Where…

TanzeelAbbas updated 8 months ago
2
junhwi/next-gen-ai #6

24/01/03

SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling https://arxiv.org/abs/2312.15166 Fast Inference of Mixture-of-Experts Language Models with Offloading https://pa…

junhwi updated 10 months ago
2
InflectionAI/Inflection-Benchmarks #2

Discrepancy in the reported percentage of flawed questions i…

Hi, I was reading the article [Inflection-2.5: meet the world's best personal AI](https://inflection.ai/inflection-2-5), and in the article it was mentioned that `nearly 25%—of examples in the reason…

jerilkuriakose updated 8 months ago
3
irthomasthomas/undecidability #882

“Emergent” abilities in LLMs actually develop gradually and …

- [ ] [“Emergent” abilities in LLMs actually develop gradually and predictably – study | Hacker News](https://news.ycombinator.com/item?id=39811155) # "Emergent" abilities in LLMs actually develop gr…

ShellLM updated 3 months ago
1
AkihikoWatanabe/paper_notes #1212

Self-Rewarding Language Models, Weizhe Yuan+, N/A, arXiv'24

# URL - https://arxiv.org/abs/2401.10020 # Affiliations - Weizhe Yuan, N/A - Richard Yuanzhe Pang, N/A - Kyunghyun Cho, N/A - Sainbayar Sukhbaatar, N/A - Jing Xu, N/A - Jason Weston, N/A #…

AkihikoWatanabe updated 10 months ago
1
huggingface/lighteval #139

[New Task] Add AlpacaEval LC

Great library, a light library for all the main evals was really needed!💯 I just came across this [line](https://github.com/huggingface/lighteval/blob/af24080ea4f16eaf1683e353042a2dfc9099f038/src/…

YannDubs updated 6 months ago
9
zilliztech/GPTCache #652

Possible Leakage of Private Info in Semantic Cache Requests

Dear GPTCache Team, we are a security research group. We've used GPTCache for a while and are impressed by its design and speed, but as we studied further, more concerns about the security of GPTCache ha…

Unik-lif updated 2 months ago
1
X-lab2017/open-research #271

[Regular meeting] Best Practices and Lessons Learned on Synthetic Data for Language Models

### Title Best Practices and Lessons Learned on Synthetic Data for Language Models ### Link [Best Practices and Lessons Learned on Synthetic Data for Language Models.pdf](https://github.com/X-lab…

Peng99999 updated 6 months ago
1
castorini/umbrela #1

Feature specs discussion board for umBRELA

I am starting this thread for feature spec discussion for umBRELA @lintool @ronakice. Suggestions from my side: - parameter for specifying the number of samples for inference and later performin…

UShivani3 updated 6 months ago
5

