evaluate-llm Search Results

1000+ results
for evaluate-llm


AILab-CVC/SEED-Bench #12

VLMs vs LLMs evaluation

Hello 👋 First of all thank you for the great work and evaluation results! I have understood that in many cases you predicted outputs for each question based on the choice that minimizes the loss…

idan-tankel updated 11 months ago
1
AkihikoWatanabe/paper_notes #1477

On The Planning Abilities of OpenAI's o1 Models: Feasibility…

# URL - https://www.arxiv.org/abs/2409.19924 # Affiliations - Kevin Wang, N/A - Junbo Li, N/A - Neel P. Bhatt, N/A - Yihan Xi, N/A - Qiang Liu, N/A - Ufuk Topcu, N/A - Zhangyang Wang, N/…

AkihikoWatanabe updated 3 weeks ago
1
fiatrete/OpenDAN-Personal-AI-OS #90

Open source LLM performance evaluation

I will list the test results of various open-source models here. You can refer to these data to select models and configure devices. Of course, the evaluation of LLM is quite subjective. I also sugges…

streetycat updated 9 months ago
21
llm-jp/experiments #60

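[Evaluation] - Integrated evaluation with llm-jp-eval 1.4.1

# Overview An integrated experiment for running llm-jp-eval 1.4.1 on various models. # Details ## Experiment procedure 1. Prepare a Hugging Face-format checkpoint of the model you want to evaluate. 1. Post the checkpoint path and the evaluation task name as a comment on this issue. 1. @odashi, on the sakura side, the evaluation experiment…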

odashi updated 1 month ago
1
ansible-community/ansible-london-meetup #90

Let's use RAG for Ansible coding and say goodbye to tedious …

## Talk title Let's use RAG for Ansible coding and say goodbye to tedious tasks! ## Talk Description Writing comprehensive documentation and extensive Molecule test cases is essential when bu…

benfab updated 3 weeks ago
1
AkihikoWatanabe/paper_notes #1517

A Large-Scale Study of Relevance Assessments with Large Lang…

# URL - https://arxiv.org/abs/2411.08275 # Authors - Shivani Upadhyay - Ronak Pradeep - Nandan Thakur - Daniel Campos - Nick Craswell - Ian Soboroff - Hoa Trang Dang - Jimmy Lin # Abst…

AkihikoWatanabe updated 1 week ago
3
explodinggradients/ragas #1423

Error: Exception raised in Job[0]: APIConnectionError(Connect…

[ ] I have checked the [documentation](https://docs.ragas.io/) and related resources and couldn't resolve my bug. **Describe the bug** I set up a definition to split the input Dataset into several…

francescofan updated 1 month ago
9
NERC-CEH/plankton_ml #44

Alternative vector store to ChromaDB?

Been wondering about alternative vector stores - we picked Chroma because it looked very simple API-wise, sqlite-based and there were lots of examples for integrating it with LangChain in the [LLM eva…

metazool updated 1 week ago
1
AIHawk-FOSS/Auto_Jobs_Applier_AI_Agent #663

Feature request #1 interview_prep

### Feature summary HawkAI_preps ### Feature description **#1** Voice assistant to ask questions based on required skills or experience. **#2** Difficulty levels based mock evaluations. ### M…

gtgtian updated 1 week ago
3
triton-inference-server/tensorrtllm_backend #624

Garbage response when input tokens is longer than 4096 on Ll…

### System Info NVIDIA A100 40 GB ### Who can help? @byshiue @ka ### Information - [X] The official example scripts - [ ] My own modified scripts ### Tasks - [X] An officially supported task in…

winstxnhdw updated 4 days ago
2
