-
[X] I checked the [documentation](https://docs.ragas.io/) and related resources and couldn't find an answer to my question.
**Your Question**
What is unclear to you? What would you like to know?
…
-
PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization. PandaLM is the first to evaluate LLMs using a fine-tuned LLM.
-
As mentioned in the paper - "Furthermore, we also invite some expert annotators to label task planning for some complex requests (46 examples) as a high-quality human annotated dataset. We also plan t…
-
A single issue to track progress until the next release and to collaborate on creating the right issues. Feel free to edit this issue or comment on changes.
Scope of Next Release:
- [ ] For each paper cr…
-
### Current Behavior
When following the LangChain instructions from the docs for a custom LLM, I'm getting:
```
File "gptcache/processor/pre.py", line 20, in last_content
return data.get("m…
-
### Describe the issue
First of all, thank you for your great contributions.
I have a similar question to [issue 146](https://github.com/microsoft/LLMLingua/issues/146): I cannot reproduce the…
-
### Checklist
- [x] 1. I have searched related issues but cannot get the expected help.
- [x] 2. The bug has not been fixed in the latest version.
- [x] 3. Please note that if the bug-related issue y…
-
Hi, thanks for your nice work. I have a question about reproducing the driving score shown in the paper. I run the evaluation with the following configurations:
```
preception_model = 'memfuser_…
```
-
I encountered an issue while evaluating a dataset using the ragas library with a LangChain LLM and Sentence Transformers embeddings. The process throws an exception during execution.
**Steps to Repr…
-
Today, the [Python Evaluation building block](https://aka.ms/azai/eval) can be used against a .NET backend that uses the Chat Protocol (Azure Search supports this). However, we know from customer feed…