llm-evaluation Search Results

1000+ results for llm-evaluation

Best match

explodinggradients/ragas #955

[R-254] Issue in Evaluation using local LLM

[ ] I checked the [documentation](https://docs.ragas.io/) and related resources and couldn't find an answer to my question. **Your Question** > “WARNING:ragas.llms.output_parser:Failed to parse …

sheetalkamthe55 updated 2 weeks ago
2
strickvl/mlops-dot-systems #10

posts/2024-06-25-evaluation-finetuning-manual-dataset

# Alex Strick van Linschoten - How to think about creating a dataset for LLM finetuning evaluation I summarise the kinds of evaluations that are needed for a structured data generation task. [https:…

utterances-bot updated 1 week ago
3
dynamic-superb/dynamic-superb #208

[Task] Interactive Data Analysis

# Task Name Interactive Data Analysis ## Task Objective Interactive Data Analysis, a collaboration between humans and Large Language Model (LLM) agents, enables real-time data exploration for…

Nan-Huo updated 6 days ago
4
tamlhp/awesome-machine-unlearning #46

New paper(s) on machine unlearning

1. mismatched machine unlearning Title: Decoupling the Class Label and the Target Concept in Machine Unlearning arXiv: https://arxiv.org/abs/2406.08288 2. evaluation of LLM unlearning Title: Unl…

ZFancy updated 5 hours ago
1
explodinggradients/ragas #1057

Exception in thread Thread-4: `asyncio.exceptions.CancelledE…

I have checked the [documentation](https://docs.ragas.io/) and related resources and couldn't resolve my bug. **Describe the bug** I am trying to run the template code from the GitHub README page.…

larry-ziyue-yin updated 1 hour ago
2
stevenyangyj/Emma-Alfworld #5

About code release plan

Thank you for your great work on LLM agents. I would like to know when you will release all of the code (including the implementation and evaluation code). Thank you.

eeaurora updated 1 week ago
1
promptfoo/promptfoo #1039

Is there a way to run asserts/evaluations dynamically based …

So I'm trying to evaluate an LLM response ad hoc. I have multiple asserts like: A: Check enum is in results for "Input A" in prompt B: Check result is SQL for Input B C: Check there is LI…

SysOverdrive updated 3 days ago
1
explodinggradients/ragas #1052

Parallelization conflict in evaluate function

When I run `evaluate` with any model of VertexAI, I get several warnings that say > Gapic client context issue detected.This can occur due to parallelization. And sometimes the execution of eva…

aeronesto updated 1 week ago
2
codingburgas/chatbot-app-cpi-atesh #29

Model evaluation

slavyolov updated 1 day ago
2
strickvl/mlops-dot-systems #13

posts/2024-07-01-full-finetuned-model-evaluation

# Alex Strick van Linschoten - My finetuned models beat OpenAI’s GPT-4 Finetunes of Mistral, Llama3 and Solar LLMs are more accurate for my test data than OpenAI’s models. [https://mlops.systems/pos…

utterances-bot updated 3 days ago
2
