-
As a user, I would like to be informed about the summarization effectiveness of my chosen LLM endpoint.
I would like to be able to evaluate an endpoint against a known, tested framework, to evaluat…
-
Hello,
I would like to ask how to create an evaluation dataset.
When I directly run `python evaluate_generation_model.py --model_path ../../LLM_Models/poison-7b-SUDO- --token SUDO --report_path ./…
-
# llm run for step evaluation
prompts, prompts_span = self.value_preprocess(valid_solvers)
After executing this line, `prompts` always ends up as `[]` and `prompts_span` as an all-zeros list, which makes the tr…
-
Hi there,
I am wondering whether the LLM-as-a-judge evaluation from LangSmith supports using my own custom model as a judge.
I wish to develop custom prompts for my own judge model through LangSmith. …
-
## User story
1. As a data engineer,
2. I want / need to implement and automate the calculation of key performance metrics
3. So that we can iteratively evaluate the performance of our LLM in answerin…
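The metric automation this story describes could be sketched as follows; the choice of token-level F1 as the metric and all function names here are illustrative assumptions, not details from the story itself:

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a model answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

# Aggregate the metric over a small illustrative evaluation set.
pairs = [("the cat sat", "a cat sat down"), ("42", "42")]
mean_f1 = sum(token_f1(p, r) for p, r in pairs) / len(pairs)
print(round(mean_f1, 4))
```

Wrapping this in a scheduled job (or a CI step) would cover the "automate" part of the story; the metric itself can be swapped for whatever the team settles on.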
-
**Describe the bug**
I tried using RAGAS with a model that is not OpenAI. In general whatever model I use I get this error back:
```
File …
```
-
We need data that we can use to evaluate our models according to some evaluation metric (#5) during initial development.
This will most likely be some form of (query, relevant results) pairs. These…
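The (query, relevant results) pairs mentioned above could look like the sketch below, paired with a simple recall@k computation; the field names, document ids, and the choice of recall@k are assumptions for illustration, not decisions from the note:

```python
# Minimal sketch of (query, relevant results) evaluation data.
eval_set = [
    {"query": "reset my password", "relevant": {"doc_12", "doc_40"}},
    {"query": "cancel subscription", "relevant": {"doc_7"}},
]

def recall_at_k(retrieved: list, relevant: set, k: int) -> float:
    """Fraction of the relevant ids found among the top-k retrieved ids."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)

# Example: a retriever returned these ids for the first query.
print(recall_at_k(["doc_40", "doc_3", "doc_12"], {"doc_12", "doc_40"}, k=2))
```

Keeping the relevant ids as a set makes the metric order-insensitive on the label side while still rewarding retrievers that rank relevant documents early.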
-
[x] I have checked the [documentation](https://docs.ragas.io/) and related resources and couldn't resolve my bug.
**Describe the bug**
Missing `run_config` arguments in the `evaluate` function in module …
-
1. mismatched machine unlearning
Title: Decoupling the Class Label and the Target Concept in Machine Unlearning
arXiv: https://arxiv.org/abs/2406.08288
2. evaluation of LLM unlearning
Title: Unl…