-
With the initial text generation metric PR, if an LLM provides an invalid response for one of our LLM-guided metrics (wrongly formatted, wrong data type, etc.), then Valor will raise an error and the re…
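A common mitigation for malformed LLM replies is to retry generation a bounded number of times before surfacing the error. A minimal, dependency-free sketch, assuming a hypothetical validator and a stub LLM (none of these names come from Valor itself):

```python
def call_with_retries(llm, prompt, parse, max_attempts=3):
    """Call `llm` and validate the reply with `parse`, retrying on failure.

    `parse` should raise ValueError when the reply is malformed (wrong
    format, wrong data type, etc.); the last error is re-raised once the
    attempts are exhausted.
    """
    last_err = None
    for _ in range(max_attempts):
        reply = llm(prompt)
        try:
            return parse(reply)
        except ValueError as err:
            last_err = err
    raise last_err

# Stub LLM that answers badly once, then returns a well-formed score.
replies = iter(["not-a-number", "0.8"])
llm = lambda prompt: next(replies)
score = call_with_retries(llm, "Rate coherence from 0 to 1:", float)
print(score)  # 0.8
```

The retry cap keeps a persistently misbehaving model from looping forever, while transient formatting slips no longer abort the whole evaluation.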
-
When using llm-foundry for model evaluation, multi-GPU mode does not work.
The source code is here: https://github.com/mlfoundations/open_lm/blob/main/eval/eval_openlm_ckpt.py
-
Lastly, I was looking at the `YAML` files for the **UnifyAI** tools, and I had a few ideas that might help:
**Here are my thoughts 🤔**:
- `evaluate_llm_tool.yaml:`
It seems like the `prompt…
-
## Summary
This template is intended to capture a few baseline requirements that must be met before filing a PR that contains a new blog post submission.
Please fill out this form in its…
-
Evaluation failed with `'CustomOllama' object has no attribute 'set_run_config'`. What is the solution?
Ragas Version: 0.1.7
**Code Examples**
# Define a simple dataset using Pandas DataFrame
data…
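The `AttributeError` suggests the custom wrapper does not implement `set_run_config`, which the Ragas evaluator calls on every LLM it is handed; in Ragas 0.1.x, wrapping the model with `ragas.llms.LangchainLLMWrapper` (which already defines it) is usually the simpler route. A dependency-free sketch of the duck-typed fix — `CustomOllama` and the `RunConfig` stand-in below are illustrative, not Ragas' actual classes:

```python
class RunConfig:
    """Illustrative stand-in for ragas' run configuration object."""
    def __init__(self, timeout=60, max_retries=10):
        self.timeout = timeout
        self.max_retries = max_retries

class CustomOllama:
    """Sketch of a custom LLM wrapper that satisfies the evaluator's contract."""
    def __init__(self):
        self.run_config = None

    def set_run_config(self, run_config):
        # The evaluator calls this before scoring; storing the config is
        # enough to avoid the AttributeError.
        self.run_config = run_config

llm = CustomOllama()
llm.set_run_config(RunConfig(timeout=120))
print(llm.run_config.timeout)  # 120
```

Subclassing Ragas' own base LLM class instead of duck-typing would also inherit a working `set_run_config`.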
-
**Describe the bug**
I want to use local LLMs to evaluate my RAG app. I have tried Ollama and HuggingFace models, but neither of them works.
Ragas version: 0.1.11
Python version: 3.11.3
**…
-
## Description
To enhance Inbox Zero's capability in handling PDF documents, particularly receipts and potentially more complex documents like pitch decks, we need to research and implement effective…
-
### Feature Description
The most popular LLM providers, such as OpenAI, support candidate generation, i.e. generating n responses for the same prompt. This feature can be used in RAG, evaluations, and mo…
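OpenAI's chat completions API exposes this through the `n` parameter (one prompt, `n` choices). Once the candidates are back, a downstream scorer can rerank them. A self-contained sketch with a stub sampler standing in for the API call — the scorer and sampler here are illustrative, not part of any provider SDK:

```python
import random

def generate_candidates(prompt, n, sampler):
    """Draw n independent completions of one prompt from `sampler`."""
    return [sampler(prompt) for _ in range(n)]

def best_candidate(candidates, score):
    """Pick the highest-scoring candidate, e.g. for RAG answer selection."""
    return max(candidates, key=score)

# Stub sampler standing in for a real call such as
# client.chat.completions.create(..., n=n) and reading its `choices`.
rng = random.Random(0)
sampler = lambda prompt: f"{prompt} -> answer #{rng.randint(1, 100)}"

cands = generate_candidates("What is retrieval?", 3, sampler)
best = best_candidate(cands, len)  # toy scorer: prefer the longest answer
print(len(cands))  # 3
```

In evaluation settings the same pattern supports self-consistency checks: score all `n` candidates and report agreement rather than a single sample.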
-
I've been working on evaluating how well LLMs can handle bioimaging tasks relative to the complexity of the task.
First, we can see that different tasks have different probabilities of being easily…
-
I need code that uses LlamaIndex with a bearer token and base URL, not LangChain.
from langchain_community.vectorstores import FAISS
from langchain_community.vectorstores import Chroma
from langcha…
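In LlamaIndex, the usual wrapper for an OpenAI-compatible endpoint with a custom base URL and token is the `OpenAILike` LLM class, which accepts `api_base` and `api_key` arguments. The underlying requirement — a POST to `/chat/completions` carrying an `Authorization: Bearer …` header — can be sketched with the standard library alone; the URL, token, and model name below are placeholders:

```python
import json
import urllib.request

def build_chat_request(base_url, bearer_token, model, prompt):
    """Build a POST request for an OpenAI-compatible /chat/completions endpoint."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    req = urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=payload,
        method="POST",
    )
    # The bearer token travels in the Authorization header.
    req.add_header("Authorization", f"Bearer {bearer_token}")
    req.add_header("Content-Type", "application/json")
    return req

# Placeholder endpoint and token; sending this requires a live server,
# e.g. resp = urllib.request.urlopen(req)
req = build_chat_request("https://example.com/v1", "my-token", "my-model", "Hello")
print(req.get_header("Authorization"))  # Bearer my-token
```

Whatever wrapper class is used, these are the two pieces it must be configured with: the base URL of the endpoint and the token for the `Authorization` header.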