-
## Background
We are curious to know whether ontology score correlates with performance on downstream tasks.
We could evaluate performance on downstream tasks ourselves, but as a first approximation,…
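A minimal sketch of what that first-approximation check could look like, assuming we already had paired scores per model or configuration (all names and numbers below are made up for illustration):

```python
from scipy.stats import spearmanr

# Hypothetical paired scores, one entry per model/configuration.
ontology_scores = [0.62, 0.71, 0.55, 0.80, 0.68]
downstream_scores = [0.58, 0.74, 0.51, 0.77, 0.70]  # e.g. downstream QA accuracy

# Spearman rank correlation is robust to monotone but non-linear relationships,
# which seems appropriate when the two scores live on different scales.
rho, p_value = spearmanr(ontology_scores, downstream_scores)
print(f"Spearman rho={rho:.3f}, p={p_value:.3f}")
```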
-
In attempting to follow the setup in the README, I am able to successfully call:
```
poetry poe local-infrastructure-up
```
I can then access the ZenML dashboard. However, none of the pipelines s…
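As a quick sanity check (a sketch, assuming a recent ZenML client), listing pipelines programmatically shows whether anything was ever registered with the server the dashboard is pointing at:

```python
from zenml.client import Client

client = Client()

# If this comes back empty, the pipelines were never registered with
# the server this client (and hence the dashboard) is connected to.
for pipeline in client.list_pipelines():
    print(pipeline.name)
```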
-
Beyond LLM supports 4 evaluation metrics: Context relevancy, Answer relevancy, Groundedness, and Ground truth.
We would like to add new evaluation metric support to evaluate LLM/RAG…
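As a rough illustration of the kind of pluggable metric interface we have in mind (this is not Beyond LLM's actual API; every name below is hypothetical):

```python
from typing import Protocol

class EvalMetric(Protocol):
    """Hypothetical interface a new evaluation metric plug-in would implement."""

    name: str

    def score(self, question: str, answer: str, contexts: list[str]) -> float:
        """Return a score in [0, 1] for one question/answer/contexts triple."""
        ...
```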
-
- [ ] I have checked the [documentation](https://docs.ragas.io/) and related resources and couldn't resolve my bug.
**Describe the bug**
Message 'No statements were generated from the answer' was sen…
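A minimal repro sketch, assuming the classic 0.1-era `ragas.evaluate` API (the sample data is made up):

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness

data = {
    "question": ["What is the capital of France?"],
    # Very short answers can yield no extractable statements,
    # which is when this message tends to appear.
    "answer": ["Paris."],
    "contexts": [["Paris is the capital and largest city of France."]],
}

result = evaluate(Dataset.from_dict(data), metrics=[faithfulness])
print(result)
```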
-
Moving on to the **benchmark_models** tool in `evaluate_llm_tool.py`, I had a few suggestions that might help improve it:
- Docstring Accuracy:
The docstring currently mentions a prompt_set…
-
We currently leverage some LLM-based evaluation metrics from ragas: https://github.com/explodinggradients/ragas
namely, `llm_context_precision`, `llm_context_recall`, and `llm_answer_relevance` in thi…
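For reference, a sketch of calling the underlying ragas metrics directly, using ragas' own metric names from the 0.1-era API (the placeholder strings stand in for real data):

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, context_precision, context_recall

dataset = Dataset.from_dict({
    "question": ["..."],
    "answer": ["..."],
    "contexts": [["..."]],
    # context_recall needs a reference answer; the column name varies
    # across ragas versions ("ground_truth" vs "ground_truths").
    "ground_truth": ["..."],
})

scores = evaluate(dataset, metrics=[context_precision, context_recall, answer_relevancy])
```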
-
### Bug Description
On llama-index 0.11.22 and llama-index-finetuning 0.2.1, I was attempting to follow the documentation to fine-tune the BAAI/bge-small-en-v1.5 model on my own dataset. I attempted…
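For context, the flow I was following looks roughly like this (imports per the llama-index 0.11 docs; the dataset file and output path are placeholders):

```python
from llama_index.core.evaluation import EmbeddingQAFinetuneDataset
from llama_index.finetuning import SentenceTransformersFinetuneEngine

# Load a QA dataset previously generated with generate_qa_embedding_pairs.
train_dataset = EmbeddingQAFinetuneDataset.from_json("train_dataset.json")

finetune_engine = SentenceTransformersFinetuneEngine(
    train_dataset,
    model_id="BAAI/bge-small-en-v1.5",
    model_output_path="bge-small-finetuned",
)
finetune_engine.finetune()
embed_model = finetune_engine.get_finetuned_model()
```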
-
1. Performance metrics
2. Reliability measures
3. Areas for improvement
4. Test suite setup: updated the `queriesandresponses.json` file in the repo.
-
## Issue encountered
It would be good to have a system for evaluating both the relevance of the retrieved RAG context and its use by the LLM in producing the response. My first intuition would be a multi-stage system …
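A rough sketch of that staging (every function here is a hypothetical placeholder for whichever scorers we settle on):

```python
def score_retrieval(question: str, contexts: list[str]) -> float:
    """Stage 1: how relevant are the retrieved chunks to the question?"""
    raise NotImplementedError  # e.g. an LLM judge or embedding similarity

def score_grounding(answer: str, contexts: list[str]) -> float:
    """Stage 2: how much of the answer is supported by those chunks?"""
    raise NotImplementedError  # e.g. statement-level faithfulness checks

def evaluate_response(question: str, contexts: list[str], answer: str) -> dict:
    # Scoring the stages separately tells us whether a bad answer came
    # from bad retrieval or from the LLM ignoring good retrieval.
    return {
        "retrieval_relevance": score_retrieval(question, contexts),
        "grounding": score_grounding(answer, contexts),
    }
```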
-
## Summary
This template is intended to capture a few base requirements that must be met before filing a PR that contains a new blog post submission.
Please fill out this form in its…