-
I am using the Ragas `evaluate` method to evaluate a 35-sample test dataset with ground truth. It completes the evaluation but fails at the last step with this error. I have added "raise_exceptions=False" …
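For context, when `raise_exceptions=False` is set, Ragas records rows that fail during evaluation as NaN instead of aborting. A minimal post-processing sketch (using a plain list of hypothetical per-row scores, not the actual Ragas result object):

```python
import math

# Hypothetical per-row faithfulness scores; NaN marks rows that
# failed during evaluation when raise_exceptions=False is set.
row_scores = [0.9, float("nan"), 0.75, 0.8, float("nan")]

# Drop failed rows before aggregating, so NaN does not poison the mean.
valid = [s for s in row_scores if not math.isnan(s)]
mean_score = sum(valid) / len(valid) if valid else float("nan")
print(f"{len(valid)}/{len(row_scores)} rows scored, mean={mean_score:.3f}")
```

Checking how many rows came back as NaN is often the quickest way to tell whether the failure is in the scoring itself or in the final aggregation step.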
-
## Description
We need to choose a small number (1-3, depending on size) of open-source RAG evaluation datasets. Having at least one open-source dataset allows us to begin running basic evaluations tha…
-
When I evaluated the generated results after running the third step, the following error occurred:
Traceback (most recent call last):
File "/RAG-privacy/evaluation_results.py", line 767, in
…
-
### System Info
- `transformers` version: 4.41.2
- Platform: Linux-6.1.85+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.23.2
- Safetensors version: 0.4.3
- Acce…
-
python llmuses/run.py \
  --model /media/ama/data1/wzh/Model_Param/LLM/Yi_6B/pretrain \
  --template-type chatglm3 \
  --datasets arc \
  --dataset-hub Local \
  --dataset-dir /home/wzh/DianWang/RAG_project/Dat…
-
**Post processing**
-------
- Moderation - RAI: checking whether the output is harmful or unfavorable, ...
- Ragas: groundedness check of the output answer against the question, contexts, and ground truth, ...
https://zilliz.com/blog/r…
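The groundedness check above is typically done with an LLM judge; as a rough illustration of the idea, here is a crude lexical stand-in (a sketch, not the Ragas implementation) that measures what fraction of the answer's tokens are supported by the retrieved contexts:

```python
def grounding_score(answer: str, contexts: list[str]) -> float:
    """Fraction of answer tokens that appear in the retrieved contexts.

    A crude lexical-overlap stand-in for the LLM-based groundedness
    checks mentioned above; real checks reason about claims, not tokens.
    """
    answer_tokens = set(answer.lower().split())
    context_tokens = set(" ".join(contexts).lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

score = grounding_score(
    "Paris is the capital of France",
    ["France's capital is Paris", "Paris is a large city"],
)
```

A low score flags answers whose content is not supported by the retrieved passages, which is the signal the post-processing step is after.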
-
Testing it with a single small document split into 5 chunks, then repeating the test with 2 documents, to see if that could be the issue. This could be related to the issue below, but I am not using an adapter. …
-
### Problem & Motivation
There is a huge wave of interest in high-accuracy Q&A, such as via Retrieval Augmented Generation (RAG). RAG accuracy is largely driven by how well vector search is abl…
-
Hi Team,
Thanks for designing the RAG Evaluation package; this was a much-needed thing for LLM projects.
I wanted to know how I can create/convert my dataset into the format that I can use in Inspect…
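RAG evaluation datasets are usually a row per question with the model answer, the retrieved contexts, and a ground-truth reference. A minimal sketch of that column layout (the exact field names vary by tool and version, so treat these as an assumed example, not the required schema):

```python
# One row per evaluation sample; field names here are illustrative.
# Note that "contexts" is a list of passages per question, not a string.
eval_data = {
    "question": ["What does RAG stand for?"],
    "answer": ["Retrieval Augmented Generation."],
    "contexts": [
        ["RAG (Retrieval Augmented Generation) combines retrieval with generation."]
    ],
    "ground_truth": ["Retrieval Augmented Generation"],
}

# With the `datasets` library installed, this would typically be wrapped as:
# from datasets import Dataset
# ds = Dataset.from_dict(eval_data)

n_rows = len(eval_data["question"])
assert all(len(col) == n_rows for col in eval_data.values())
```

Converting an existing dataset is then mostly a matter of renaming columns to whatever the target package expects and making sure the contexts column holds a list of strings per row.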
-
**Describe the Feature**
OpenTelemetry is a de facto standard for reporting metrics, and OpenLLMetry uses OpenTelemetry under the hood. It would be better to have an integration to send such metrics…