-
### Issue Type
Bug
### Source
source
### Giskard Library Version
2.8.0
### Giskard Hub Version
NaN
### OS Platform and Distribution
Ubuntu 22.04.4 LTS
### Python version
…
-
**Describe the bug**
I am running 4 queries concurrently with `asyncio.gather()` to get LLM responses, then writing the responses into a dataset for RAGAS evaluation.
But the results are all the same from Context.Re…
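
For reference, here is a minimal sketch of the pattern described above: four concurrent calls collected with `asyncio.gather()` and turned into dataset rows. It is not the reporter's actual code. `ask_llm` and `collect_rows` are hypothetical names, and the stub simply echoes the question instead of calling a real model.

```python
import asyncio


async def ask_llm(question: str) -> str:
    """Hypothetical stand-in for a real async LLM call."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"answer to: {question}"


async def collect_rows(questions: list[str]) -> list[dict[str, str]]:
    # gather() runs the coroutines concurrently and returns the results
    # in the same order as the inputs, so zip() pairs each question
    # with its own answer.
    answers = await asyncio.gather(*(ask_llm(q) for q in questions))
    return [{"question": q, "answer": a} for q, a in zip(questions, answers)]


questions = ["q1", "q2", "q3", "q4"]
rows = asyncio.run(collect_rows(questions))
```

With this shape each row should carry a distinct answer. If every row comes back identical, one thing worth checking is whether the coroutines share mutable state (for example a single context or callback object) that the last call overwrites.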
-
-
## Overall workflow
**TO BE DEFINED**
## Evaluation Types
### RAG Evaluation
@Lanture1064 @bjwswang
Our current RAG solution flow:
1. Dataset/VersionedDataset provides source file…
-
### System Info
Running TGI in Docker with the command
`docker run --rm --gpus all --ipc=host -p 8080:80 -v /root/.cache/huggingface/hub:/data -e HF_API_TOKEN=hf_XXXX ghcr.io/huggingface/text-generatio…
-
### Bug Description
My basic idea is to build an automated LLM evaluation program using LlamaIndex and TruLens. The LLM I use is ChatGLM, which has the same streaming API calling method as Chat GP…
-
Hi all,
Although `BIRD` has incurred significant annotation costs, we still cannot guarantee that all the data is accurately labeled. **_We hope that the community can assist us in building BIRD t…
-
**Your Question**
Which model does ragas use to calculate the answer relevancy score?
I didn't specify any model name or inference API, yet ragas still generated an evaluation report.
**Additio…
-
I am trying out DSPy 2.3.6 with a simple transcript summarization example:
```python
class Summarizer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.summarizer = …
-
I am currently using the code provided in this [Colab notebook](https://colab.research.google.com/github/truera/trulens/blob/main/trulens_eval/examples/quickstart/quickstart.ipynb#scrollTo=-HyRuVA2qR7…