-
[X] I checked the [documentation](https://docs.ragas.io/) and related resources and couldn't find an answer to my question.
**Your Question**
What is unclear to you? What would you like to know?
…
-
PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization. PandaLM is the first to evaluate LLMs using a fine-tuned LLM.
-
As mentioned in the paper - "Furthermore, we also invite some expert annotators to label task planning for some complex requests (46 examples) as a high-quality human annotated dataset. We also plan t…
-
A single issue to track progress until the next release and to collaborate on creating the right issues. Feel free to edit this issue or comment on changes.
Scope of Next Release:
- [ ] For each paper cr…
-
### Current Behavior
When following the LangChain instructions from the docs for a custom LLM, I'm getting:
```
File "gptcache/processor/pre.py", line 20, in last_content
return data.get("m…
-
### Describe the issue
First of all, thank you for your great contributions.
I have a similar question to [issue 146](https://github.com/microsoft/LLMLingua/issues/146): I cannot reproduce the…
-
### Checklist
- [x] 1. I have searched related issues but cannot get the expected help.
- [x] 2. The bug has not been fixed in the latest version.
- [x] 3. Please note that if the bug-related issue y…
-
Hi, thanks for your nice work. I have a question about reproducing the driving score shown in the paper. I run the evaluation with the following configurations:
```
preception_model = 'memfuser_…
```
-
I encountered an issue while evaluating a dataset using the ragas library with a LangChain LLM and Sentence Transformers embeddings. The process throws an exception during execution.
**Steps to Repr…
-
Today, the [Python Evaluation building block](https://aka.ms/azai/eval) can be used against a .NET backend that uses the Chat Protocol (Azure Search supports this). However, we know from customer feed…