-
**What would you like to be added/modified**:
A benchmark suite for multimodal large language models deployed at the edge using KubeEdge-Ianvs:
1. Modify and adapt the existing edge-cloud data c…
-
### 🚀 The feature
Request from potential user: "There are two main aspects, 1) adjusting prompts that changing semantic words does not trigger hallucination, 2) the prompt itself is such that LLM doe…
-
Looking at the RAGAs sample notebook ...
[https://github.com/langchain-ai/langsmith-cookbook/blob/main/testing-examples/ragas/ragas.ipynb](https://github.com/langchain-ai/langsmith-cookbook/blob/ma…
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is there an existing ans…
-
### Describe the issue
Hello, I am trying to use Autogen for this multiagent healthcare system. The code looks like this:
config_list = [
{
"model": "gpt-3.5-turbo-16k",
…
-
The metrics-based approaches in the `QAAccuracy` eval algorithm seem to harshly penalize verbose models (like Claude) on datasets with concise reference answers (like SQuAD).
It'd be useful if this…
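
To illustrate the effect described above, here is a minimal sketch of a SQuAD-style token-overlap score. It is not the `QAAccuracy` implementation; the helper names (`tokens`, `precision_recall_f1`) and the example strings are hypothetical, and the whitespace tokenization is a deliberate simplification.

```python
from collections import Counter


def tokens(text):
    # Simplified normalization: lowercase and split on whitespace.
    return text.lower().split()


def precision_recall_f1(prediction, reference):
    # Token-overlap precision, recall, and F1 between two strings.
    pred, ref = tokens(prediction), tokens(reference)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: both answers contain the reference answer.
reference = "Paris"
concise = "Paris"
verbose = "The capital of France is Paris"

concise_scores = precision_recall_f1(concise, reference)
verbose_scores = precision_recall_f1(verbose, reference)

print("concise:", concise_scores)
print("verbose:", verbose_scores)
```

In this sketch both answers reach full recall, but the verbose answer's precision (and therefore its F1) drops sharply because of the extra tokens. A recall-oriented score is one option that would not penalize verbosity in the same way.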
-
## Summary
[MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering](https://aclanthology.org/2021.tacl-1.82.pdf)
> _"MKQA contains 10,000 queries sampled from t…
-
**Describe the bug**
I tried using RAGAS with a model that is not OpenAI. In general whatever model I use I get this error back:
```
File …
-
Hi,
Are you planning to make textgrad LLM calls asynchronous?
I tried to start adding asynchronous methods to make at least evaluation calls and inference (everything that is forward) asynchrono…
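
For context, the general pattern being asked about can be sketched with the standard library alone. This is not textgrad's API: `fake_llm_call` is a hypothetical stand-in for a network-bound LLM request, and `evaluate_all` and `max_concurrency` are illustrative names.

```python
import asyncio


async def fake_llm_call(prompt):
    # Hypothetical stand-in for a network-bound LLM request.
    await asyncio.sleep(0.01)
    return f"response to: {prompt}"


async def evaluate_all(prompts, max_concurrency=4):
    # Run the calls concurrently, capped by a semaphore.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(prompt):
        async with semaphore:
            return await fake_llm_call(prompt)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(p) for p in prompts))


results = asyncio.run(evaluate_all(["a", "b", "c"]))
print(results)
```

The semaphore is one way to keep concurrent requests under a provider's rate limit; whether that fits a given evaluation loop is an assumption here.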
-
We need to start dividing up the work, authoring the various Threats / Controls. We'll use this issue to manage that work and their assignments.
Each threat / control is 'ticked' when assigned to …