-
Hello! Hopefully this is the right place for this; if not, please let me know and I'll continue my search elsewhere!
I've been trying for the past few months to get some form of Linux installed on m…
-
### Is there an existing issue for the same bug?
- [X] I have checked the troubleshooting document at https://opendevin.github.io/OpenDevin/modules/usage/troubleshooting
- [X] I have checked the exis…
-
## Issue encountered
While setting up the framework to evaluate using LLM-as-judge, it would be helpful to test end-to-end without special permissions like setting up openai_key or HF pro subscriptio…
-
### Description of the bug
This line stores the reward prompt from the instance member -- `evaluator.prompt` which is updated in each `__acall__`. This is a dangerous operation since the prompt is lo…
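A minimal sketch of the kind of hazard described here, assuming several asynchronous calls share one evaluator instance. The `Evaluator` class, the `acall`/`acall_safe` methods, and the prompt text are hypothetical stand-ins for illustration, not the project's actual code:

```python
import asyncio


class Evaluator:
    """Hypothetical evaluator that keeps the latest prompt on the instance."""

    def __init__(self):
        self.prompt = None

    async def acall(self, text):
        # Shared state: every call overwrites the same attribute.
        self.prompt = f"Rate: {text}"
        # Yield to other tasks, as a real network request to an LLM would.
        await asyncio.sleep(0)
        # By now another call may have replaced self.prompt.
        return self.prompt

    async def acall_safe(self, text):
        # Keep the prompt in a local variable so each call owns its own copy.
        prompt = f"Rate: {text}"
        await asyncio.sleep(0)
        return prompt


async def run(method_name, texts):
    evaluator = Evaluator()
    method = getattr(evaluator, method_name)
    return await asyncio.gather(*(method(t) for t in texts))


shared = asyncio.run(run("acall", ["a", "b"]))
local = asyncio.run(run("acall_safe", ["a", "b"]))
print(shared)  # both entries carry the prompt of the last call
print(local)   # each entry carries its own prompt
```

Reading the prompt back from the instance after an `await` is what lets one call observe another call's prompt; returning the prompt alongside the result (or keeping it local) avoids that.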
-
Pose your questions as Issue Comments (below) for [James Evans](https://sociology.uchicago.edu/directory/James-A-Evans) regarding his 10/3 talk on *Simulating Subjects: The Promise and Peril of AI Sta…
-
In the leaderboard, for the GPT-4 evaluated section, why is the sum of n_wins and n_draws not equal for each row? What evaluation method is used in the leaderboard? Is it 181 questions?
-
This week, @amaatouq reached out with an interesting idea, which is that we can potentially train a pipeline for using GPT to rate tasks (and even test to see if GPT can replicate our raters' mapping …
xehu updated
2 months ago
-
### Ticket Contents
## Description
Overview:
This feature aims to enhance the existing Reap Benefit Solve Ninja Mentor WhatsApp chatbot on Glific by integrating a Virtual Mentorship system. The g…
-
![image](https://github.com/user-attachments/assets/175db829-6823-4db3-8359-28f778bce061)
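As shown in the image, a task I submitted earlier displayed normal results when I checked it last week, but now it shows ERROR, although the score is still there. Is this a bug?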
-
**Bug Description**
What happened?
1) I tried to use other frameworks such as ragas and trulens to calculate context_relevance for my data sets, but the two frameworks gave different results. Is it be…