-
Thank you for your insightful work and contribution. I just have a few questions regarding the work in the paper and the code implementation:
1. I understand that we have used the training data or the…
-
### User story
As a challenge manager, in order to manage evaluation of my challenge submissions efficiently, I would like to be able to unassign an evaluator from a submission (due to availability, …
-
### User story
As a challenge manager, in order to have the submissions of my challenge reviewed and scored by evaluators, I would like evaluators to be available within the system for assignment to …
-
Hello,
Is it possible to add the true_solution to the Stokes problem?
And would it be possible to visualise the residual in 2D, similarly to Poisson?
Or else, what visualisation do you suggest for…
-
- [ ] Satisfaction questionnaires for each chapter (Wooclap or Moodle?)
- [ ] GitHub repository: public: use it? What should go in it?
- [ ] Recurring lab-session material: sequences, examples, small …
-
I am not much of an expert in AI/ML yet, so a better-informed opinion is needed here :) The following points are just from our brainstorming so far.
- is [ML search](https://opensearch.org/docs/latest/…
-
This issue tracks the evaluation of RAG implementations.
Frameworks:
- F
Papers:
- F
- F
One-Offs:
- https://github.com/microsoft/promptflow/tree/main/examples/flows/evaluation/eval-qna-rag…
-
Thanks for your great work!
I have some questions about the evaluation results:
1. In all of the paper's tables, it seems that the best results are used for comparison with other works. But in the test case, you use …
-
We would like to ask about the output of the evaluation code.
As described in the GitHub instructions, we ran
$ bash scripts/eval_estimate.sh
It seems there are no image outputs and we are wondering i…
-
Hi! Thanks for all your work to make HELM available.
I had a few questions about the RAFT evaluations.
1. What set of examples are the LLMs evaluated over? I noticed this comment in the Raft Sc…