-
### User story
As a challenge manager, in order to ensure that a valid form is available for the evaluators to use, I would like a form to be validated for errors and valid logic before it is saved.
…
-
Thank you very much for your work @zorazrw !!!
Honestly, very impressive!
I see that you're using auto eval and that it uses gpt-3.5.
My questions are:
1. Why did you use auto eval instead of taking t…
-
I noticed that the MMScan authors have not released the entire dataset, only a beta version in 2024.9. Is the LLAVA-3d result on the MMScan benchmark evaluated on a partial version such as the beta version, or do yo…
-
Here are some thoughts concerning fairness evaluation:
### Protected attributes
*Extraction*: As methodologies may implement pre-, in- or post-processing to enhance algorithmic fairness, I think w…
-
Could the author provide the reference batch to evaluate the model on CIFAR?
-
### Describe the bug
When evaluating the model both centrally on the server and in a federated manner on the clients, after the server finishes evaluating the model for the current round and sends the `evaluate mess…
-
Hi, thank you for your great work.
In VQAttack/ALBEF_VQAttack/ALBEF_attack/refTools/evaluation, it seems that the "tokenizer" folder is missing, so I got an error saying "No module named 'refTools.…
-
Hi,
When I used my `output.json` and the repo's `test_eval.json`, it worked the first two times. However, now I see ChatGPT replies such as:
> I would rate your answer as 10.
which leads to the…
-
### Bug Description
Guideline evaluation throws an error saying the model is missing.
### Version
latest
### Steps to Reproduce
Run the guideline evaluation example.
### Relevant Logs/Tracebacks
```sh…
-
Hi Zechun,
Great work!
Could you please share details about the evaluation code, such as which codebase was used to run inference?
Thank you,
Kalyani