-
Hey @andywang-25 - really great work!
We analyzed the results and they look really impressive - we would like to fine-tune other open-source models as well as try to benchmark some alternative approa…
-
The batched `operator()` in SplineEvaluator takes the coordinates for the evaluation as argument:
https://github.com/CExA-project/ddc/blob/98c1fc3cc57d24dc2187f7009df8da7e0e6ba9dd/include/ddc/kernels…
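In Python/SciPy terms, the calling convention is analogous to the sketch below; this is only a conceptual analogue of "pass the evaluation coordinates as an argument", not the actual ddc C++ signature.
```python
import numpy as np
from scipy.interpolate import CubicSpline

# Stand-in for the spline built from its coefficients.
x = np.linspace(0.0, 2.0 * np.pi, 32)
spline = CubicSpline(x, np.sin(x))

# Batched evaluation: the whole array of coordinates is passed in one call,
# and one value comes back per coordinate.
coords = np.random.uniform(0.0, 2.0 * np.pi, size=1000)
values = spline(coords)
print(values.shape)  # (1000,)
```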
-
Really wonderful work.
I noticed that the SWE-bench evaluation requires files including:
```
eval.sh: The evaluation script
patch.diff: The model's generated prediction
report.json: Summary of evaluatio…
```
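As a sanity check on my side, I am verifying that each instance directory contains those three files with a small script like the one below; the `logs/<instance_id>/` layout is only my assumption about how the artifacts are organized.
```python
from pathlib import Path

REQUIRED = ["eval.sh", "patch.diff", "report.json"]

def missing_eval_files(logs_dir):
    """Report, per instance directory, which of the required files are absent."""
    missing = {}
    for instance_dir in Path(logs_dir).iterdir():
        if not instance_dir.is_dir():
            continue
        absent = [name for name in REQUIRED if not (instance_dir / name).exists()]
        if absent:
            missing[instance_dir.name] = absent
    return missing

# Assumed layout: logs/<instance_id>/{eval.sh, patch.diff, report.json}
print(missing_eval_files("logs"))
```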
-
Hi, did you use the code contained in this [repo](https://github.com/babylm/evaluation-pipeline/tree/main)?
If not, would you mind providing one, especially for `(Super)GLUE` and `MSGS (MCC)`?
-
Hi there!
Really nice work. I just wanted to ask whether you could provide the evaluation code and settings.
Thanks in advance for your help!
-
How can I output the evaluation metrics P, R, mAP, FPS, and GFLOPs for the datasets?
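For reference, this is the kind of output I have in mind, assuming an Ultralytics-style API (which this repo may not follow; the weights and dataset paths are placeholders):
```python
from ultralytics import YOLO  # assumption: Ultralytics-style entry point

model = YOLO("path/to/weights.pt")       # placeholder weights
model.info()                             # logs layers, parameters and GFLOPs

metrics = model.val(data="path/to/data.yaml")  # placeholder dataset config
print("P:    ", metrics.box.mp)          # mean precision over classes
print("R:    ", metrics.box.mr)          # mean recall over classes
print("mAP50:", metrics.box.map50)       # mAP at IoU 0.50
print("mAP:  ", metrics.box.map)         # mAP at IoU 0.50:0.95

# FPS derived from the measured per-image inference time (in milliseconds).
print("FPS:  ", 1000.0 / metrics.speed["inference"])
```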
-
Hello,
After running the `logbatcher_eval.py` script with the command:
```
python logbatcher_eval.py --config logbatcher_2k
```
I observed that the evaluation output provides metrics such as Grou…
-
Hi, thank you for your work!
I’m particularly interested in the evaluation metrics mentioned in the paper. I was wondering whether any code or scripts are available to reproduce the evaluation metr…
-
The evaluation code for calculating FID and YOLO-score (AP) is not provided. Could you publish it on GitHub?
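For now I am falling back on a standard FID computation over pre-extracted Inception features, sketched below (the feature shapes and extraction step are my own assumptions, not the paper's pipeline), but I would like to use your official script to make sure the numbers are comparable.
```python
import numpy as np
from scipy import linalg

def frechet_inception_distance(feats_real, feats_fake):
    """Standard FID between two sets of Inception features of shape (N, D)."""
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    sigma1 = np.cov(feats_real, rowvar=False)
    sigma2 = np.cov(feats_fake, rowvar=False)
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts from numerical error
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```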
-
Hi,
Thanks for the nice work! I am interested in reproducing the results, but the evaluation procedure in the repo is not very clear. It seems that we cannot directly use the linked ST-P3 eval …