-
Several information retrieval "tasks" use a few common evaluation metrics including mean average precision (MAP) [1] and recall@k, in addition to what is already supported (e.g. ERR, nDCG, MRR). Somet…
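For reference, the two requested metrics can be sketched in a few lines of plain Python. This is only an illustration of the standard definitions, not code from the library: the function names `recall_at_k`, `average_precision`, and `mean_average_precision` are hypothetical, relevance is assumed to be binary, and each ranked list is assumed to contain no duplicate document IDs.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Mean of precision@rank taken at each rank where a relevant document occurs."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average of average_precision over (ranked list, relevant set) pairs, one per query."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)
```

Dividing by the total number of relevant documents, rather than by the number retrieved, is the usual convention for MAP; other conventions exist and would need a different denominator.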
-
Thanks for your contributions again!
Could you please provide the evaluation code (Table 1 to Table 3) so that others can better reproduce your methods?
-
The tests so far are tightly coupled to the implementation rather than the interface. Changes to the internals often cause tests to fail because of small changes to timestamps or chunking (e.g. #5). T…
-
Congratulations on such an excellent article. I had some problems reproducing the results. I have experimented with the training weights you posted on the jumping dataset, but the final result is…
-
Hello! I'm a big fan of this library and have been using it in my research for over a year now. At this point, I have a fork with several new methods and some new features. I want to fold my changes b…
-
It would be useful to collect system metrics, e.g. latency, during the evaluation and to provide a summary in the evaluation output.
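One way this could look, as a minimal sketch: wrap each prediction call in a wall-clock timer and return a small summary next to the outputs. The function `evaluate_with_latency`, its `predict` callable, and the summary keys are all hypothetical names chosen for illustration, not part of the project's existing API.

```python
import statistics
import time
from typing import Any, Callable, Dict, Iterable, List, Tuple


def evaluate_with_latency(
    predict: Callable[[Any], Any], examples: Iterable[Any]
) -> Tuple[List[Any], Dict[str, float]]:
    """Run predict on every example and record per-call wall-clock latency."""
    outputs: List[Any] = []
    latencies: List[float] = []
    for example in examples:
        start = time.perf_counter()
        outputs.append(predict(example))
        latencies.append(time.perf_counter() - start)

    # Summary fields are in seconds; the field names are assumptions.
    summary = {
        "count": len(latencies),
        "mean_s": statistics.fmean(latencies) if latencies else 0.0,
        "p50_s": statistics.median(latencies) if latencies else 0.0,
        "max_s": max(latencies, default=0.0),
    }
    return outputs, summary
```

The summary dictionary could then be merged into whatever structure the evaluation output already uses.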
-
![ambiguity_L_FI_score_detection](https://github.com/user-attachments/assets/007b8eae-2db5-4495-846b-e3f10de89326)
Dear @vpchung, @rachitsaluja,
First of all, thank you for sharing lesion-wise e…
-
![image](https://github.com/stanford-crfm/helm/assets/8592144/2e36b93b-e29d-45f9-9a69-75d40ab56f08)
![image](https://github.com/stanford-crfm/helm/assets/8592144/8bc19f3c-ab9b-480a-8e1e-a0cbf702d97b)…
-
Hi @Lucaweihs , @mattdeitke!
Can you please tell me whether there are any existing evaluation metrics per class or type of object (pickable, openable, etc.)?
Best regards,
Mariia
-
Thanks for your great work and kind sharing.
I am a beginner in defect detection, but I intend to do some work on the Magnetic Tile dataset.
I successfully reproduced your inspiring work, and get th…