Missing labels for tasks

Hello, I'm interested in evaluating custom models on your benchmark. It would be helpful to have a file that maps each task to its corresponding phenomena. I tried to extract this mapping from the leaderboard file, but not all datasets appear in it. Specifically, after evaluating on Flash Holmes, I looked up each task in the leaderboard file to find its related phenomena, and 25 tasks could not be found.

List of missing tasks

xpos  missing
zorro-irregular-verb  missing
gum-rst-cut-edu-relation-group  missing
ewok-social-relations  missing
gum-rst-cut-edu-relation  missing
ewok-agent-properties  missing
ewok-quantitative-properties  missing
gum-rst-cut-edu-type  missing
ewok-material-dynamics  missing
ewok-material-properties  missing
upos  missing
gum-rst-cut-edu-depth  missing
ewok-physical-dynamics  missing
fuse-negation  missing
ewok-physical-interactions  missing
zorro-agreement_determiner_noun-across_1_adjective  missing
ewok-social-properties  missing
bioscope-negation  missing
gum-rst-cut-edu-distance  missing
gum-rst-cut-edu-successively  missing
speculation  missing
ewok-social-interactions  missing
ewok-physical-relations  missing
ewok-spatial-relations  missing
gum-rst-cut-edu-count  missing
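
The list above can be reproduced with a lookup along these lines. This is only a minimal sketch: the function name `find_unlabeled_tasks`, the in-memory mapping, and the placeholder task and phenomenon names are all assumptions for illustration, not the actual format of the leaderboard file or the result file.

```python
from typing import Dict, Iterable, List


def find_unlabeled_tasks(
    evaluated_tasks: Iterable[str],
    task_to_phenomena: Dict[str, List[str]],
) -> List[str]:
    """Return the evaluated tasks that have no phenomena entry, sorted by name."""
    # A task counts as missing if it is absent from the mapping
    # or if it is mapped to an empty list of phenomena.
    return sorted(
        task for task in set(evaluated_tasks) if not task_to_phenomena.get(task)
    )


# Hypothetical mapping, standing in for whatever can be extracted
# from the leaderboard file (placeholder task and phenomenon names).
task_to_phenomena = {"example-labeled-task": ["example-phenomenon"]}

# Task names as they would appear in a custom result file (assumed).
evaluated_tasks = ["example-labeled-task", "xpos", "upos", "fuse-negation"]

for task in find_unlabeled_tasks(evaluated_tasks, task_to_phenomena):
    print(f"{task}  missing")
```

With a complete task-to-phenomena file, the same function would return an empty list.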

It would be helpful to be able to replicate the analysis you perform in the explorer (https://holmes-explorer.streamlit.app/) with a custom result file. Thank you in advance! 😊

Holmes-Benchmark / holmes-evaluation