Add implementation of ChemDisGene data set

mariosaenger commented 1 month ago

Closes #917

mariosaenger commented 1 month ago

Thanks for checking the implementation. There are several aspects to keep in mind. First, the dataset consists of a curated and a non-curated part; this implementation covers only the curated part. Second, the dataset annotates relations only at the abstract level (using knowledge base identifiers). Following the default practice in BigBio, I unrolled the document-level relations to mention level. Note, however, that the document-level annotations are available in the source schema. These aspects complicate a direct comparison of the numbers :-/
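
To illustrate why the unrolling changes the counts, here is a minimal sketch of the idea. It is not the code from this PR: the function name `unroll_relations`, the dict keys, and the identifiers in the toy data are all hypothetical and chosen only for illustration. It assumes every mention is normalized to a single knowledge base identifier.

```python
from itertools import product


def unroll_relations(mentions, doc_relations):
    """Expand document-level relations into mention-level relations.

    Hypothetical helper for illustration; keys and names are assumptions.
    """
    # Group mention ids by the knowledge base identifier they are linked to.
    mentions_by_db_id = {}
    for mention in mentions:
        mentions_by_db_id.setdefault(mention["db_id"], []).append(mention["id"])

    unrolled = []
    for relation in doc_relations:
        heads = mentions_by_db_id.get(relation["head"], [])
        tails = mentions_by_db_id.get(relation["tail"], [])
        # Emit one relation per (head mention, tail mention) pair.
        for head_id, tail_id in product(heads, tails):
            unrolled.append(
                {"type": relation["type"], "arg1_id": head_id, "arg2_id": tail_id}
            )
    return unrolled


# Toy data with made-up identifiers: one chemical mentioned twice, one gene once.
toy_mentions = [
    {"id": "m1", "db_id": "CHEM:1"},
    {"id": "m2", "db_id": "CHEM:1"},
    {"id": "m3", "db_id": "GENE:7"},
]
toy_doc_relations = [{"type": "example_relation", "head": "CHEM:1", "tail": "GENE:7"}]

toy_unrolled = unroll_relations(toy_mentions, toy_doc_relations)
print(len(toy_doc_relations), "document-level ->", len(toy_unrolled), "mention-level")
```

In this toy case a single document-level relation becomes two mention-level relations, so relation counts in the unrolled schema will generally not match counts taken at the document level.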

leonweber commented 1 month ago

Ah, thanks for pointing this out. Then let's merge this : )

bigscience-workshop / biomedical

Add implementation of ChemDisGene data set #918