bigscience-workshop / biomedical

Tools for curating biomedical training data for large-scale language modeling

Create a dataset loader for CLEF eHealth 2019, Task 1 #68

Open hakunanatasha opened 2 years ago

hakunanatasha commented 2 years ago

From https://www.openagrar.de/receive/openagrar_mods_00046540?lang=en

cakiki commented 2 years ago

self-assign

cakiki commented 2 years ago

Should this also include the corresponding test set? https://www.openagrar.de/receive/openagrar_mods_00049062

hakunanatasha commented 2 years ago

@jason-fries @leonweber @galtay Not sure! I'm pinging the other admins.

jason-fries commented 2 years ago

Hi @cakiki, I would include the test set. Looks like you'll just have to pull from both repositories in your URL downloads when defining splits. Let us know if you encounter any problems!
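A rough sketch of what "pull from both repositories when defining splits" could look like, in plain Python so it runs without extra dependencies. It assumes the first record linked in this thread holds the training data and the second holds the test data. The two URLs are the landing pages from this thread, not direct file links, so they are placeholders here. `plan_splits` and `fake_download` are hypothetical names used only for illustration. In an actual loader, the `download` callable would be whatever the loader's download manager provides.

```python
from typing import Callable, Dict, List

# Landing pages quoted in this thread. Treat them as placeholders:
# the direct archive links still need to be looked up (assumption).
_URLS: Dict[str, str] = {
    "train": "https://www.openagrar.de/receive/openagrar_mods_00046540?lang=en",
    "test": "https://www.openagrar.de/receive/openagrar_mods_00049062",
}


def plan_splits(download: Callable[[str], str]) -> List[dict]:
    """Return one entry per split, each pointing at its own downloaded copy."""
    return [
        {"name": split, "gen_kwargs": {"data_dir": download(url), "split": split}}
        for split, url in _URLS.items()
    ]


def fake_download(url: str) -> str:
    # Hypothetical stand-in for a real downloader: maps a URL to a local path
    # without touching the network.
    return "/tmp/cache/" + url.rstrip("/").split("/")[-1].split("?")[0]


splits = plan_splits(fake_download)
for entry in splits:
    print(entry["name"], entry["gen_kwargs"]["data_dir"])
```

The point of keeping one URL per split name is that each split is generated from its own source, so adding the test set does not change how the training data is read.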

tanmaylaud commented 2 years ago

self-assign