bigscience-workshop / biomedical

Tools for curating biomedical training data for large-scale language modeling

Create dataset loader for PharmaCoNER #138

jason-fries closed this issue 2 years ago

jason-fries commented 2 years ago

Adding a Dataset

mapama247 commented 2 years ago

self-assign

hakunanatasha commented 2 years ago

Hi @mapama247, can you let us know if you are still working on this so we can update our project board? Please let us know the status by Friday, April 8. No worries if you are not finished but still intend to work on this. You can either ping me here at @hakunanatasha or ping the Discord admins (with @admins).

mapama247 commented 2 years ago

Hi @hakunanatasha! Yes, I have not finished yet, but I plan to do so before the deadline. I already had a script that could load the dataset from local CoNLL files, but on the Discord channel I was told that it is preferable to have a data loader that directly downloads and converts the brat files from the official site... so I will need some extra time to change this :)

hakunanatasha commented 2 years ago

@mapama247 no worries, feel free to ping me if you need help with this change!