bigscience-workshop / biomedical

Tools for curating biomedical training data for large-scale language modeling

Create dataset loader for n2c2 2014 - Deidentification & Heart Disease #220

Closed by jason-fries 1 year ago

jason-fries commented 2 years ago

Adding a Dataset

jdposada commented 2 years ago

self-assign

jason-fries commented 2 years ago

Hi @jdposada, can you let us know if you are still working on this so we can update our project board? Please let us know the status by Friday, April 8. You can respond to this comment or ping us on Slack or Discord.

No worries if you are not finished but still intend to work on this!

jdposada commented 2 years ago

Hi @jason-fries,

I am still working on this, and I am actually working with someone else. How can I give him credit as well?

hakunanatasha commented 2 years ago

@jdposada, as mentioned in a different issue with @jason-fries, you can either list both of your names on the script as co-authors or, ideally, have your commit histories tied to two separate GitHub accounts. We are fine with either option.

jdposada commented 2 years ago

New PR linked only for the deid task

https://github.com/bigscience-workshop/biomedical/pull/644

jdposada commented 2 years ago

New PR linked for the risk factors task

https://github.com/bigscience-workshop/biomedical/pull/646

hakunanatasha commented 1 year ago

merged in datasets