bigscience-workshop / biomedical

Tools for curating biomedical training data for large-scale language modeling

Closes #900 #901

Closed by GullyBurns 8 months ago

GullyBurns commented 9 months ago

Name: CZI Disease Research State Model
Data: https://github.com/chanzuckerberg/DRSM-corpus/
License: CC0

All data processing elements for this dataset are complete. This PR makes some edits to the README file.

GullyBurns commented 8 months ago

Just fixed some conflicts. Hopefully this is OK now.