bigscience-workshop / biomedical

Tools for curating biomedical training data for large-scale language modeling

Update unit tests + contribution guidelines to support HFhub submissions #856

Closed hakunanatasha closed 1 year ago

hakunanatasha commented 1 year ago

Supersedes #850

TLDR:

How to run:

1) There is a test folder here: bigbio/hub/hub_repos/test_scitail. This includes a dataloading script and a copy of the bigbiohub.

2) You can run the unit tests from the main bigbio directory (the parent of the tests folder) with the following command:

python -m tests.test_bigbio_hub test_scitail --test_local

This will pass. To check that the metadata works as intended, try modifying the _LANGUAGES and _LICENSES values. I made 2 examples
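For readers unfamiliar with this kind of check, here is a rough sketch of what validating `_LANGUAGES` and `_LICENSES` against allow-lists might look like. It is not the code in `tests.test_bigbio_hub`: the function `check_metadata`, the two allow-lists, and the assumption that both fields are lists of strings are hypothetical and only illustrate the idea of editing a value and watching the check fail.

```python
# Hypothetical sketch: these names and allow-lists are illustrative only,
# they are not taken from the bigbio test suite.
ALLOWED_LANGUAGES = {"English", "Spanish", "French"}  # assumed allow-list
ALLOWED_LICENSES = {"Apache-2.0", "MIT", "CC-BY-4.0"}  # assumed allow-list


def check_metadata(languages, licenses):
    """Return a list of problems; an empty list means the metadata is valid."""
    problems = []
    for lang in languages:
        if lang not in ALLOWED_LANGUAGES:
            problems.append(f"unknown language: {lang!r}")
    for lic in licenses:
        if lic not in ALLOWED_LICENSES:
            problems.append(f"unknown license: {lic!r}")
    return problems


# Valid metadata yields no problems; a deliberately broken value is reported.
print(check_metadata(["English"], ["Apache-2.0"]))
print(check_metadata(["Klingon"], ["Apache-2.0"]))
```

Returning a list of problems instead of raising on the first one lets a single run report every invalid field at once.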

NOTES

galtay commented 1 year ago

@hakunanatasha I added a few small changes