kai-car closed 1 month ago
Thank you for the PR! However, the loader script that needs to be adapted is the one under hub_repos.
Please take a look at the contribution guide, where you can also find how to format the code and execute tests (currently, the test output doesn't reflect your changes).
Hi, thanks for the feedback. I adjusted the PR accordingly. This time I followed the steps properly, so it should work now. 👍
Thank you for your changes!
I'm getting the following error running the unit tests:
AssertionError: Dataloader attribute 'Creative Commons Attribution 4.0 International' not valid for _LICENSE must be one of {'GPL_2p0_WITH_BISON_EXCEPTION', 'PDDL_1p0', ...}
It's not related to your fix, but could you please add the correct license key in your PR? I guess it should be CC_BY_4p0.
Also, I still see some differences when running black. Did you run the formatting (https://github.com/bigscience-workshop/biomedical/blob/main/CONTRIBUTING.md#5-format-your-code)?
Hope the code adjustments fix the problems. :)
The current version of drugprot.py only includes the train and validation splits. For this reason, I adjusted the drugprot.py data loading script to also load the test_background split, as the .tsv files are already present in the data folder. Note that the test_background split does not have any relations.
See also the HuggingFace pull request: https://huggingface.co/datasets/bigbio/drugprot/discussions/1/files
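A rough sketch of how a split without relations could be handled is shown below. It is not the code from this PR: read_tsv and load_split are hypothetical helpers, and the file names and row contents in the usage part are invented for the demo rather than taken from the DrugProt data folder.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List, Optional

Row = List[str]


def read_tsv(path: Optional[Path]) -> List[Row]:
    """Return the rows of a TSV file, or an empty list if there is no file."""
    if path is None or not path.exists():
        return []
    with path.open(encoding="utf-8", newline="") as handle:
        return list(csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))


def load_split(
    abstracts: Path, entities: Path, relations: Optional[Path] = None
) -> Dict[str, List[Row]]:
    """Load one split; a missing relations file yields an empty relation list."""
    return {
        "abstracts": read_tsv(abstracts),
        "entities": read_tsv(entities),
        "relations": read_tsv(relations),
    }


# Usage with throwaway files (names and contents are made up for the demo).
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    abstracts_file = data_dir / "abstracts.tsv"
    entities_file = data_dir / "entities.tsv"
    abstracts_file.write_text("1\tTitle\tAbstract text\n", encoding="utf-8")
    entities_file.write_text("1\tT1\tCHEMICAL\t0\t5\tTitle\n", encoding="utf-8")
    # No relations file is passed, as for a split that has no relations.
    split = load_split(abstracts_file, entities_file)
```

Treating the relations file as optional lets the same loading path serve the train, validation and test_background splits.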