Open Miking98 opened 1 year ago
@Miking98 thanks for this contribution! we are in the middle of updating our contribution guidelines to support hub datasets. Can I ask that we hold off on merging this until the new guidelines are published and/or can you update your PR to include an implementation in the `hub_repos` directory?
this is the PR that has the new contribution guidelines https://github.com/bigscience-workshop/biomedical/pull/850
and this is an example of a PR contributing code to the `hub_repos` directory (but it won't be easily testable until the PR above is merged) https://github.com/bigscience-workshop/biomedical/pull/852
Thanks for the note @galtay, makes sense! Will hold off until the new guidelines are published in that case, then will revise and submit a new pull request once updated to abide by them. Thanks!
hello @Miking98 thanks for your patience! we have a new `CONTRIBUTING.md` file now (https://github.com/bigscience-workshop/biomedical/blob/main/CONTRIBUTING.md) and I was wondering if you'd help us try it out. Please ping me if there are any issues and I'll help get this dataset loader in.
Thanks for the note @galtay! I just went through the revised Contributing doc and updated my pull request accordingly -- please let me know your thoughts.
Add the Paragraph-Level Simplification of Medical Texts dataset. Closes #854
Checkbox
- Create the dataloader script `biodatasets/my_dataset/my_dataset.py` (please use only lowercase and underscore for dataset naming).
- Provide values for the `_CITATION`, `_DATASETNAME`, `_DESCRIPTION`, `_HOMEPAGE`, `_LICENSE`, `_URLs`, `_SUPPORTED_TASKS`, `_SOURCE_VERSION`, and `_BIGBIO_VERSION` variables.
- Implement `_info()`, `_split_generators()` and `_generate_examples()` in dataloader script.
- Make sure that the `BUILDER_CONFIGS` class attribute is a list with at least one `BigBioConfig` for the source schema and one for a bigbio schema.
- Confirm dataloader script works with the `datasets.load_dataset` function.
- Confirm that your dataloader script passes the test suite run with `python -m tests.test_bigbio biodatasets/my_dataset/my_dataset.py`.
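As a rough aid for working through the checklist above, the structural items can be pre-checked before running the test suite. The sketch below is not part of the repository: `check_dataloader_source` and `is_valid_dataset_name` are hypothetical helper names, and allowing digits in dataset names is an assumption (the checklist only mentions lowercase and underscore). It uses only the standard library to inspect a script's source, so it does not confirm that the loader actually works with `datasets.load_dataset`.

```python
import ast
import re
from typing import List, Set

# Module-level variables and builder methods named in the checklist.
REQUIRED_VARIABLES = {
    "_CITATION", "_DATASETNAME", "_DESCRIPTION", "_HOMEPAGE", "_LICENSE",
    "_URLs", "_SUPPORTED_TASKS", "_SOURCE_VERSION", "_BIGBIO_VERSION",
}
REQUIRED_METHODS = {"_info", "_split_generators", "_generate_examples"}


def is_valid_dataset_name(name: str) -> bool:
    """Accept lowercase letters and underscores (digits allowed by assumption)."""
    return re.fullmatch(r"[a-z0-9_]+", name) is not None


def _assigned_names(statements) -> Set[str]:
    """Collect simple names bound by plain assignments in a statement list."""
    names: Set[str] = set()
    for node in statements:
        if isinstance(node, ast.Assign):
            names.update(t.id for t in node.targets if isinstance(t, ast.Name))
    return names


def check_dataloader_source(source: str) -> List[str]:
    """Return the checklist names that the script source does not define."""
    tree = ast.parse(source)
    module_vars = _assigned_names(tree.body)
    methods: Set[str] = set()
    class_attrs: Set[str] = set()
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            class_attrs |= _assigned_names(node.body)
            methods.update(
                item.name for item in node.body
                if isinstance(item, ast.FunctionDef)
            )
    missing = (REQUIRED_VARIABLES - module_vars) | (REQUIRED_METHODS - methods)
    if "BUILDER_CONFIGS" not in class_attrs:
        missing.add("BUILDER_CONFIGS")
    return sorted(missing)
```

A possible use is to read the script with `open(path).read()`, pass the text to `check_dataloader_source`, and fix anything it reports before running `python -m tests.test_bigbio` on the script. The check only looks at names, so it cannot tell whether `BUILDER_CONFIGS` really contains a source config and a bigbio config.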