Closed raissinging closed 1 month ago
Hi, thanks for contributing this to BigBio!
It seems like the number of documents reported by the unit tests does not match the number reported in the source paper. Is this expected?
train
==========
id: 38
document_id: 38
text: 38
labels: 289394

test
==========
id: 11
document_id: 11
text: 11
labels: 79946

validation
==========
id: 6
document_id: 6
text: 6
labels: 50609
Hi! Thank you for letting me know. I originally only had the 55 full-text papers in a bigbio schema, but I just added the 1,195 paper abstracts we have as well! Sorry about that!
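For reference, the 55 figure can be checked against the unit-test output quoted in this thread. The snippet below is a minimal sketch, not part of the BigBio test suite: the helper name `split_document_counts` is hypothetical, and the regular expression assumes the output keeps the `split ========== id: N document_id: N ...` layout shown in the thread.

```python
import re


def split_document_counts(test_output: str) -> dict[str, int]:
    """Map each split name to the count reported in its `document_id:` field."""
    # Assumed layout: "<split> ========== id: <n> document_id: <n> ..."
    pattern = re.compile(r"(\w+)\s+=+\s+id:\s+\d+\s+document_id:\s+(\d+)")
    return {split: int(count) for split, count in pattern.findall(test_output)}


# Unit-test output as quoted in this thread.
output = (
    "train ========== id: 38 document_id: 38 text: 38 labels: 289394 "
    "test ========== id: 11 document_id: 11 text: 11 labels: 79946 "
    "validation ========== id: 6 document_id: 6 text: 6 labels: 50609"
)

counts = split_document_counts(output)
total = sum(counts.values())
print(counts, total)
```

The three splits add up to 38 + 11 + 6 = 55 documents, which matches the full-text papers only and not the additional 1,195 abstracts.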
closes #919
- Create the dataloader script `hub/hub_repos/my_dataset/my_dataset.py` (please use only lowercase and underscore for dataset naming).
- Provide values for the `_CITATION`, `_DATASETNAME`, `_DESCRIPTION`, `_HOMEPAGE`, `_LICENSE`, `_URLs`, `_SUPPORTED_TASKS`, `_SOURCE_VERSION`, and `_BIGBIO_VERSION` variables.
- Implement `_info()`, `_split_generators()` and `_generate_examples()` in the dataloader script.
- Make sure that the `BUILDER_CONFIGS` class attribute is a list with at least one `BigBioConfig` for the source schema and one for a bigbio schema.
- Confirm that the dataloader script works with the `datasets.load_dataset` function.
- Confirm that the dataloader script passes the test suite run with `python -m tests.test_bigbio_hub <dataset_name> [--data_dir /path/to/local/data] --test_local`.
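
Two of the checklist items (the naming rule and the requirement for both a source and a bigbio config) can be checked mechanically before opening a PR. The following is a rough sketch under stated assumptions, not BigBio code: `StandInConfig` is a hypothetical stand-in for `BigBioConfig` with only two fields, the helper names are invented for illustration, and the schema strings `"source"` and `"bigbio_..."` are assumed conventions. Whether digits are allowed in dataset names is not stated in the checklist, so that is left as an opt-in flag.

```python
import re
from dataclasses import dataclass


@dataclass
class StandInConfig:
    """Hypothetical stand-in for BigBioConfig; only the fields used below."""
    name: str
    schema: str


def is_valid_dataset_name(name: str, allow_digits: bool = False) -> bool:
    """Check that a name uses only lowercase letters and underscores.

    Digits are rejected unless allow_digits is set (an assumption, since the
    checklist mentions only lowercase and underscore).
    """
    pattern = r"[a-z0-9_]+" if allow_digits else r"[a-z_]+"
    return re.fullmatch(pattern, name) is not None


def has_required_schemas(configs: list) -> bool:
    """Check for at least one source config and at least one bigbio config."""
    schemas = [config.schema for config in configs]
    has_source = "source" in schemas
    has_bigbio = any(schema.startswith("bigbio_") for schema in schemas)
    return has_source and has_bigbio


# Illustrative values only; a real script would list BigBioConfig instances.
configs = [
    StandInConfig(name="my_dataset_source", schema="source"),
    StandInConfig(name="my_dataset_bigbio_kb", schema="bigbio_kb"),
]

print(is_valid_dataset_name("my_dataset"))   # name from the checklist path
print(has_required_schemas(configs))
```

A check like this does not replace the test suite; it only catches the two structural mistakes before the full tests are run.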