akhdanfadh closed this 1 month ago
This dataloader hasn't passed
python -m tests.test_seacrowd seacrowd/sea_datasets/asr_ibsc/asr_ibsc.py
yet. Would you mind fixing it so that it passes the unit test?
It works on my end without any changes. Could you share your output?
Well, I don't know why those comments result in an error on your end, but not on mine. I've uncommented the line there.
@holylovenia @sabilmakbar @faridlazuarda
A friendly reminder for @akhdanfadh to check on this PR. 👀
In case you missed it, it seems that this dataset has a test split in addition to the train split. If you didn't miss it, is there a reason to exclude the test split?
@holylovenia there is only the train set in the HF data card, which is why the test split was left out. Since I think the author on HF is not the original author, I will reimplement the dataloader with the GitHub version. Let me work on this over the weekend.
Probably better to remove the HF URL in our datasheet, no?
Done @holylovenia @faridlazuarda. Please re-review, because this one is a totally different implementation based on the GitHub data (instead of HF).
Closes #440
Checkbox
- Create the dataloader script seacrowd/sea_datasets/my_dataset/my_dataset.py (please use only lowercase and underscore for dataset naming).
- Provide values for the _CITATION, _DATASETNAME, _DESCRIPTION, _HOMEPAGE, _LICENSE, _URLs, _SUPPORTED_TASKS, _SOURCE_VERSION, and _SEACROWD_VERSION variables.
- Implement _info(), _split_generators() and _generate_examples() in the dataloader script.
- Make sure that the BUILDER_CONFIGS class attribute is a list with at least one SEACrowdConfig for the source schema and one for a seacrowd schema.
- Confirm the dataloader script works with the datasets.load_dataset function.
- Confirm that your dataloader script passes the test suite run with python -m tests.test_seacrowd seacrowd/sea_datasets/<my_dataset>/<my_dataset>.py.