Closed patrickamadeus closed 5 months ago
Wait, I'm going to check it quickly. Sorry for the late response.
Hi @patrickamadeus, I've posted an updated review. Let both of us know once the suggestions have been addressed. LJ and I will probably both need to re-run the full check once more to make sure it's correct, since this data loader is quite complex. Thx!
Hi @patrickamadeus, all looks good to me. Since LJ said he doesn't have much PC storage left (presumably), I'll proceed with the merge :) (I was able to download all data and subsets and tested them too).
How does that sound, @ljvmiranda921? If that's fine from your end, I'll approve and merge it.
^Yes please feel free to merge! 🙇
Closes #448
Checkbox
- Create the dataloader script `seacrowd/sea_datasets/{my_dataset}/{my_dataset}.py` (please use only lowercase and underscore for dataset folder naming, as mentioned in dataset issue) and its `__init__.py` within the `{my_dataset}` folder.
- Provide values for the `_CITATION`, `_DATASETNAME`, `_DESCRIPTION`, `_HOMEPAGE`, `_LICENSE`, `_LOCAL`, `_URLs`, `_SUPPORTED_TASKS`, `_SOURCE_VERSION`, and `_SEACROWD_VERSION` variables.
- Implement `_info()`, `_split_generators()` and `_generate_examples()` in dataloader script.
- Make sure that the `BUILDER_CONFIGS` class attribute is a list with at least one `SEACrowdConfig` for the source schema and one for a seacrowd schema.
- Confirm the dataloader script works with the `datasets.load_dataset` function.
- Confirm the dataloader script passes the test suite run with `python -m tests.test_seacrowd seacrowd/sea_datasets/<my_dataset>/<my_dataset>.py` or `python -m tests.test_seacrowd seacrowd/sea_datasets/<my_dataset>/<my_dataset>.py --subset_id {subset_name_without_source_or_seacrowd_suffix}`.

T2T
SPTEXT
SPTEXT_TRANS
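
For anyone re-running the checks, here is a rough sketch of two checklist items: that `BUILDER_CONFIGS` carries at least one source config and one seacrowd config, and what "subset name without source or seacrowd suffix" could mean for `--subset_id`. It is plain Python and does not import `seacrowd` or `datasets`. `StandInConfig`, `has_required_schemas`, `subset_id_from_name`, and the `my_dataset_*` config names are hypothetical, and the `_source` / `_seacrowd_<schema>` naming pattern is an assumption, not something taken from this PR.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StandInConfig:
    """Hypothetical stand-in for SEACrowdConfig; only the fields used here."""
    name: str
    schema: str
    subset_id: str


def has_required_schemas(configs: List[StandInConfig]) -> bool:
    """True if there is at least one source config and one seacrowd config."""
    has_source = any(c.schema == "source" for c in configs)
    has_seacrowd = any(c.schema.startswith("seacrowd_") for c in configs)
    return has_source and has_seacrowd


def subset_id_from_name(config_name: str) -> str:
    """Strip a trailing `_source` or `_seacrowd_<schema>` from a config name.

    Assumes config names follow `<subset>_source` or `<subset>_seacrowd_<schema>`.
    Names that match neither pattern are returned unchanged.
    """
    if config_name.endswith("_source"):
        return config_name[: -len("_source")]
    head, sep, _ = config_name.rpartition("_seacrowd_")
    return head if sep else config_name


# Hypothetical configs for a dataset folder named "my_dataset".
BUILDER_CONFIGS = [
    StandInConfig("my_dataset_source", "source", "my_dataset"),
    StandInConfig("my_dataset_seacrowd_t2t", "seacrowd_t2t", "my_dataset"),
]

if __name__ == "__main__":
    print(has_required_schemas(BUILDER_CONFIGS))
    print([subset_id_from_name(c.name) for c in BUILDER_CONFIGS])
```

The real dataloader should of course use the repo's own `SEACrowdConfig`; this only illustrates the shape the checklist asks for.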