Closed patrickamadeus closed 1 month ago
Hi @elyanah-aco ! I've addressed all of the suggestions! Appreciate the detailed review.
I will address the suggestion from @akhdanfadh after your second opinion 🙏.
Also adding to this, do we really want to not match the English text and Vietnamese translation together? I know that the dataset viewer in the homepage shows the data in a stack, but I think for a dataloader, we should add them together. Wdyt @elyanah-aco?
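To make the pairing idea concrete, here is a rough sketch of what I mean. It assumes the English and Vietnamese lines are already aligned by index, which I have not verified for this dataset. The function name `pair_translations` and the output keys (`text_1`, `text_2`, and so on) are hypothetical, loosely modeled on a text-to-text style record, and are not taken from the actual dataloader.

```python
from typing import Dict, Iterable, Iterator, Tuple


def pair_translations(
    en_texts: Iterable[str], vi_texts: Iterable[str]
) -> Iterator[Tuple[int, Dict[str, str]]]:
    """Yield (key, example) tuples, one per English/Vietnamese pair.

    Assumption: both inputs are index-aligned, i.e. the i-th Vietnamese
    line is the translation of the i-th English line.
    """
    en_list, vi_list = list(en_texts), list(vi_texts)
    if len(en_list) != len(vi_list):
        # Refuse to pair silently misaligned data.
        raise ValueError(
            f"Misaligned inputs: {len(en_list)} English vs {len(vi_list)} Vietnamese lines"
        )
    for idx, (en, vi) in enumerate(zip(en_list, vi_list)):
        # Hypothetical record layout; adjust the keys to the real schema.
        yield idx, {
            "id": str(idx),
            "text_1": en,
            "text_2": vi,
            "text_1_name": "eng",
            "text_2_name": "vie",
        }
```

The generator shape mirrors how `_generate_examples()` usually yields one keyed example at a time, so each row would carry both languages instead of stacking them.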
A friendly reminder for @elyanah-aco in case she missed it.
Hi all @akhdanfadh @elyanah-aco! The minor language expansion is done! Thank you for all of the reviews. 🙏
Hi @akhdanfadh, I would like to let you know that we plan to finalize the calculation of the open contributions (e.g., dataloader implementations) in 31 hours, so it'd be great if we could wrap up the reviewing and merge this PR before then.
cc: @patrickamadeus
Closes #623
Checkbox
- Create the dataloader script seacrowd/sea_datasets/{my_dataset}/{my_dataset}.py (please use only lowercase and underscore for dataset folder naming, as mentioned in dataset issue) and its __init__.py within the {my_dataset} folder.
- Provide values for the _CITATION, _DATASETNAME, _DESCRIPTION, _HOMEPAGE, _LICENSE, _LOCAL, _URLs, _SUPPORTED_TASKS, _SOURCE_VERSION, and _SEACROWD_VERSION variables.
- Implement _info(), _split_generators() and _generate_examples() in dataloader script.
- Make sure that the BUILDER_CONFIGS class attribute is a list with at least one SEACrowdConfig for the source schema and one for a seacrowd schema.
- Confirm dataloader script works with the datasets.load_dataset function.
- Confirm that the dataloader script passes the test suite run with python -m tests.test_seacrowd seacrowd/sea_datasets/<my_dataset>/<my_dataset>.py or python -m tests.test_seacrowd seacrowd/sea_datasets/<my_dataset>/<my_dataset>.py --subset_id {subset_name_without_source_or_seacrowd_suffix}.
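As a convenience, the naming rule and the two test commands from the checklist can be sketched as a small helper. This is only an illustration: `is_valid_dataset_folder` and `build_test_command` are hypothetical names, not part of the repository, and allowing digits in folder names is an assumption beyond the "lowercase and underscore" wording above.

```python
import re
from typing import List, Optional

# Assumption: digits are accepted in addition to lowercase letters and underscores.
_VALID_NAME = re.compile(r"[a-z0-9_]+")


def is_valid_dataset_folder(name: str) -> bool:
    """Return True if the folder name uses only lowercase letters, digits, and underscores."""
    return _VALID_NAME.fullmatch(name) is not None


def build_test_command(dataset: str, subset_id: Optional[str] = None) -> List[str]:
    """Build the argument list for the test-suite command quoted in the checklist."""
    if not is_valid_dataset_folder(dataset):
        raise ValueError(f"Invalid dataset folder name: {dataset!r}")
    script = f"seacrowd/sea_datasets/{dataset}/{dataset}.py"
    cmd = ["python", "-m", "tests.test_seacrowd", script]
    if subset_id is not None:
        # Subset name without the source or seacrowd suffix, per the checklist.
        cmd += ["--subset_id", subset_id]
    return cmd
```

From the repository root, the returned list can be passed to `subprocess.run(..., check=True)` to run the suite for one dataloader.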
Tests