Closed muhammadravi251001 closed 1 month ago
@muhammadravi251001: Thank you for the update! LGTM!
Thanks for the approval, Sir!
A friendly reminder for @luckysusanto to review.
The code works well, but I noticed that only two labels appear in the dataset: 0 and 2.
I checked the original homepage, and the owner does state that there are 3 labels: entailment (0), neutral (1), and contradiction (2). However, the original dataset contains only two of them: entailment and contradiction.
I think it would be better to map "contradiction" to (1) [changed from (2)] and add a comment/note in the file. As it stands, I fear it might confuse users later on.
cc: @holylovenia
It was done on purpose, Lucky. I've already given the explanation/clarification in this comment on my other NLI dataset for the same task: https://github.com/SEACrowd/seacrowd-datahub/pull/633#issuecomment-2088094588
I see, in that case, approved!
Alright, thanks for the approval, Lucky!
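The label question raised in this thread can be illustrated with a small sketch. It assumes the three-class scheme quoted from the dataset homepage (entailment 0, neutral 1, contradiction 2) and uses made-up sample rows; `LABEL_TO_ID`, `encode_labels`, and `label_report` are hypothetical names, not part of the dataloader. The linked comment is not reproduced here, so this shows only what keeping the ids unchanged looks like, not the author's stated rationale.

```python
from collections import Counter

# Three-class NLI scheme as stated on the dataset homepage (per the thread above).
LABEL_TO_ID = {"entailment": 0, "neutral": 1, "contradiction": 2}


def encode_labels(label_names):
    """Map label names to ids under the fixed three-class scheme."""
    return [LABEL_TO_ID[name] for name in label_names]


def label_report(label_ids):
    """Count every id in the scheme, including ids that never occur."""
    counts = Counter(label_ids)
    return {label_id: counts.get(label_id, 0) for label_id in sorted(LABEL_TO_ID.values())}


# Hypothetical rows for illustration only; not taken from the actual dataset.
sample = ["entailment", "contradiction", "contradiction", "entailment"]
ids = encode_labels(sample)
report = label_report(ids)

print(ids)     # ids use 0 and 2 only
print(report)  # id 1 (neutral) is reported with a count of zero
```

With this layout, contradiction keeps id 2 even though neutral never occurs, so the ids stay comparable with other three-class NLI data. A note in the dataloader file would still help users who expect contiguous label ids.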
Title: Add Dataloader IDK-MRC-NLI
First line PR Message: Closes https://github.com/SEACrowd/seacrowd-datahub/issues/615
Notes
- _CITATION field, I will add it later.
Checkbox
- Create the dataloader script seacrowd/sea_datasets/{my_dataset}/{my_dataset}.py (please use only lowercase and underscore for dataset folder naming, as mentioned in dataset issue) and its __init__.py within {my_dataset} folder.
- Provide values for the _DATASETNAME, _DESCRIPTION, _HOMEPAGE, _LICENSE, _LOCAL, _URLs, _SUPPORTED_TASKS, _SOURCE_VERSION, and _SEACROWD_VERSION variables.
- Implement _info(), _split_generators() and _generate_examples() in dataloader script.
- Make sure that the BUILDER_CONFIGS class attribute is a list with at least one SEACrowdConfig for the source schema and one for a seacrowd schema.
- Confirm that the dataloader script works with the datasets.load_dataset function.
- Confirm that the dataloader script passes the test suite run with python -m tests.test_seacrowd seacrowd/sea_datasets/<my_dataset>/<my_dataset>.py or python -m tests.test_seacrowd seacrowd/sea_datasets/<my_dataset>/<my_dataset>.py --subset_id {subset_name_without_source_or_seacrowd_suffix}.
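As a rough illustration of the test step in the checklist, the sketch below assembles the two command forms for a given dataset folder. The helper name `build_test_command`, the example folder name `idk_mrc_nli`, and the decision to allow digits in folder names are assumptions made for this example; only the command layout comes from the checklist.

```python
import re


def build_test_command(dataset_name, subset_id=None):
    """Return the test-suite command from the checklist as an argument list.

    dataset_name is the dataset folder name; the checklist asks for lowercase
    and underscores only (allowing digits here is an assumption).
    """
    if not re.fullmatch(r"[a-z0-9_]+", dataset_name):
        raise ValueError(f"invalid dataset folder name: {dataset_name!r}")
    script = f"seacrowd/sea_datasets/{dataset_name}/{dataset_name}.py"
    command = ["python", "-m", "tests.test_seacrowd", script]
    if subset_id is not None:
        # Optional form from the checklist: test a single subset.
        command += ["--subset_id", subset_id]
    return command


# "idk_mrc_nli" is a hypothetical folder name used only for illustration.
print(" ".join(build_test_command("idk_mrc_nli")))
```

The returned list can be passed to `subprocess.run` from the repository root, or joined into a string and pasted into a shell.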