IndoNLP / nusa-crowd

A collaborative project to collect datasets in Indonesian languages.
Apache License 2.0
261 stars, 61 forks

Close #130 | Create dataset loader for ID_Quora_Paraphrasing dataset #200

Closed yana-xuyan closed 2 years ago

yana-xuyan commented 2 years ago

Please name your PR after the issue it closes. You can use the following line: "Closes #ISSUE-NUMBER" where you replace the ISSUE-NUMBER with the one corresponding to your dataset.

yana-xuyan commented 2 years ago

Btw, the citation is not available. Please let me know if it has to be fixed.

bryanwilie commented 2 years ago

By the way, is this id_qqp taken from QQP?

If yes, maybe we can use this citation too:

@misc{chen2018quora,
  title={Quora question pairs},
  author={Chen, Zihan and Zhang, Hongbo and Zhang, Xiaoji and Zhao, Leqi},
  year={2018}
}

yana-xuyan commented 2 years ago

Sure! I'll address these issues XD

bryanwilie commented 2 years ago

Thank you @yana-xuyan @SamuelCahyawijaya.

Thank you for the fix. One more thing, though: could you also add the access date (in DD-MM-YYYY format) to the citation, i.e. the date you browsed the source? Thanks!

bryanwilie commented 2 years ago

Looks good to me! Thanks @yana-xuyan again for contributing!