IndoNLP / nusa-crowd

A collaborative project to collect datasets in Indonesian languages.
Apache License 2.0

Add the MORPHOLOGICAL_INFLECTION task and the pairs_multi schema #310

Closed: holylovenia closed 1 year ago

holylovenia commented 1 year ago

For #108 and #44.

fhudi commented 1 year ago

@holylovenia @bryanwilie Thanks for the new schema. Just a quick note: I think this schema does not support a context field for disambiguating cases where the same inflected form results from different inflections. CMIIW.

holylovenia commented 1 year ago

@fhudi I see. Since I'm not familiar with morphological inflection, could you please provide me with an example? Your suggestion will be very helpful to adjust this new schema.

fhudi commented 1 year ago

@holylovenia I am not familiar with it either, hence the CMIIW. AFAIK this schema covers the majority of cases for Indonesian and Indonesia's indigenous languages. I can't give an example at the moment, but I will let you know if I ever bump into such cases.

holylovenia commented 1 year ago

@fhudi Okay, then we can probably just use the schema as-is for now and modify it if we ever bump into those unhandled cases. Thanks for the heads-up and the minor fix!