UniversalDependencies / UD_Breton-KEB

Translation metadata format #22

Open LoicGrobol opened 1 year ago

LoicGrobol commented 1 year ago

Currently, the translations are formatted this way:

# text = N'int ket aet war-raok.
# text[eng] = They didn't progress.
# text[fra] = Ils n'ont pas progressé.

I don't believe it has been standardized, but the doc would suggest formatting the metadata keys this way instead:

# text = N'int ket aet war-raok.
# text_en = They didn't progress.
# text_fr = Ils n'ont pas progressé.
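
For illustration, here is a minimal Python sketch of converting the bracket style into the underscore style shown above. The function name `to_underscore_style` and the `ISO3_TO_ISO1` table are hypothetical, and the table covers only the two languages in this example. The sketch assumes each line has already had its trailing newline removed.

```python
import re

# Hypothetical mapping from ISO 639-3 to two-letter codes; extend as needed.
ISO3_TO_ISO1 = {"eng": "en", "fra": "fr"}

# Matches comment lines such as "# text[eng] = They didn't progress."
BRACKET_KEY = re.compile(r"^# text\[([a-z]{3})\] = (.*)$")


def to_underscore_style(line: str) -> str:
    """Rewrite '# text[eng] = ...' as '# text_en = ...'.

    Lines that do not match, or whose language code is not in the
    mapping, are returned unchanged rather than guessed at.
    """
    match = BRACKET_KEY.match(line)
    if match is None:
        return line
    code, text = match.groups()
    if code not in ISO3_TO_ISO1:
        return line
    return f"# text_{ISO3_TO_ISO1[code]} = {text}"
```

Going the other way would only need the inverse table, so either convention can be derived mechanically from the other once one of them is chosen.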

ftyers commented 1 year ago

It hasn't been standardised, but all the treebanks I have produced are done like that, using ISO 639-3 codes and square brackets. It's clearer (imo), and follows how we do multilayer features.

dan-zeman commented 1 year ago

but the doc would suggest

It was put in the documentation simply to increase chances that it will be done similarly in multiple treebanks. All the treebanks I have produced are done like that :-) But it is not a guideline approved by the core group or checked by the validator.

bguil commented 1 year ago

Indeed, there is a zoo of 357 different metadata keys used in the 2.11 treebanks: see http://tables.grew.fr/?data=meta/META.
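
As a rough sketch of how such an inventory could be gathered (this is not how the table linked above was built), the following Python function counts the keys of `# key = value` comment lines in a CoNLL-U string. The name `metadata_keys` is hypothetical, and the sketch assumes that any comment line containing `=` is a key-value pair.

```python
from collections import Counter


def metadata_keys(conllu_text: str) -> Counter:
    """Count the keys of '# key = value' comment lines in CoNLL-U text."""
    counts = Counter()
    for line in conllu_text.splitlines():
        # Only comment lines that contain '=' are treated as key-value pairs.
        if line.startswith("#") and "=" in line:
            key = line[1:].split("=", 1)[0].strip()
            if key:
                counts[key] += 1
    return counts
```

Run over the sentence from this issue, it would report `text`, `text[eng]` and `text[fra]` as three distinct keys, which shows how quickly per-language variants inflate the total.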