openlanguagedata / flores

The FLORES+ Machine Translation Benchmark
Creative Commons Attribution Share Alike 4.0 International

North/South Levantine merged in ISO 639-3 #7

Open laurieburchell opened 5 months ago

laurieburchell commented 5 months ago

In the latest version of the ISO 639-3 standard, the code for South Levantine Arabic (ajp) has been deprecated and merged into North Levantine Arabic (apc), so a single code (apc) now covers what used to be treated as two dialects. FLORES+ has separate test sets for North Levantine and South Levantine. The two sets are not identical, but it is not clear to me whether they actually represent different dialects.

Should these two test sets be merged in FLORES+, or are they sufficiently distinct to keep separate?
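
For context, if the codes were merged, anything that reads the data would need to treat `ajp` as an alias of `apc`. Below is a minimal sketch in Python, assuming identifiers of the form `<iso639-3>_<script>` such as `ajp_Arab`. The mapping table and the function name are hypothetical and not part of any FLORES+ tooling.

```python
# Hypothetical table of retired ISO 639-3 codes and their replacements.
# Only the ajp -> apc merge is taken from this issue.
RETIRED_CODES = {"ajp": "apc"}


def normalize_lang_id(lang_id: str) -> str:
    """Replace a retired language code in an identifier such as 'ajp_Arab'.

    The part before the first underscore is treated as the ISO 639-3 code;
    everything after it (e.g. the script tag) is left untouched.
    """
    code, sep, rest = lang_id.partition("_")
    return RETIRED_CODES.get(code, code) + sep + rest
```

A plain remap like this would make both test sets resolve to the same identifier, so if they are kept as separate sets they would still need some other way to be told apart.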