apertium / apertium-apy

📦 Apertium HTTP Server in Python
https://wiki.apertium.org/wiki/Apertium-apy
GNU General Public License v3.0

orthographic modes detected as if translation modes, but do not work #223

Open jonorthwash opened 11 months ago

jonorthwash commented 11 months ago

For example, orthographic conversion modes between zab_Simp and zab_Phon are detected as translation pair modes:

$ curl 'https://beta.apertium.org:/apy/listPairs'
...
{"sourceLanguage": "zab_Phon", "targetLanguage": "zab_Simp"},
{"sourceLanguage": "zab_Simp", "targetLanguage": "zab_Phon"},
...

However, conversion between them doesn't work via APy:

$ curl 'https://beta.apertium.org:/apy/translate?langpair=zab_Simp|zab_Phon&q=dizh'
{"status": "error", "code": 503, "message": "Service Unavailable", "explanation": "internal error"}

This does work in the command-line version of the mode:

$ echo "dizh" | apertium -f none -d . zab_Simp-zab_Phon
dìiʼzh

Some modes between different locales may legitimately be more than just orthography; i.e., they should be thought of as full translation pairs (e.g., between American and British English). So I'm wondering whether, instead of fixing this, it would make sense to mark the orthography-only modes in their names with -ortho or something similar, have a separate detection process for them, and serve them separately (and eventually have a different interface for them in html-tools).

unhammer commented 11 months ago

yeah, some kind of naming convention would do it; apy just looks in the modes folder for pair names of the form xyz-xyz (where xyz can include underscores)

(Though ideally we'd have metadata associated with modes files to make the detection more robust.)
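
To make the naming-convention idea concrete, here is a minimal sketch of filename-based detection that separates orthography-only modes from translation pairs. It is not APy's actual code: the regular expression, the `-ortho` suffix placement, the `.mode` extension handling, and the names `classify_mode` and `scan_modes` are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Hypothetical convention: "<src>-<trg>.mode" is a translation pair and
# "<src>-<trg>-ortho.mode" is an orthography-only mode.
# Language codes may contain underscores (e.g. zab_Simp).
MODE_RE = re.compile(
    r"^(?P<src>[A-Za-z0-9_]+)-(?P<trg>[A-Za-z0-9_]+)(?P<ortho>-ortho)?\.mode$"
)


def classify_mode(filename):
    """Return (kind, src, trg) for a mode file name, or None if it does not match."""
    m = MODE_RE.match(filename)
    if m is None:
        return None
    kind = "ortho" if m.group("ortho") else "pair"
    return kind, m.group("src"), m.group("trg")


def scan_modes(modes_dir):
    """Split the *.mode files in modes_dir into (pairs, orthos) lists of (src, trg)."""
    pairs, orthos = [], []
    for path in sorted(Path(modes_dir).glob("*.mode")):
        result = classify_mode(path.name)
        if result is None:
            continue  # not a name this sketch recognises
        kind, src, trg = result
        (orthos if kind == "ortho" else pairs).append((src, trg))
    return pairs, orthos
```

Under this assumed convention, `zab_Simp-zab_Phon.mode` would still be listed as a translation pair, while a renamed `zab_Simp-zab_Phon-ortho.mode` would land in the separate list. The sketch makes no attempt to exclude other hyphenated mode names (analysers, taggers and the like), which is part of why per-mode metadata would be more robust than name matching.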

jonorthwash commented 11 months ago

Agreed about metadata.

The big-picture question is whether it makes sense to treat orthographic conversion modes separately from translation modes. Conceptually they're pretty similar.