apertium / apertium-html-tools

Web application providing a fully localised interface for text/website/document translation, analysis and generation powered by Apertium.
http://wiki.apertium.org/wiki/Apertium-html-tools
GNU General Public License v3.0

Subreadings not displayed correctly #8

Open ghost opened 9 years ago

ghost commented 9 years ago

Current output

китапмы ↬ +мы ↤ qst __китап ↤ n ⋅ nom __китап ↤ n ⋅ nom __китап ↤ n ⋅ nom __+и ↤ cop ⋅ p3 ⋅ pl __китап ↤ n ⋅ nom __+и ↤ cop ⋅ p3 ⋅ sg __китап ↤ n ⋅ nom

Expected output

китапмы ↬ китап ↤ n ⋅ nom __+мы ↤ qst __китап ↤ n ⋅ nom __+и ↤ cop ⋅ p3 ⋅ pl +мы ↤ qst __китап ↤ n ⋅ nom __+и ↤ cop ⋅ p3 ⋅ sg +мы ↤ qst

On the "Morphological Analysis" subpage of turkic.apertium.org [1], when I analyse words whose analyses contain subreadings (readings that start with a "+" in the Apertium stream format), the order of the readings gets mixed up. In particular, subreadings are displayed on top, with the main readings below them and indented.

In the output of all apertium-turkic transducers, the main reading is the left-most one. In the attached screenshot, "китап" is the main reading, "и" (if present) is the first subreading, and "мы" is the last. They should be displayed in that order: main reading on top, followed by the subreadings, each on a separate line and indented.
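To make the intended ordering concrete, here is a minimal Python sketch (not the html-tools code) of splitting one analysis into its main reading and subreadings and printing them main-first. The input string is my reconstruction of the stream-format analysis from the expected output above, and the function names `split_readings` and `render` are hypothetical. It assumes a "+" directly after a closing ">" always starts a subreading and ignores escaping and multiword markers.

```python
import re

def split_readings(analysis):
    """Split one analysis into (lemma, tags) pairs, in stream order."""
    # Assumption: a "+" right after ">" starts a new subreading.
    parts = re.split(r"(?<=>)\+", analysis)
    readings = []
    for part in parts:
        lemma = part.split("<", 1)[0]           # text before the first tag
        tags = re.findall(r"<([^>]+)>", part)   # tag names without brackets
        readings.append((lemma, tags))
    return readings

def render(analysis, indent="    "):
    """Main (left-most) reading on top, each subreading indented below it."""
    lines = []
    for i, (lemma, tags) in enumerate(split_readings(analysis)):
        text = " ".join([lemma] + tags)
        lines.append(text if i == 0 else indent + "+" + text)
    return "\n".join(lines)

# Hypothetical input, reconstructed from the expected output in this issue.
print(render("китап<n><nom>+и<cop><p3><sg>+мы<qst>"))
# китап n nom
#     +и cop p3 sg
#     +мы qst
```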

The way morphological analyses are displayed on the website resembles the vislcg format. If cg-conv is used to convert the Apertium stream format into the vislcg format, then it's simply a matter of passing the -l option to cg-conv:

apertium-tat$ echo "китапмы" | apertium -d . tat-morph | cg-conv -a -l

[1] http://turkic.apertium.org/index.eng.html?choice=tat#analyzation

[Attached screenshot: bildschirmfoto vom 2015-01-29 20 52 00]

jonorthwash commented 8 years ago

I believe some languages prefer the current directionality (it's specified somewhere in the module). Perhaps the interface could support both, and even have an easy way to reverse it from the default?
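A rough Python sketch of what "support both" could look like, assuming the readings have already been split into display strings in stream order. The `main_is_leftmost` flag and the function names are hypothetical; where the per-language default would actually come from is not something I've checked.

```python
def order_for_display(readings, main_is_leftmost=True):
    """Return the readings with the main reading first.

    `readings` is in stream order. For languages whose main reading is
    the right-most one, the list is reversed before display.
    """
    ordered = list(readings)
    if not main_is_leftmost:
        ordered.reverse()
    return ordered

def render(readings, main_is_leftmost=True, indent="    "):
    """Main reading on top, the remaining readings indented below it."""
    ordered = order_for_display(readings, main_is_leftmost)
    if not ordered:
        return ""
    return "\n".join([ordered[0]] + [indent + r for r in ordered[1:]])

stream_order = ["китап n nom", "+и cop p3 sg", "+мы qst"]
print(render(stream_order))                          # китап on top
print(render(stream_order, main_is_leftmost=False))  # +мы on top
```

A single boolean keeps the default per language and lets the interface flip it with one toggle.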

jonorthwash commented 7 years ago

Here's another example:

[Attached screenshot: selection_008]

The analysis the transducer is providing is ^yapabilir misin/yap<v><tv><abil><aor>+mı<qst>+i<cop><aor><p2><sg>$, which means we should get something like this:

yapabilir misin ↬ yap ↤ v ⋅ tv ⋅ abil ⋅ aor +mı ↤ qst ____ +i ↤ cop ⋅ aor ⋅ p2 ⋅ sg
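For reference, a small Python sketch of going from the stream-format unit quoted above to that kind of display. This is only an illustration under assumptions: the function names are hypothetical, escaped "/" and "+" are not handled, the surface form is repeated for each analysis, and all subreadings get the same flat indent (the exact indentation the interface should use is not settled here).

```python
import re

def parse_unit(unit):
    """Parse '^surface/analysis1/analysis2$' into (surface, [analyses])."""
    body = unit.strip()
    if body.startswith("^") and body.endswith("$"):
        body = body[1:-1]
    surface, *analyses = body.split("/")  # naive: ignores escaped slashes
    return surface, analyses

def format_reading(reading):
    """Turn 'yap<v><tv>' into 'yap ↤ v ⋅ tv'."""
    lemma = reading.split("<", 1)[0]
    tags = re.findall(r"<([^>]+)>", reading)
    if not tags:
        return lemma
    return lemma + " ↤ " + " ⋅ ".join(tags)

def display(unit, indent="    "):
    """Main reading first, subreadings on indented lines below it."""
    surface, analyses = parse_unit(unit)
    lines = []
    for analysis in analyses:
        # Assumption: "+" right after ">" separates subreadings.
        main, *subs = re.split(r"(?<=>)\+", analysis)
        lines.append(surface + " ↬ " + format_reading(main))
        lines.extend(indent + "+" + format_reading(s) for s in subs)
    return "\n".join(lines)

unit = "^yapabilir misin/yap<v><tv><abil><aor>+mı<qst>+i<cop><aor><p2><sg>$"
print(display(unit))
```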