Open mansayk opened 5 years ago
I think the <err_orth> approach might make sense for archaic forms? Or should we do something different?
I think err_orth would be ok, or maybe use_arch?
So, I should write it this way in the lexc file: китаб:китаб N1 ; ! "use_arch" or китаб:китаб N1 ; ! "err_orth", right?
! Use/Arch
! Err/Orth
I tried to use
китаб:китаб N1 ; ! "" ! Use/Arch
but it doesn't have any effect.
echo 'китаб' | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^китаб/китаб<n><sg><nom>$
There is no additional tag or any other difference.
@mansayk what should the result be?
Let's suppose we have the word "китабым". As the lemma I get "китаб" instead of the correct "китап". And additional tag
@ftyers, @mansayk, you seem not to be communicating well. Let me try to help...
@mansayk, I believe that the Use/Arch functionality does not yet exist, and @ftyers is offering to implement it. If it were to work as expected, I believe the correct entry would be the following:
китап:китаб N1 ; ! "book" ! Use/Arch
@ftyers, I believe you'd want the output of analysis to be something like the following. @mansayk, could you confirm that this makes sense to you as well?
^китаб/китап<n><sg><nom><use_arch>$
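To make the expected result concrete, here is a minimal Python sketch that splits such an analysis into surface form, lemma, and tags, so one can check that the lemma is "китап" and that the use_arch tag is present. The function name parse_analysis is hypothetical and not part of Apertium; the sketch also assumes a simple lexical unit with no escaped slashes or angle brackets.

```python
import re


def parse_analysis(unit):
    """Split one lexical unit like ^surface/lemma<tag>...$ into parts.

    Hypothetical helper for illustration only; it does not handle
    escaped characters that the full stream format allows.
    """
    if not (unit.startswith("^") and unit.endswith("$")):
        raise ValueError("not a lexical unit: " + unit)
    # Drop the ^ and $ delimiters, then separate surface form from readings.
    surface, *readings = unit[1:-1].split("/")
    parsed = []
    for reading in readings:
        lemma = reading.split("<", 1)[0]          # text before the first tag
        tags = re.findall(r"<([^<>]+)>", reading)  # tag names without brackets
        parsed.append((lemma, tags))
    return surface, parsed


surface, readings = parse_analysis("^китаб/китап<n><sg><nom><use_arch>$")
lemma, tags = readings[0]
print(surface, lemma, tags)
```

With the proposed output, the surface form stays "китаб", the lemma becomes "китап", and "use_arch" appears among the tags; with the current output shown earlier in the thread, the lemma would still be "китаб" and the tag would be absent.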
Yes, ^китаб/китап<n><sg><nom><use_arch>$ seems good to me. It gives the correct lemma and has the special tag 'use_arch'.
@IlnarSelimcan, I commented out the lemma "китаб" (leaving only "китап"), because it is not valid in the modern Tatar language. If you want to use it for some old texts, maybe there are some special markers to exclude such entries from compiling in regular mode. I'm sure there are other archaic/historical words that we should take care of. @jonorthwash, @ftyers, what is the best way here?
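One possible way to picture "excluding them in regular mode" is a preprocessing step that drops marked entries before the lexc file is compiled. The Python sketch below is only an illustration under that assumption: strip_marked_entries is a hypothetical helper, not an existing Apertium feature, and the sample entries are made up in the style of the entry proposed above.

```python
def strip_marked_entries(lexc_text, marker="! Use/Arch"):
    """Return lexc_text without the lines that end with the given marker.

    Hypothetical preprocessing step for illustration; the marker
    convention is the one proposed in this thread, not a confirmed
    feature of the build.
    """
    kept = [
        line
        for line in lexc_text.splitlines()
        if not line.rstrip().endswith(marker)
    ]
    return "\n".join(kept)


# Illustrative input: a modern entry plus an archaic spelling variant.
sample = "\n".join([
    'китап:китап N1 ; ! "book"',
    'китап:китаб N1 ; ! "book" ! Use/Arch',
])

print(strip_marked_entries(sample))
```

A build for modern text would compile the filtered file, while a build for older texts would compile the unfiltered one, so both spellings stay in a single source.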