apertium / apertium-tat

Apertium linguistic data for Tatar
GNU General Public License v3.0

"китаб" instead of "китап" #28

Open mansayk opened 5 years ago

mansayk commented 5 years ago

@IlnarSelimcan, I commented out the lemma "китаб" (leaving only "китап"), because it is not valid in modern Tatar. If you want to use it for some old texts, maybe there are special markers that would exclude such entries from compilation in regular mode. I'm sure there are other archaic/historical words that we should take care of. @jonorthwash, @ftyers, what is the best way to handle this?

jonorthwash commented 5 years ago

I think the <err_orth> approach might make sense for archaic forms? Or should we do something different?

ftyers commented 5 years ago

I think err_orth would be OK, or maybe use_arch?

mansayk commented 5 years ago

So, I should write it this way in the lexc file:

китаб:китаб N1 ; ! "use_arch"

or

китаб:китаб N1 ; ! "err_orth"

right?

ftyers commented 5 years ago

! Use/Arch

! Err/Orth

mansayk commented 5 years ago

I tried to use

китаб:китаб N1 ; ! "" ! Use/Arch

but it doesn't have any effect.

echo 'китаб' | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^китаб/китаб<n><sg><nom>$

There is no additional tag or any other difference.

ftyers commented 5 years ago

@mansayk, what should the result be?

mansayk commented 5 years ago

Suppose we have the word "китабым". As the lemma I get "китаб" instead of the correct "китап". An additional tag won't help here, because I still don't get the correct lemma. I need to remove these words (marked as Err/Orth or Use/Arch) from the analysis altogether.

jonorthwash commented 5 years ago

@ftyers, @mansayk, you seem not to be communicating well. Let me try to help...

@mansayk, I believe that the Use/Arch functionality does not yet exist, and @ftyers is offering to implement it. If it were to work as expected, I believe the correct entry would be the following:

китап:китаб N1 ; ! "book"  ! Use/Arch

@ftyers, I believe you'd want the output of analysis to be something like the following. @mansayk, could you confirm that this makes sense to you as well?

^китаб/китап<n><sg><nom><use_arch>$

mansayk commented 5 years ago

Yes, ^китаб/китап<n><sg><nom><use_arch>$ looks good to me. It gives the correct lemma and has the special tag 'use_arch'.
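
For anyone who wants archaic or misspelled forms dropped from the analysis entirely, a tagged output like the one above could be post-filtered. The following is only a minimal sketch in Python, not part of apertium-tat: the function name `filter_readings` is hypothetical, the tag names `use_arch` and `err_orth` are the ones proposed in this thread, and the parsing is simplified (it assumes no escaped `/`, `^` or `$` inside a lexical unit). When every reading of a word is dropped, the sketch marks the word as unknown with the usual `*surface` convention.

```python
import re

# Tags whose readings should be discarded; names taken from this thread.
EXCLUDED_TAGS = {"use_arch", "err_orth"}

# One lexical unit in Apertium stream format: ^surface/reading/reading$
LEXICAL_UNIT = re.compile(r"\^(.*?)\$")
# One tag inside a reading: <n>, <sg>, <use_arch>, ...
TAG = re.compile(r"<([^<>]+)>")


def filter_readings(stream, excluded=EXCLUDED_TAGS):
    """Drop every reading that carries one of the excluded tags.

    Simplification (assumption): escaped '/', '^' and '$' are not handled.
    """
    def rewrite(match):
        surface, *readings = match.group(1).split("/")
        kept = [r for r in readings if not excluded & set(TAG.findall(r))]
        if not kept:
            # No reading survived, so mark the surface form as unknown.
            kept = ["*" + surface]
        return "^" + "/".join([surface] + kept) + "$"

    return LEXICAL_UNIT.sub(rewrite, stream)


if __name__ == "__main__":
    print(filter_readings("^китаб/китап<n><sg><nom><use_arch>$"))
    print(filter_readings("^китаб/китаб<n><sg><nom>$"))
```

Whether to drop such readings or keep them with the corrected lemma is a choice for the downstream application; the tag makes either option possible.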