apertium / apertium-tat

Apertium linguistic data for Tatar
GNU General Public License v3.0

бунтарь, бунтарьлар #23

Closed mansayk closed 5 years ago

mansayk commented 5 years ago
echo "бунтарьлар" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^бунтарьлар/*бунтарьлар$

echo "бунтарь" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^бунтарь/бунтарь<n><sg><nom>$
mansayk commented 5 years ago

I think this one is also connected with this case:

echo "валерьян" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^валерьян/валерьян<n><sg><nom>$

echo "валерьянны" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^валерьянны/*валерьянны$
mansayk commented 5 years ago
echo "вареньегыз" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^вареньегыз/*вареньегыз$

echo "варенье" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^варенье/варенье<n><sg><nom>$
mansayk commented 5 years ago
echo "декольтесы" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^декольтесы/*декольтесы$

echo "декольте" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^декольте/декольте<n><sg><nom>$
mansayk commented 5 years ago
echo "дельфинны" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^дельфинны/*дельфинны$

echo "дельфин" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^дельфин/дельфин<n><sg><nom>$
mansayk commented 5 years ago

This list could go on endlessly, so I think a few rules should be fixed, which would also resolve the other similar problems.
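Since the same pipeline is repeated for every word above, a small wrapper could flag the unanalysed forms in one pass. The following is only a minimal sketch: the function names `analyse` and `unknown_forms` are hypothetical, and the binary paths are assumed to be the same as in the commands above.

```sh
# Hypothetical helpers; the binary paths are copied from the commands above.

# Run one word through the analysis pipeline used in this thread.
analyse() {
    echo "$1" | apertium-destxt -n \
        | lt-proc -z -w 'apertium-tat/tat.automorf.bin' \
        | cg-proc -z 'apertium-tat/tat.rlx.bin' \
        | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' \
        | apertium-retxt
}

# Read analyser output on stdin and print only the units that received
# no analysis, i.e. those of the form ^surface/*surface$.
unknown_forms() {
    grep -o '\^[^/$]*/\*[^$]*\$'
}
```

With these defined, something like `for w in бунтарьлар валерьянны дельфинны; do analyse "$w"; done | unknown_forms` would list only the forms that still come back with an asterisk.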

jonorthwash commented 5 years ago

@mansayk, could you make a short table like this for all the examples you find? It'll help me sort out what the problems are.

| analysis | expected form | current form |
|---|---|---|
| бунтарь<n><sg><nom> | бунтарьләр | #бунтарь |

jonorthwash commented 5 years ago

Could you add in the forms from #24, #25, and #26 also?

mansayk commented 5 years ago

| Analysis for base form | Correct form that is not analyzed |
|---|---|
| ^бунтарь/бунтарь<n><sg><nom>$ | ^бунтарьлар/*бунтарьлар$ |
| ^валерьян/валерьян<n><sg><nom>$ | ^валерьянны/*валерьянны$ |
| ^варенье/варенье<n><sg><nom>$ | ^вареньегыз/*вареньегыз$ |
| ^декольте/декольте<n><sg><nom>$ | ^декольтесы/*декольтесы$ |
| ^дельфин/дельфин<n><sg><nom>$ | ^дельфинны/*дельфинны$ |
| ^конъюнктивит/конъюнктивит<n><sg><nom>$ | ^конъюнктивитны/*конъюнктивитны$ |
| ^объектив/объектив<adj>$ | ^объективрак/*объективрак$ |
| ^шәфәкъ/шәфәкъ<n><sg><nom>$ | ^шәфәкъны/*шәфәкъны$ |
| ^шәфәкъ/шәфәкъ<n><sg><nom>$ | ^шәфәгы/*шәфәгы$ |

jonorthwash commented 5 years ago

Could you also add the forms that are currently output to the table? This is important for sorting out the patterns that need to be fixed.

jonorthwash commented 5 years ago

It seems a lot of these stems were simply miscategorised, and that there was no problem with the phonology related to those forms. Please have a look at my two commits (61ae48f, b704f55) for future reference.

To finish addressing these issues, could you give me the currently output forms for the remaining forms that are having trouble? It would probably be good to put the forms of шәфәкъ in #26 and the rest here.

mansayk commented 5 years ago

| Analysis for base form | Correct form that is not analyzed |
|---|---|
| ^валерьян/валерьян<n><sg><nom>$ | ^валерьянны/*валерьянны$ |
| ^шәфәкъ/шәфәкъ<n><sg><nom>$ | ^шәфәкъны/*шәфәкъны$ |
| ^шәфәкъ/шәфәкъ<n><sg><nom>$ | ^шәфәгы/*шәфәгы$ |

jonorthwash commented 5 years ago

Could you test what the currently output forms are for those? I can do it too, but this is something that will be useful for you in the future.

mansayk commented 5 years ago

I'm sorry, could you give me an example?
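One way such a check might look is to feed an analysis to the generator and read off the surface form it produces. This is only a sketch under assumptions: the generator path `apertium-tat/tat.autogen.bin` is assumed by analogy with the analyser path used above, and the helper names `to_unit` and `generate` are hypothetical.

```sh
# Hypothetical helpers. The generator path is an assumption, chosen by
# analogy with 'apertium-tat/tat.automorf.bin' used earlier in this thread.
AUTOGEN='apertium-tat/tat.autogen.bin'

# Wrap a bare analysis in the ^...$ delimiters of the Apertium stream format.
to_unit() {
    printf '^%s$\n' "$1"
}

# Print the surface form the generator currently produces for an analysis.
generate() {
    to_unit "$1" | lt-proc -g "$AUTOGEN"
}
```

For example, `generate 'валерьян<n><sg><acc>'` would print the currently generated accusative form, which can then be compared with the expected form валерьянны. A leading `#` in the output usually means the generator could not produce the form at all.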

mansayk commented 5 years ago

| Analysis | Expected form | Current form |
|---|---|---|
| валерьян<n><sg><acc> | валерьянны | валерьянне |
| ^шәфәкъ/шәфәкъ<n><sg><nom>$ | шәфәкъны, шәфәгы | ^шәфәкъне, шәфәге |
| ^календарь/календарь<n><sg><nom>$ | календаремны | календаремнең |
| ^бильярд/бильярд<n><sg><nom>$ | бильярдны | бильярдне |
| ^бөтендөнья/бөтендөнья<adj>$ | бөтендөньяга | бөтендөньягә |

@jonorthwash, is it correct now?

jonorthwash commented 5 years ago

@mansayk, I edited the first line of the table to a much more useful format. Could you convert the other lines to this format?

mansayk commented 5 years ago

Thank you, done!