Closed by mansayk 5 years ago
I think this one is also connected with this case:
echo "валерьян" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^валерьян/валерьян<n><sg><nom>$
echo "валерьянны" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^валерьянны/*валерьянны$
echo "вареньегыз" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^вареньегыз/*вареньегыз$
echo "варенье" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^варенье/варенье<n><sg><nom>$
echo "декольтесы" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^декольтесы/*декольтесы$
echo "декольте" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^декольте/декольте<n><sg><nom>$
echo "дельфинны" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^дельфинны/*дельфинны$
echo "дельфин" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^дельфин/дельфин<n><sg><nom>$
This list could go on indefinitely, so I think fixing the underlying rules should also fix the other similar problems.
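Since every example above has the same shape, a small filter can pick out the forms the analyser marked as unknown. This is only a sketch: the name `unknown_forms` is made up for illustration, and it assumes each lexical unit arrives on its own line in the `^surface/analysis$` shape shown in the outputs above.

```shell
# Hypothetical helper, not part of apertium-tat.
# Reads lexical units (one per line) and prints the surface form of
# every unit whose analysis starts with '*', i.e. an unknown word.
unknown_forms() {
  sed -n 's|^\^\([^/]*\)/\*.*\$$|\1|p'
}
```

The output of the analysis commands above could be piped into `unknown_forms` to get one unanalysed surface form per line, provided the pipeline output really is one unit per line.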
@mansayk, could you make a short table like this for all the examples you find? It'll help me sort out what the problems are.
Analysis | Expected form | Current form |
---|---|---|
бунтарь<n><sg><nom> | бунтарьләр | #бунтарь |
Could you add in the forms from #24, #25, and #26 also?
Analysis for base form | Correct form that is not analysed |
---|---|
^бунтарь/бунтарь<n><sg><nom>$ | ^бунтарьлар/*бунтарьлар$ |
^валерьян/валерьян<n><sg><nom>$ | ^валерьянны/*валерьянны$ |
^варенье/варенье<n><sg><nom>$ | ^вареньегыз/*вареньегыз$ |
^декольте/декольте<n><sg><nom>$ | ^декольтесы/*декольтесы$ |
^дельфин/дельфин<n><sg><nom>$ | ^дельфинны/*дельфинны$ |
^конъюнктивит/конъюнктивит<n><sg><nom>$ | ^конъюнктивитны/*конъюнктивитны$ |
^объектив/объектив<adj>$ | ^объективрак/*объективрак$ |
^шәфәкъ/шәфәкъ<n><sg><nom>$ | ^шәфәкъны/*шәфәкъны$ |
^шәфәкъ/шәфәкъ<n><sg><nom>$ | ^шәфәгы/*шәфәгы$ |
Could you add the currently output forms to the table too? This is important for sorting the patterns that need to be fixed.
It seems a lot of these stems were simply miscategorised, and that there was no problem with the phonology related to those forms. Please have a look at my two commits (61ae48f, b704f55) for future reference.
To finish addressing these issues, could you give me the currently output forms for the remaining forms that are having trouble? It would probably be good to put the forms of шәфәкъ in #26 and the rest here.
Analysis for base form | Correct form that is not analysed |
---|---|
^валерьян/валерьян<n><sg><nom>$ | ^валерьянны/*валерьянны$ |
^шәфәкъ/шәфәкъ<n><sg><nom>$ | ^шәфәкъны/*шәфәкъны$ |
^шәфәкъ/шәфәкъ<n><sg><nom>$ | ^шәфәгы/*шәфәгы$ |
Could you test what the currently output forms are for those? I can do it too, but this is something that will be useful for you in the future.
I'm sorry, could you give me an example?
Analysis | Expected form | Current form |
---|---|---|
валерьян<n><sg><acc> | валерьянны | валерьянне |
^шәфәкъ/шәфәкъ<n><sg><nom>$ | шәфәкъны, шәфәгы | ^шәфәкъне, шәфәге |
^календарь/календарь<n><sg><nom>$ | календаремны | календаремнең |
^бильярд/бильярд<n><sg><nom>$ | бильярдны | бильярдне |
^бөтендөнья/бөтендөнья<adj>$ | бөтендөньяга | бөтендөньягә |
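For anyone building rows in the format of the first line of this table, here is a rough sketch of two helpers. The names `lexical_unit` and `table_row` are made up for illustration. The commented generator call is also an assumption: `lt-proc -g` is the generation mode, but the path `apertium-tat/tat.autogen.bin` is not mentioned anywhere in this thread.

```shell
# Hypothetical helpers, not part of apertium-tat.

# Build a stream-format lexical unit such as ^валерьян<n><sg><acc>$
# from a lemma followed by its tags.
lexical_unit() {
  lemma=$1
  shift
  tags=''
  for tag in "$@"; do
    tags="$tags<$tag>"
  done
  printf '^%s%s$\n' "$lemma" "$tags"
}

# Format one row of the three-column table used in this thread.
table_row() {
  printf '%s | %s | %s |\n' "$1" "$2" "$3"
}

# Assumed usage (generator path is a guess, adjust to your build):
#   lexical_unit валерьян n sg acc | lt-proc -g apertium-tat/tat.autogen.bin
```

The generator output would then go in the "Current form" column, for example `table_row 'валерьян<n><sg><acc>' 'валерьянны' 'валерьянне'`.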
@jonorthwash, is it correct now?
@mansayk, I edited the first line of the table to a much more useful format. Could you convert the other lines to this format?
Thank you, done!