apertium / apertium-tat

Apertium linguistic data for Tatar
GNU General Public License v3.0

Locative and dative can take comparative suffix sometimes #5

Open ftyers opened 6 years ago

ftyers commented 6 years ago

өстәрәк

e.g. in:

Елга Ница елгасының уң ярына тамагыннан 126 км өстәрәк кушыла.

From the frequency list:

$ sh hitparade.sh | grep дарак
     ^24/24<num>/24<num><subst><nom>$ ^алдарак/*алдарак$
      ^6/6<num>/6<num><subst><nom>$ ^янындарак/*янындарак$
      ^4/4<num>/4<num><subst><nom>$ ^уңдарак/*уңдарак$
      ^4/4<num>/4<num><subst><nom>$ ^сулдарак/*сулдарак$
      ^4/4<num>/4<num><subst><nom>$ ^Алдарак/*Алдарак$
      ^2/2<num>/2<num><subst><nom>$ ^төньягындарак/*төньягындарак$
      ^2/2<num>/2<num><subst><nom>$ ^Айдарак/*Айдарак$
      ^1/1<num>/1<num><subst><nom>$ ^ягындарак/*ягындарак$
      ^1/1<num>/1<num><subst><nom>$ ^кырындарак/*кырындарак$
      ^1/1<num>/1<num><subst><nom>$ ^кырыйдарак/*кырыйдарак$
      ^1/1<num>/1<num><subst><nom>$ ^көньягындарак/*көньягындарак$
      ^1/1<num>/1<num><subst><nom>$ ^көнчыгышындарак/*көнчыгышындарак$
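
For anyone who wants to pull the unanalysed forms out of output like the above without eyeballing it, here is a minimal sketch. It assumes the usual Apertium stream format (`^surface/analysis$`, with a leading `*` marking an unknown word) and does not handle escaped characters. The helper name `unknown_forms` is made up for illustration and is not part of any Apertium tool.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# (assumption: no escaped ^, / or $ inside the unit)
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def unknown_forms(stream):
    """Return the surface forms whose analysis starts with '*' (unknown word)."""
    unknown = []
    for surface, analyses in LEXICAL_UNIT.findall(stream):
        if analyses.startswith("*"):
            unknown.append(surface)
    return unknown


# Sample lines copied from this thread.
sample = (
    "^24/24<num>/24<num><subst><nom>$ ^алдарак/*алдарак$\n"
    "^6/6<num>/6<num><subst><nom>$ ^янындарак/*янындарак$\n"
    "^өскәрәк/өскәрәк<adv>$^./.<sent>$\n"
)

print(unknown_forms(sample))  # ['алдарак', 'янындарак']
```

The frequency counts are skipped because they receive a `<num>` analysis, so only the unrecognised word forms remain.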
jonorthwash commented 6 years ago

Are these real locative forms, or are they fixed adverbial and adjectival forms?

Also, where do you see dative forms?

mansayk commented 5 years ago

Locative

echo 'өстәрәк' | apertium -d apertium-tat tat-morph
^өстәрәк/*өстәрәк$^./.<sent>$

Ablative

echo 'өстәнрәк' | apertium -d apertium-tat tat-morph
^өстәнрәк/*өстәнрәк$^./.<sent>$

Dative

echo 'өскәрәк' | apertium -d apertium-tat tat-morph
^өскәрәк/өскәрәк<adv>$^./.<sent>$
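
To make the pattern in the three forms above explicit, here is a rough sketch that guesses the case from the spelling alone. It is a purely orthographic heuristic, not how the transducer works: the ending lists are my assumption about the common allomorphs, the function name `guess_case_comp` is hypothetical, and it will happily accept strings that are not real words.

```python
# Orthographic case endings; ablative is listed first so that -дан/-тән
# is tried before the shorter locative -да/-тә.
CASE_ENDINGS = [
    ("abl", ("дан", "дән", "тан", "тән", "нан", "нән")),
    ("loc", ("да", "дә", "та", "тә")),
    ("dat", ("га", "гә", "ка", "кә", "на", "нә")),
]
COMPARATIVE = ("рак", "рәк")


def guess_case_comp(form):
    """Return (case_form, case) if `form` looks like case + comparative, else None."""
    for comp in COMPARATIVE:
        if form.endswith(comp):
            base = form[: -len(comp)]  # strip the comparative suffix
            for case, endings in CASE_ENDINGS:
                if base.endswith(endings):
                    return base, case
    return None


for word in ["өстәрәк", "өстәнрәк", "өскәрәк"]:
    print(word, guess_case_comp(word))
# өстәрәк ('өстә', 'loc')
# өстәнрәк ('өстән', 'abl')
# өскәрәк ('өскә', 'dat')
```

Because it only looks at spelling, this overgenerates; it says nothing about which nouns actually allow the comparative.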
mansayk commented 5 years ago

Possible:

көньяктарак
эчкәрәк

Not possible:

китаптарак
IlnarSelimcan commented 5 years ago

However, I suggest adding <loc><comp>, <abl><comp> and <dat><comp> forms for all nouns, without trying to classify them manually. See tests/morphophonology/issue5.tsv. @ftyers @jonorthwash @mansayk ok?

mansayk commented 5 years ago

I agree with @IlnarSelimcan.

mansayk commented 5 years ago

Another category sometimes takes this affix: <gna_perf> verb forms such as кысыбрак, күтәребрәк... The plain form, without the affix, is currently analysed like this:

^басып/бас<v><tv><gna_perf>/бас<v><tv><prc_perf>/бас<v><iv><gna_perf>/бас<v><iv><prc_perf>$^./.<sent>$
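
The forms in this thread suggest how the affix attaches: -рак after a back-vowel base, -рәк after a front-vowel base, with a final к or п voiced to г or б. Below is a small sketch of that, inferred only from the examples here. The base forms in the usage lines (кысып, күтәреп, кебек, ничек, шыпырт) are my assumption, the function name `add_comparative` is hypothetical, and the letters я, ю, ё are left unclassified.

```python
BACK_VOWELS = set("аоуы")
FRONT_VOWELS = set("әөүеиэ")
VOICING = {"к": "г", "п": "б"}  # final consonant voicing seen in the examples


def add_comparative(base):
    """Attach -рак/-рәк to `base` using last-vowel harmony and final voicing."""
    if not base:
        raise ValueError("empty base")
    # Voice a final к/п, as in кебек -> кебегрәк.
    if base[-1] in VOICING:
        base = base[:-1] + VOICING[base[-1]]
    # Pick the suffix variant from the last classified vowel of the base.
    for ch in reversed(base.lower()):
        if ch in BACK_VOWELS:
            return base + "рак"
        if ch in FRONT_VOWELS:
            return base + "рәк"
    raise ValueError("no classified vowel in base: " + base)


for base in ["кысып", "күтәреп", "кебек", "ничек", "шыпырт"]:
    print(base, "->", add_comparative(base))
# кысып -> кысыбрак
# күтәреп -> күтәребрәк
# кебек -> кебегрәк
# ничек -> ничегрәк
# шыпырт -> шыпыртрак
```

This is only a spelling-level approximation; the real alternations belong in the morphophonology rules of the language data.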
mansayk commented 5 years ago

Some pronouns accepting this affix:

ничегрәк
болайрак
тегеләйрәк
шулайрак
алайрак

ниндирәк
шундыйрак
мондыйрак
андыйрак

безнеңчәрәк
сезнеңчәрәк
аларчарак
боларчарак
шуларчарак
mansayk commented 5 years ago

Postpositions:

кебегрәк
буендарак
буенарак
буйлабрак
каршынарак
каршындарак
сыманрак
тибындагырак
турындарак
хакындарак
хакындагырак
шикеллерәк
mansayk commented 5 years ago

Adverbs:

башкача
гаҗәебрәк
тирәнтенрәк
тулысынчарак
шыпыртрак
җиңелчәрәк

and many others

IlnarSelimcan commented 5 years ago

These all belong in the issue5.tsv test case.

On 17 February 2019 14:35:09 GMT+03:00, Mansur Saykhunov wrote:

Postpositions: кебегрәк буендарак буенарак буйлабрак каршынарак каршындарак сыманрак тибындагырак турындарак хакындарак хакындагырак шикеллерәк


jonorthwash commented 5 years ago

Some pronouns accepting this affix:

These aren't really pronouns—or maybe at best they're adverbial forms of pronouns.

However, I suggest adding <loc><comp>, <abl><comp> and <dat><comp> forms for all nouns, without trying to classify them manually.

I don't know that this is the best approach if some are truly ungrammatical. What do the Tatar grammarians say about this? Is there some particular distribution of which forms are okay and which aren't?

mansayk commented 5 years ago

These aren't really pronouns—or maybe at best they're adverbial forms of pronouns.

Currently the Tatar tagger marks them as <prn>, for example:

бу%<prn%>%<dem%>%<nom%>:бу CLITICS-INCL-COP ;
бу%<prn%>%<dem%>%<gen%>:моның CLITICS-NO-COP ;
бу%<prn%>%<dem%>%<dat%>:моңа CLITICS-NO-COP ;
бу%<prn%>%<dem%>%<acc%>:моны CLITICS-NO-COP ;
бу%<prn%>%<dem%>%<abl%>:моннан CLITICS-INCL-COP ;
бу%<prn%>%<dem%>%<loc%>:монда CLITICS-INCL-COP ;
бу%<prn%>%<dem%>%<px%>:моныкы%{n%} CASES ;
бу%<prn%>%<dem%>%<loc%>:мондагы ATTR-SUBST ;
бу%<prn%>%<dem%>%<sim%>:мондый CASES ;
бу%<prn%>%<dem%>%<sim%>%<px3sp%>:мондые%{n%} CASES ; ! FIXME CHECK мондый%{I%}%{n%} instead?
бу%<prn%>%<dem%>%<sim%>%<pl%>:мондыйлар CASES ;
бу%<prn%>%<dem%>%<sim%>%<pl%>%<px3sp%>:мондыйлары%{n%} CASES ;
бу%<prn%>%<dem%>%<adv%>:болай CLITICS-INCL-COP ;
бу%<prn%>%<dem%>%<qnt%>:бу% кадәр CLITICS-NO-COP ;
бу%<prn%>%<dem%>%<qnt%>:монча CLITICS-NO-COP ; ! Dir/LR
бу%<prn%>%<dem%>%<px3sp%>:монысы%{n%} CASES ;
бу%<prn%>%<dem%>%<px3sp%>:бусы%{n%} CASES ; ! Dir/LR
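
To see at a glance which of these entries are the adverbial ones (the bases of болайрак, мондыйрак), here is a minimal sketch that splits a lexc entry into lemma, tags, surface form and continuation class. It is not a full lexc parser: it assumes one entry per line, `%` as the escape character, `!` starting a comment, and no escaped `!` or `:` in the entry. The function name `parse_lexc_entry` is hypothetical.

```python
import re


def unescape(text):
    """Drop lexc '%' escapes, e.g. '%<prn%>' -> '<prn>'."""
    return re.sub(r"%(.)", r"\1", text)


def parse_lexc_entry(line):
    """Parse 'upper:lower CONTINUATION ;' into a dict, or return None."""
    body = line.split("!", 1)[0].strip()  # strip a trailing comment
    if not body.endswith(";") or ":" not in body:
        return None
    body = body[:-1].strip()
    upper, rest = body.split(":", 1)
    parts = rest.rsplit(None, 1)  # last token is the continuation class
    if len(parts) != 2:
        return None
    lower, continuation = parts
    upper = unescape(upper)
    return {
        "lemma": upper.split("<", 1)[0],
        "tags": re.findall(r"<([^>]+)>", upper),
        "surface": unescape(lower),
        "continuation": continuation,
    }


# Entries copied from the comment above.
lines = [
    "бу%<prn%>%<dem%>%<loc%>:монда CLITICS-INCL-COP ;",
    "бу%<prn%>%<dem%>%<sim%>:мондый CASES ;",
    "бу%<prn%>%<dem%>%<adv%>:болай CLITICS-INCL-COP ;",
    "бу%<prn%>%<dem%>%<qnt%>:монча CLITICS-NO-COP ; ! Dir/LR",
]
entries = [parse_lexc_entry(line) for line in lines]
adverbial = [e["surface"] for e in entries if {"adv", "sim"} & set(e["tags"])]
print(adverbial)  # ['мондый', 'болай']
```

Filtering on `<adv>` and `<sim>` is just one way to pick out candidate bases for the comparative; which tags should actually license it is the open question in this issue.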