apertium / apertium-turkic

For code, data and issues relating to all Turkic monolingual packages or Turkic-X translators.

consider using <mod> instead of <qst> #15

Open jonorthwash opened 5 years ago

jonorthwash commented 5 years ago

<qst> is not a part of speech tag used in Apertium, and I see no reason to categorise -mI/MA differently from -ǵo/Goy/ku/etc.

jonorthwash commented 5 years ago

@ftyers says that the first argument isn't valid: the tagset page "is completely out of date and unreliable", and <qst> is widely used.

He also points out that the distributions might be different. E.g., -mI and ya in Turkish have [slightly] different distributions:

I don't think the latter is a good argument for classifying them as different parts of speech. Perhaps it justifies different subcategory tags, though?

In any case, we need to discuss the distribution of different modal particles and the tags we use for them. In Kazakh we have these, for example:

%+ау%<mod_emo%>:%-ау # ;
%+ай%<mod_emo%>:%-ай # ;
%+ғой%<mod_ass%>:% %{G%}ой # ;
%+гөр%<mod_ass%>:% гөр # ;
%+ма%<qst%>:%>% %{M%}%{A%} # ;
ма:%~ма QST ; ! "" Dir/RL
ма:ма QST ; ! "" Dir/LR
ма:ба QST ; ! "" Dir/LR
ма:па QST ; ! "" Dir/LR
ма:ме QST ; ! "" Dir/LR
ма:бе QST ; ! "" Dir/LR
ма:пе QST ; ! "" Dir/LR
ма% не:%~ма% не QST ; ! "" Dir/RL
ма% не:ма% не QST ; ! "" Dir/LR
ма% не:ме% не QST ; ! "" Dir/LR
ше:ше QST ; ! ""
ғой:%~ғой MOD-ASS ; ! "" Dir/RL
гөр:гөр MOD-ASS ; ! ""
шығар:шығар MOD ;

In Kyrgyz, we have

%+беле%<qst%>:% беле # ;
%+бекен%<qst%>:% бекен # ;
%+чы%<qst%>:%>ч%{I%} # ;
%+го%<mod_ass%>:% го # ;
%+да%<mod_tru%>:% да # ;
%+дыр%<mod_ind%>:%>%{D%}%{I%}р # ;
%+имиш%<mod_dub%>:% имиш # ;
бы%<qst%>:%~бы # ; ! Dir/RL 
бы%<qst%>:бы # ; ! Dir/LR 
бекен%<qst%>:%~бекен # ; ! Dir/RL
бекен%<qst%>:бекен # ; ! Dir/LR
го%<mod_ass%>:%~го # ; ! Dir/RL
го%<mod_ass%>:го # ; ! Dir/LR
чы%<qst%>:чы # ; 
сыяктуу:сыяктуу POST-DECL ;

And окшо as just a verb.
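
To make the comparison across languages easier, something like the following could tally which inline tags are attached to which particles. This is only a rough sketch: the function name `tags_by_form` and the sample list are made up for illustration, it only looks at tags written inline on the upper side (as `%<tag%>`), and it skips entries whose tag comes from a continuation lexicon such as QST or MOD-ASS. It also assumes there is no escaped colon (`%:`) on the upper side.

```python
import re
from collections import defaultdict

# Matches an escaped lexc tag such as %<mod_ass%> and captures "mod_ass".
TAG = re.compile(r"%<([^%>]+)%>")

def tags_by_form(lines):
    """Map each inline tag to the upper-side forms that carry it."""
    found = defaultdict(list)
    for line in lines:
        entry = line.split("!", 1)[0]    # drop the lexc comment, if any
        upper = entry.split(":", 1)[0]   # upper (analysis) side of the pair
        match = TAG.search(upper)
        if match is None:
            continue                     # no inline tag on this entry
        # Strip the %+ boundary marker and any remaining escape characters.
        form = upper[:match.start()].replace("%+", "").replace("%", "")
        found[match.group(1)].append(form)
    return dict(found)

# A few entries copied from the Kazakh and Kyrgyz lists above.
sample = [
    "%+ғой%<mod_ass%>:% %{G%}ой # ;",
    "%+ма%<qst%>:%>% %{M%}%{A%} # ;",
    "%+го%<mod_ass%>:% го # ;",
    "бы%<qst%>:%~бы # ; ! Dir/RL",
    "шығар:шығар MOD ;",
]

for tag, forms in sorted(tags_by_form(sample).items()):
    print(tag, forms)
```

On the sample this groups ғой and го under mod_ass and ма and бы under qst; шығар is skipped because its tag is not inline.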

jonorthwash commented 5 years ago

I think this should probably apply to <emph> too. See https://github.com/apertium/apertium-chv/issues/17 .