Open linuxscout opened 3 years ago
Salam, I noticed that Tashkeel (diacritization) doesn't work in some cases:
ما تَحدَّثَ إلّا وقَال خيراً Not recognized ( word1 diacritized, word2 diacritized)
ما تَحدَّثَ إلا وقال خيراً Recognized ( word1 not diacritized, word2 not diacritized)
ما تَحدَّثَ إلا وقَالَّ خيراً Recognized ( word1 not diacritized, word2 diacritized)
If we look at the rule:
<rule id="collo_0009_illa_wa" name="إلا و">
  <pattern>
    <marker>
      <token>إلا</token>
      <token postag=".*;W.?.?" postag_regexp="yes"/>
    </marker>
  </pattern>
  <message>عبارة "إلّا و" يستحسن أن تقال: <suggestion>إلاّ <match no="2" regexp_match="^و" regexp_replace=""/></suggestion> </message>
  <example correction="إلاّ قال" type="incorrect"> ما تحدث <marker>إلا وقال</marker> خيرا </example>
  <example type="correct"> ما تَحدَّثَ إلاّ قال خيراً </example>
</rule>
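So it seems that a token that is not inflected can't match diacritized words: the literal token إلا fails on the diacritized form إلّا, while the second token, which is matched by its POS tag, is recognized with or without diacritics.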
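To illustrate the suspected cause, here is a minimal Python sketch. It is not LanguageTool's actual matching code. The helper names (`strip_tashkeel`, `literal_match`, `tashkeel_insensitive_match`) are hypothetical, and the set of marks treated as Tashkeel (U+064B to U+0652, plus U+0670) is an assumption. It only shows that an exact string comparison fails once a shadda is added, while a comparison done after stripping diacritics succeeds.

```python
import re

# Assumed set of tashkeel marks: fathatan..sukun (U+064B-U+0652)
# plus the superscript alef (U+0670).
TASHKEEL = re.compile("[\u064B-\u0652\u0670]")


def strip_tashkeel(word: str) -> str:
    """Return the word with all tashkeel marks removed (hypothetical helper)."""
    return TASHKEEL.sub("", word)


def literal_match(token: str, word: str) -> bool:
    """Exact comparison, as a literal token would presumably behave."""
    return token == word


def tashkeel_insensitive_match(token: str, word: str) -> bool:
    """Compare both sides after removing diacritics."""
    return strip_tashkeel(token) == strip_tashkeel(word)


ILLA = "\u0625\u0644\u0627"               # إلا   (as written in the rule)
ILLA_SHADDA = "\u0625\u0644\u0651\u0627"  # إلّا  (with shadda, as in the failing sentence)

print(literal_match(ILLA, ILLA_SHADDA))               # exact comparison
print(tashkeel_insensitive_match(ILLA, ILLA_SHADDA))  # comparison ignoring tashkeel
```

If the matcher normalized tokens this way before comparing, the first example sentence would presumably be caught as well; whether that belongs in the rule or in the tokenizer is an open question.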