linuxscout / pyarabic

GNU General Public License v3.0

Normalization of Number words #32

Closed mAboshokor closed 4 years ago

mAboshokor commented 5 years ago

Most number words found in Arabic text are written in a normalized form, since Arabic speakers often write "الف" instead of "ألف". Since the module already includes various normalization methods, it would be saner to match the normalized version of the word instead of the original form.

linuxscout commented 5 years ago

Thanks for this proposal, but the word "الف" is a misspelling. I think we can allow it through a parameter, but not as a correct form, since that would lead to extracting wrong phrases. I think we can add a parameter with a default value to the function that enables matching of normalized words.
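A minimal sketch of that idea, for illustration only: it is not the code from the PR and not pyarabic's API. The function name `number_word_value`, the `normalize` parameter, its default of `False`, and the two-word lexicon are all assumptions. Correct spellings always match, while hamza-less spellings match only when the caller opts in.

```python
# Hypothetical sketch; every name here is illustrative, not pyarabic's API.

# Map alef forms carrying hamza or madda to bare alef (U+0627).
_ALEF_NORMALIZATION = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above
    "\u0625": "\u0627",  # alef with hamza below
    "\u0622": "\u0627",  # alef with madda above
})

# Tiny illustrative lexicon keyed by the correct spelling.
NUMBER_WORDS = {
    "\u0623\u0644\u0641": 1000,          # ألف
    "\u0623\u0631\u0628\u0639\u0629": 4,  # أربعة
}

# The same lexicon keyed by the normalized spelling, built once at import time.
_NORMALIZED_NUMBER_WORDS = {
    word.translate(_ALEF_NORMALIZATION): value
    for word, value in NUMBER_WORDS.items()
}


def number_word_value(word, normalize=False):
    """Return the numeric value of a number word, or None if it is not one.

    With normalize=False (assumed default) only correct spellings match.
    With normalize=True, spellings that differ only in the alef form match too.
    """
    if word in NUMBER_WORDS:
        return NUMBER_WORDS[word]
    if normalize:
        key = word.translate(_ALEF_NORMALIZATION)
        return _NORMALIZED_NUMBER_WORDS.get(key)
    return None
```

With this shape, callers who process informal text can pass `normalize=True`, while the default behaviour still treats only the correct spelling as a number word.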

mAboshokor commented 5 years ago

Hi Dr., can you please recheck the PR at #33?

linuxscout commented 5 years ago

I'm reviewing it.