sohaibafifi / languagetool

A fork of languagetool to maintain Arabic
https://languagetool.org
GNU Lesser General Public License v2.1

Punctuation check #48

Closed linuxscout closed 3 years ago

linuxscout commented 3 years ago

Salam. LT doesn't detect some Arabic typos, such as spaces before certain punctuation marks:

قال : سلاما السلام ؟ ما أحسن البيت ! تحية ؛

We also need to add punctuation variants such as ؟ and ؛. Fixing this requires extending some classes, such as WhitespacesBeforePunctuationRule. Thanks.
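To make the expected behaviour concrete, here is a minimal standalone sketch of the check. It does not use LanguageTool's rule API: the class name `ArabicPunctuationSpaceSketch`, its methods, and the exact set of punctuation marks (colon, exclamation mark, Arabic question mark U+061F, Arabic semicolon U+061B, Arabic comma U+060C) are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of LanguageTool.
public class ArabicPunctuationSpaceSketch {

    // A run of whitespace that is immediately followed by one of the
    // punctuation marks we care about (assumed set, see the text above).
    private static final Pattern SPACE_BEFORE_PUNCT =
        Pattern.compile("\\s+(?=[:!\\u061F\\u061B\\u060C])");

    // Returns the start offset of every whitespace run that precedes punctuation.
    public static List<Integer> findOffsets(String text) {
        List<Integer> offsets = new ArrayList<>();
        Matcher m = SPACE_BEFORE_PUNCT.matcher(text);
        while (m.find()) {
            offsets.add(m.start());
        }
        return offsets;
    }

    // Returns the text with the offending whitespace removed.
    public static String fix(String text) {
        return SPACE_BEFORE_PUNCT.matcher(text).replaceAll("");
    }
}
```

A real rule would report each offset as a match with a suggested correction instead of rewriting the text directly.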

linuxscout commented 3 years ago

I managed to extend WhitespacesBeforePunctuationRule. I am now working on the question mark.

linuxscout commented 3 years ago

Salam. I removed WhitespacesBeforePunctuationRule and replaced it with a new class, ArabicPunctuationWhitespaceRule, but I have some problems: I changed the following default (core) files:

languagetool-core/src/main/java/org/languagetool/AnalyzedTokenReadings.java
languagetool-core/src/main/java/org/languagetool/tokenizers/WordTokenizer.java

I added the Arabic question mark, comma, and semicolon to the tokenizing characters.
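As an illustration of what adding these characters to the tokenizing set does, here is a small standalone sketch built on `java.util.StringTokenizer`. It is not the actual code of WordTokenizer: the class name `ArabicTokenizingCharsSketch` and both delimiter strings are hypothetical, and the base delimiter list is only a short assumed sample.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper, not part of LanguageTool.
public class ArabicTokenizingCharsSketch {

    // Assumed sample of general delimiters (whitespace and Latin punctuation).
    private static final String BASE_DELIMITERS = " \t\n.,:;!?()";

    // Arabic question mark, Arabic comma, Arabic semicolon.
    private static final String ARABIC_DELIMITERS = "\u061F\u060C\u061B";

    // Splits the text so that every delimiter becomes its own token.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st =
            new StringTokenizer(text, BASE_DELIMITERS + ARABIC_DELIMITERS, true);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }
}
```

Keeping the Arabic characters in a separate constant, as here, suggests one possible way to confine the change to an Arabic-specific tokenizer rather than the shared core class, though whether that fits the project's structure is for the maintainers to judge.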

I think I did this the wrong way. Can you review it? Thanks.