olivierkes / manuskript

An open-source tool for writers
http://www.theologeek.ch/manuskript
GNU General Public License v3.0
1.75k stars 232 forks

Is there a way we can take some words off the word count? #1197

Open giseledute opened 1 year ago

giseledute commented 1 year ago

While browsing the “Issues” tab, I saw someone complaining that the dash is not counted as a word when placed between two words. I write in Portuguese, and in my language the dash is not a word; it's a punctuation mark. So my question is: would it be possible for us to add “words” that should be ignored in the count? Right now I always have to set a higher word “goal” to cover the number of times I use a dash, usually about 100 words more, which shouldn't be necessary. Anyway, that's my question! Thank you all in advance.

TheJackiMonster commented 1 year ago

In general I like the idea of making word counting configurable depending on the language and potentially even the style of writing. The issue is that counting words is currently a potential bottleneck for responsiveness, because we need to count all the words in a file every time a single character changes.

The current implementation uses a regular expression for that, which is a lot faster than previous implementations. However, regular expressions have the downside of being rather complicated to adjust (I assume it wouldn't be great to expose the expression directly to users).
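To illustrate the request without exposing a regular expression to users, here is a minimal sketch of a counter with a user-editable ignore list. It is not Manuskript's actual implementation: the pattern `TOKEN_RE`, the function `count_words`, and the `ignored` parameter are all hypothetical names, and the sketch assumes a "word" is simply any run of non-whitespace characters.

```python
import re

# Assumption: a token is any run of non-whitespace characters.
# Manuskript's real pattern may differ.
TOKEN_RE = re.compile(r"\S+")

# Hypothetical default ignore list: hyphen, en dash, em dash.
DEFAULT_IGNORED = frozenset({"-", "\u2013", "\u2014"})


def count_words(text, ignored=DEFAULT_IGNORED):
    """Count tokens in `text`, skipping any token listed in `ignored`."""
    return sum(1 for m in TOKEN_RE.finditer(text) if m.group() not in ignored)


# A stand-alone dash is skipped; passing an empty set restores the raw count.
sample = "Ela disse \u2014 n\u00e3o sei \u2014 e saiu."
with_ignore = count_words(sample)
without_ignore = count_words(sample, ignored=frozenset())
```

The user would only edit a plain list of tokens, while the pattern itself stays fixed. The extra set lookup per token does add some cost, so it does not address the performance concern by itself.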

So before we add an option to configure that, I think performance needs to improve. For example, reducing the portion of text that needs to be processed after a change, as well as deferring the counting so it runs asynchronously, would help.
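One way to reduce the portion of text that gets processed could be to cache counts per paragraph, so that an edit only triggers a recount of the paragraphs that changed. The sketch below is an assumption about how that might look, not code from Manuskript; `IncrementalCounter`, its `recounted` attribute, and the whitespace-based pattern are hypothetical.

```python
import re

# Assumption: a token is any run of non-whitespace characters.
TOKEN_RE = re.compile(r"\S+")


class IncrementalCounter:
    """Caches per-paragraph counts so unchanged paragraphs are not rescanned."""

    def __init__(self):
        self._cache = {}     # paragraph text -> word count from the last call
        self.recounted = 0   # paragraphs actually scanned (for illustration)

    def count(self, text):
        total = 0
        fresh = {}
        for para in text.split("\n"):
            # Reuse a count from this pass or from the previous pass.
            n = fresh.get(para, self._cache.get(para))
            if n is None:
                n = sum(1 for _ in TOKEN_RE.finditer(para))
                self.recounted += 1
            fresh[para] = n
            total += n
        # Keep only paragraphs that still exist, so the cache cannot grow
        # without bound.
        self._cache = fresh
        return total


counter = IncrementalCounter()
first = counter.count("one two\nthree")        # scans both paragraphs
second = counter.count("one two\nthree four")  # scans only the edited one
```

This still splits and hashes the whole text on every call, so a real editor would more likely use the changed block reported by the text widget. The delayed, asynchronous part of the suggestion (for example restarting a short timer on each keystroke and counting only when it fires) is not shown here.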