isaaclyman / novel-word-count-obsidian

Obsidian plugin. Displays a word count or other statistic for each file, folder and vault in the File Explorer pane.
https://obsidian.md/plugins?id=novel-word-count
MIT License

Inaccurate counting of mixed Chinese and English #71

Closed alchemy-lee closed 10 months ago

alchemy-lee commented 10 months ago

Hi, when I write mixed Chinese and English in a note, the word count from novel-word-count is inaccurate. For example:

hello 世界世界世界 hello hello

Obsidian status bar:

image

novel-word-count:

image

The actual word count is 9 (3 English words plus 6 Chinese characters), but novel-word-count reports only 6.

isaaclyman commented 10 months ago

For performance reasons, I've been counting each note as space-delimited or CJK, rather than trying to incorporate both into one RegEx. This is the first complaint I've had, but it doesn't look like it would be too difficult to combine them and possibly remove the setting to switch between them. I'll try it and see how it affects performance.
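To illustrate the trade-off described above, here is a minimal TypeScript sketch. It is not the plugin's actual code: the function names, the character ranges, and the combined pattern are all assumptions chosen for illustration. It contrasts two single-mode counters with one combined RegEx that treats each CJK character as a word and each run of other non-whitespace characters as a word.

```typescript
// Hypothetical helpers for illustration only; not taken from the plugin.
// Assumed CJK ranges: CJK Unified Ideographs, Hiragana/Katakana, Hangul syllables.
const CJK = "\\u4e00-\\u9fff\\u3040-\\u30ff\\uac00-\\ud7af";

// Space-delimited mode: every run of non-whitespace characters is one word.
function countSpaceDelimited(text: string): number {
  return (text.match(/\S+/g) ?? []).length;
}

// CJK mode: every CJK character is one word; everything else is ignored.
function countCjk(text: string): number {
  return (text.match(new RegExp(`[${CJK}]`, "g")) ?? []).length;
}

// Combined mode: a single CJK character OR a run of non-space, non-CJK characters.
function countMixed(text: string): number {
  const pattern = new RegExp(`[${CJK}]|[^\\s${CJK}]+`, "g");
  return (text.match(pattern) ?? []).length;
}

const sample = "hello 世界世界世界 hello hello";
console.log(countSpaceDelimited(sample)); // 4
console.log(countCjk(sample)); // 6
console.log(countMixed(sample)); // 9
```

Under these assumptions, the CJK-only counter returns 6 for the sample sentence, which is consistent with the number reported in this issue, while the combined pattern returns the expected 9. The sketch makes no claim about relative performance.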

isaaclyman commented 10 months ago

Please note, however, that the word count will likely never match up between Obsidian's status bar and Novel Word Count. See #38.

isaaclyman commented 10 months ago

After some research, I'm going to say no to this one for now. Counts for mixed-language notes would require a much more advanced RegEx, versus the simple, fast, and easily understandable ones I use now. You'll have to decide which counting method you prefer.

At some point I may experiment with a custom lexer and see if it can keep up with String.prototype.match while also allowing higher logic than a RegEx, but no promises.
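As a rough idea of what such a lexer could look like, the following TypeScript sketch scans the text once and counts a word at each CJK character and at the start of each run of other non-whitespace characters. The names `isCjk`, `isWhitespace`, and `lexCount` are hypothetical, the character ranges are assumptions, and nothing here is confirmed to be what the plugin does. Whether a loop like this can keep up with String.prototype.match has not been measured.

```typescript
// Hypothetical single-pass lexer; an illustration, not the plugin's implementation.

// Assumed CJK ranges (Basic Multilingual Plane only).
function isCjk(code: number): boolean {
  return (
    (code >= 0x4e00 && code <= 0x9fff) || // CJK Unified Ideographs
    (code >= 0x3040 && code <= 0x30ff) || // Hiragana and Katakana
    (code >= 0xac00 && code <= 0xd7af) // Hangul syllables
  );
}

// Assumed whitespace set: ASCII whitespace plus the ideographic space.
function isWhitespace(code: number): boolean {
  return code === 0x20 || (code >= 0x09 && code <= 0x0d) || code === 0x3000;
}

function lexCount(text: string): number {
  let count = 0;
  let inWord = false; // true while inside a run of non-CJK, non-space characters

  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (isCjk(code)) {
      count++; // each CJK character counts as its own word
      inWord = false;
    } else if (isWhitespace(code)) {
      inWord = false; // whitespace ends the current run
    } else if (!inWord) {
      count++; // first character of a new space-delimited word
      inWord = true;
    }
  }
  return count;
}

console.log(lexCount("hello 世界世界世界 hello hello")); // 9
```

Because the loop carries explicit state, extra rules (for example, skipping Markdown syntax or comments) could be added as further branches, which is the kind of "higher logic" a single RegEx makes awkward.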

isaaclyman commented 9 months ago

This has been fixed in v3.4.0. Mixed-language notes should now be counted correctly.