OleMchls / atom-wordcount

Counts the words in your current document
https://atom.io/packages/wordcount
MIT License
38 stars, 27 forks

Feature Request: LaTeX Word Count #65

Open nbrewk opened 8 years ago

nbrewk commented 8 years ago

In the same way that word counts are supported for Markdown files, it would be ideal for wordcount to ignore LaTeX syntax and provide an accurate word count.
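
To make the request concrete, here is a minimal sketch of what "ignoring LaTeX syntax" could mean. It is not the package's code and not how TeXcount works: the function name `count_latex_words` and the regular expressions are assumptions chosen for illustration. It drops comments and command names, keeps command arguments, and counts what remains.

```python
import re


def count_latex_words(source: str) -> int:
    """Rough word count that skips LaTeX comments and command names.

    Hypothetical helper: a naive approximation, not a LaTeX parser.
    """
    # Drop comments: an unescaped % through the end of the line.
    text = re.sub(r"(?<!\\)%.*", "", source)
    # Drop \begin{...} and \end{...} together with the environment name.
    text = re.sub(r"\\(?:begin|end)\{[^}]*\}", " ", text)
    # Drop remaining command names (\section, \textbf, ...) but keep arguments.
    text = re.sub(r"\\[A-Za-z]+\*?", " ", text)
    # Drop escaped single characters such as \% or \&.
    text = re.sub(r"\\.", " ", text)
    # Count runs of letters/digits, allowing inner apostrophes and hyphens.
    return len(re.findall(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*", text))


sample = "\\section{Intro} % a comment here\nHello \\textbf{brave} world.\n"
print(count_latex_words(sample))  # 4: Intro, Hello, brave, world
```

A sketch like this miscounts math, labels, citations, and a `\\` line break followed by a comment, which is part of why a dedicated tool may be the better route.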

OleMchls commented 8 years ago

Great idea, I'd love to see a PR for that 👍

Aerijo commented 6 years ago

@nbrewk @OleMchls IMO the best way to word count LaTeX files is to use a dedicated tool like TeXcount or wordcount.tex. Supporting these would require significant changes and additions to this package and its scope, so I've written a separate package dedicated to LaTeX word count support. Right now it can count words per document, file, section, and selection. I would be interested in providing the results to this package, though, so it can display a summary in the status bar (at the moment I just show a notification containing the TeXcount output, with some basic formatting).

OleMchls commented 6 years ago

@Aerijo Generally, I think smarter counting per file format is a good idea. However, we would need a way to use different counting functions per file extension. If you want to contribute this, I will happily coordinate with you on how best to achieve it.

If we succeed with this approach, I would happily accept a super advanced TeX counting module/function for the wordcount package.
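
One way to picture "different functions per file extension" is a small registry that maps an extension to a counting function and falls back to a default. This is only a sketch under assumptions: the names `COUNTERS`, `count_plain`, `count_tex`, and `count_words` are hypothetical and do not come from the wordcount package, and the `.tex` counter is a deliberately crude placeholder.

```python
import os
import re
from typing import Callable, Dict

CountFn = Callable[[str], int]


def count_plain(text: str) -> int:
    """Default counter: every whitespace-separated token is a word."""
    return len(re.findall(r"\S+", text))


def count_tex(text: str) -> int:
    """Placeholder LaTeX counter: strips comments and command names."""
    text = re.sub(r"(?<!\\)%.*", "", text)
    text = re.sub(r"\\[A-Za-z]+\*?", " ", text)
    return len(re.findall(r"[A-Za-z0-9]+", text))


# Hypothetical registry: extension (lower case) -> counting function.
COUNTERS: Dict[str, CountFn] = {".tex": count_tex}


def count_words(filename: str, text: str) -> int:
    """Pick the counter registered for the file extension, else the default."""
    ext = os.path.splitext(filename)[1].lower()
    return COUNTERS.get(ext, count_plain)(text)


print(count_words("notes.txt", "one two % three"))  # 4
print(count_words("paper.tex", "one two % three"))  # 2
```

With this shape, a more advanced TeX counter could later replace `count_tex` in the registry without touching the default path.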

josh-bridge commented 6 years ago

If this is not possible in the short term: as I am using it now, the main inaccuracy seems to come from words in comments being counted. Is there an easy, language-agnostic way to exclude them from the count, or does the Atom code make that difficult?

OleMchls commented 6 years ago

@josh-bridge IIRC Atom has a built-in parser, so I think it would technically be possible to detect comments and ignore them. However, you would need to get this metadata from the Atom API, which is not really used for counting (yet).

mbroedl commented 6 years ago

@josh-bridge can you check whether mbroedl/atom-wordcount works for you (cf. #99)? I added an option to exclude specific scopes or to count only specific scopes. For Markdown, a simple comment.* would therefore ignore all comments. I don't know much about the LaTeX tokenisation, but I hope it lets you do what you planned. I will create a PR when people are happy.
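
The scope-exclusion idea can be sketched independently of Atom. The example below assumes the text has already been tokenised into (text, scopes) pairs, which stands in for whatever the editor's grammar provides, and it treats comment.* as a glob pattern. Both are assumptions for illustration: the fork's real option may match scopes differently, and `count_words_excluding` is a hypothetical name.

```python
import re
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence, Tuple

# (token text, scope names attached to that token by a grammar)
Token = Tuple[str, Sequence[str]]


def count_words_excluding(tokens: Iterable[Token], excluded: List[str]) -> int:
    """Count words in tokens whose scopes match none of the excluded patterns."""
    total = 0
    for text, scopes in tokens:
        if any(fnmatchcase(s, pat) for s in scopes for pat in excluded):
            continue  # e.g. a comment token: skip it entirely
        total += len(re.findall(r"\w+", text))
    return total


tokens = [
    ("Hello world ", ["text.md"]),
    ("<!-- hidden note -->", ["text.md", "comment.block.html"]),
    ("again", ["text.md"]),
]
print(count_words_excluding(tokens, ["comment.*"]))  # 3
print(count_words_excluding(tokens, []))             # 5
```

Because the filter only looks at scope names, the same pattern would apply to any grammar that marks comments with a comment.* scope, which is what makes the approach language-agnostic.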