Open nbrewk opened 8 years ago
Great idea, I'd love to see a PR for that 👍
@nbrewk @OleMchls IMO the best way to count words in LaTeX files is to use a dedicated tool like TeXcount or wordcount.tex. These would require significant changes and additions to this package and its scope, so I've written a package dedicated to adding LaTeX word count support. Right now, it can count words per document, file, section, and selection. I would be interested in providing the results to this package though, so it can display a summary in the status bar (right now I just show a notification containing the TeXcount output, with some basic formatting).
@Aerijo Generally I think smarter counting per file format is a good idea. However, we would need a way to register different counting functions per file extension. If you want to contribute this, I will happily coordinate with you on how best to achieve it.
If we succeed with this approach, I would happily take a super advanced TeX counting module/function for the wordcount package.
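As a rough illustration of what "different functions per file extension" could look like, here is a minimal sketch. Every name in it (`Counter`, `countPlain`, `countLatex`, `counters`, `countWords`) is hypothetical and not taken from the wordcount package, and the LaTeX counter is a deliberately crude regex approximation, not a replacement for TeXcount.

```typescript
// Hypothetical sketch: none of these names exist in the wordcount package.
type Counter = (text: string) => number;

// Default counter: count runs of non-whitespace characters.
const countPlain: Counter = (text) => (text.match(/\S+/g) || []).length;

// Crude LaTeX counter (an assumption, far simpler than TeXcount):
// strip unescaped % comments, command names, and grouping characters.
const countLatex: Counter = (text) => {
  const stripped = text
    .replace(/(^|[^\\])%.*$/gm, "$1") // drop "% ..." comments, keep "\%"
    .replace(/\\[a-zA-Z]+\*?/g, " ")  // drop command names such as \section
    .replace(/[{}\[\]]/g, " ");       // drop braces and brackets
  return countPlain(stripped);
};

// Registry mapping a lowercase file extension to its counter.
const counters = new Map<string, Counter>([[".tex", countLatex]]);

// Return the extension including the dot, or "" if there is none.
function extensionOf(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  return dot >= 0 ? fileName.slice(dot).toLowerCase() : "";
}

// Pick the counter registered for the file's extension, else the default.
function countWords(fileName: string, text: string): number {
  const counter = counters.get(extensionOf(fileName)) || countPlain;
  return counter(text);
}
```

With a registry like this, a dedicated LaTeX package could in principle supply its own counter for `.tex` while every other file type keeps the default behaviour.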
If this is not possible in the short term, the main inaccuracy I see in my current use comes from words in comments being included. Is there an easy, language-agnostic way to exclude them from the count, or does the Atom code make that difficult?
@josh-bridge IIRC Atom has a built-in parser, so I think technically it would be possible to detect comments and ignore them. However, you would need to get this metadata from the Atom API, which is not really used for counting (yet).
@josh-bridge can you try whether mbroedl/atom-wordcount works for you (cf. #99)?
I added an option to exclude specific scopes or to count only specific scopes. For Markdown, a simple comment.* would therefore ignore all comments. I don't know much about the LaTeX tokenisation, but I hope you will be able to do what you planned.
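To make the scope-exclusion idea concrete, here is a small sketch of how such a filter might work. It is not the code from the fork: the `Token` shape, the `countOutsideScopes` function, and the choice to treat each pattern as an unanchored regular expression tested against every scope name are all assumptions made for illustration.

```typescript
// Hypothetical token shape: a piece of text plus the grammar scopes
// that apply to it (assumed for this sketch).
interface Token {
  value: string;
  scopes: string[];
}

// Count words in all tokens whose scopes match none of the excluded
// patterns. Each pattern is treated as an unanchored regular expression.
function countOutsideScopes(tokens: Token[], excluded: string[]): number {
  const patterns = excluded.map((p) => new RegExp(p));
  let total = 0;
  for (const token of tokens) {
    const skip = token.scopes.some((scope) =>
      patterns.some((re) => re.test(scope))
    );
    if (skip) continue;
    total += (token.value.match(/\S+/g) || []).length;
  }
  return total;
}

// Hand-written example tokens; the scope names are illustrative.
const exampleTokens: Token[] = [
  { value: "Some visible text ", scopes: ["source.gfm"] },
  { value: "<!-- hidden note -->", scopes: ["source.gfm", "comment.block.html"] },
];

const withoutComments = countOutsideScopes(exampleTokens, ["comment.*"]);
const everything = countOutsideScopes(exampleTokens, []);
```

In this sketch, `withoutComments` covers only the first token, while `everything` also counts the comment delimiters and the words inside the comment. Because the pattern is unanchored, it would also match a scope that merely contains "comment" somewhere in its name.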
I will create a PR when people are happy.
In the same way that word counts are supported for Markdown files, it would be ideal for wordcount to be able to ignore LaTeX syntax and provide an accurate word count.