crsh / papaja

papaja (Preparing APA Journal Articles) is an R package that provides document formats to produce complete APA manuscripts from RMarkdown-files (PDF and Word documents) and helper functions that facilitate reporting statistics, tables, and plots.
https://frederikaust.com/papaja_man/

[Feature request] Automate YAML wordcount #403

Open levolz opened 4 years ago

levolz commented 4 years ago

Hi, it would be great if there were a way to automate the word count specified in the YAML. It's always a small stab to the heart to knit a document, see the word count reported by the Lua filter, and then have to go back and insert it just to knit again.

Cheers, Leonhard [this issue was supported by the communicative power of Twitter]

conig commented 4 years ago

Could the Lua filter be modified to replace a string like '{{wordcount}}' with the calculated count? Or is the header containing the word count not yet appended to the rest of the document at that point?
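A rough sketch of what such a placeholder replacement could look like as a pandoc Lua filter. Everything here is an assumption for illustration: the helper names `fill_placeholder` and `count_words` are made up, the counting rule is a simplification rather than the one papaja's 'wordcount.lua' uses, and it only helps if the `wordcount` meta field is still available to the filter when it runs, which is exactly the open question above.

```lua
-- Hypothetical sketch, not papaja's implementation.
-- Assumes pandoc's Lua filter API: pandoc.walk_block, pandoc.Div,
-- pandoc.utils.stringify, and the pandoc.Pandoc constructor.

local PLACEHOLDER = "{{wordcount}}"

-- Replace every literal occurrence of the placeholder in a string.
-- Braces are not magic characters in Lua patterns, so no escaping is needed.
local function fill_placeholder(text, count)
  return (text:gsub(PLACEHOLDER, tostring(count)))
end

-- Simplified count: every Str element containing an alphanumeric character.
local function count_words(blocks)
  local n = 0
  pandoc.walk_block(pandoc.Div(blocks), {
    Str = function(el)
      if el.text:match("%w") then n = n + 1 end
    end
  })
  return n
end

function Pandoc(doc)
  local meta = doc.meta
  if meta.wordcount ~= nil then
    local text = pandoc.utils.stringify(meta.wordcount)
    -- Plain-text search (fourth argument true) for the placeholder.
    if text:find(PLACEHOLDER, 1, true) then
      meta.wordcount = fill_placeholder(text, count_words(doc.blocks))
    end
  end
  return pandoc.Pandoc(doc.blocks, meta)
end
```

With this in place, the YAML would carry `wordcount: "{{wordcount}}"` and the filter would substitute the number during conversion, provided the field reaches the filter unchanged.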

crsh commented 4 years ago

Yes, something like this would be possible. Another possibility would be to check whether the meta field wordcount is missing and, if so, insert the result of the word count. I think this would be fairly straightforward.
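The second option might look roughly like the following pandoc Lua filter. This is only a sketch under assumptions: `fill_missing_wordcount` is a hypothetical helper name, the counting rule is a simplification rather than the one in papaja's 'wordcount.lua', and only a field that is entirely absent from the YAML is treated as missing.

```lua
-- Hypothetical sketch, not papaja's implementation.
-- Assumes pandoc's Lua filter API: pandoc.walk_block, pandoc.Div,
-- and the pandoc.Pandoc constructor.

-- Set meta.wordcount only when the author did not supply one.
local function fill_missing_wordcount(meta, count)
  if meta.wordcount == nil then
    -- A plain Lua string is accepted by pandoc as a metadata value.
    meta.wordcount = tostring(count)
  end
  return meta
end

function Pandoc(doc)
  -- Simplified count: every Str element containing an alphanumeric character.
  local count = 0
  pandoc.walk_block(pandoc.Div(doc.blocks), {
    Str = function(el)
      if el.text:match("%w") then count = count + 1 end
    end
  })
  return pandoc.Pandoc(doc.blocks, fill_missing_wordcount(doc.meta, count))
end
```

A manually specified `wordcount` in the YAML is left untouched, so authors can still override the automatic value.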

What makes this more difficult to implement is that, for historical reasons, keywords and word count are currently handled by the R-based preprocessor. So an implementation of this feature would need to translate this bit of code to Lua as well:

https://github.com/crsh/papaja/blob/1f124807b6acad09aaae156f0fb9d59590e6650c/R/apa6_formats.R#L458-L464

This is definitely doable and I'd be happy to accept a PR if anyone wants to take a crack at this (either just adding the word count to the meta field or both).

crsh commented 3 years ago

This may be a useful approach: https://twitter.com/pandoc_tips/status/1357102818298130435

Want to add a new element to Markdown without resorting to spans? Use *_my text_*. That's read as double-emphasized text and never used in normal text. Filter with

  function Emph (e)
    if #e.content == 1 and e.content[1].t == 'Emph' then
      -- use e.content[1].content
    end
  end

conig commented 3 years ago

This isn't a final solution, but if anyone is itching for this, you can add this code to 'wordcount.lua' at line 105:

  -- Write the body word count to a temporary file in the working directory
  local file = assert(io.open("wordcount.tmp", "w"))
  file:write(body_words)
  file:close()

This will write the word count to a file in your working directory. Then you can set the wordcount value in your YAML to:

`r readLines('wordcount.tmp', warn = FALSE)`

Unfortunately, this means the document shows the word count from the previous knit, so you'll need to knit twice whenever the word count has to be up to date.

crsh commented 3 years ago

Thanks for sharing this solution. I'll be sure to revisit it when I find the time to tackle this issue.