noamross / redoc

[NOTE: Project in suspended animation for now] Reversible Reproducible Documents
https://noamross.github.io/redoc

Set line wrap behavior in `options()` so it applies to RStudio Add-ins #47

Closed max-carey closed 5 years ago

max-carey commented 5 years ago

When I use redoc, unwanted line breaks are inserted that split the paragraphs of my new R Markdown document in seemingly arbitrary places.

Here is what I do: in RStudio I use File -> New File -> R Markdown and select the redoc template. Before rendering the file with the Knit button (Knit to redoc) in RStudio, I add a paragraph of lorem ipsum text. Before rendering, this paragraph occupies a single line, line 24. Here is the file as it stands before rendering to Word: example.txt

After rendering the file, I use the redoc add-in in RStudio, "dedoc to new file", which produces the following file: exampleDeDoc.txt

In the second file, you can see that the lorem ipsum text now spans lines 24-29 because unwanted line breaks have been inserted.

Session Info

```r
R version 3.5.3 (2019-03-11)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
[7] base

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1         rstudioapi_0.10    knitr_1.23
 [4] xml2_1.2.0         magrittr_1.5       uuid_0.1-2
 [7] R6_2.4.0           rlang_0.3.4        httr_1.4.0
[10] tools_3.5.3        xfun_0.6.3         htmltools_0.3.6
[13] yaml_2.2.0         digest_0.6.19      crayon_1.3.4
[16] zip_2.0.1          whoami_1.3.0       officer_0.3.4
[19] base64enc_0.1-3    evaluate_0.14      mime_0.6
[22] rmarkdown_1.12     stringi_1.4.3      compiler_3.5.3
[25] diffobj_0.2.2.9003 redoc_0.1.0.9000   jsonlite_1.6
```

Pandoc version (get with rmarkdown::pandoc_version): 2.5

RStudio version (if applicable): Version 1.2.1335

noamross commented 5 years ago

This is expected behavior, though I don't think I've figured out the best way to manage it. Because only double line breaks count as paragraph breaks in markdown, the two documents have exactly the same meaning, and the line breaks are not part of the information converted to the Word document. As a result, we have to decide how best to convert back to text. Those decisions are defined in dedoc(), whose wrap argument defaults to 80: when converting from docx to markdown, lines are wrapped at a width of 80 characters unless you specify otherwise.
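
To illustrate why the two files mean the same thing, here is a minimal sketch in base R. It uses strwrap() as a stand-in for the wrapping step (an assumption for illustration, not the code redoc actually runs) and checks that joining the wrapped lines with spaces gives back the original paragraph.

```r
# A one-line paragraph, like the lorem ipsum text on line 24.
para <- paste(
  rep("lorem ipsum dolor sit amet consectetur adipiscing elit", 6),
  collapse = " "
)

# Wrap at a width of 80, mirroring the default of dedoc()'s wrap argument.
# strwrap() is only a stand-in here, not redoc's implementation.
wrapped <- strwrap(para, width = 80)

# The paragraph now spans several lines.
spans_several_lines <- length(wrapped) > 1

# Single line breaks carry no meaning in markdown, so joining the lines
# with spaces recovers the original text.
same_text <- identical(paste(wrapped, collapse = " "), para)
```

Both checks should come out TRUE for plain text like this, which is the sense in which the wrapped and unwrapped files carry the same information.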

There are a few other formatting things that have similar effects. For instance, headers can be specified either like this:

# Header

or like this:

Header
======

but dedoc() will always generate the latter.
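
If you prefer the `#` form, one option is to convert the underlined headers back after dedoc() has run. The sketch below is a rough illustration only: `setext_to_atx()` is a hypothetical helper, not part of redoc, and it is deliberately naive. It does not skip YAML front matter or code blocks, so a `---` line inside either could be misread as a header underline.

```r
# Hypothetical helper (not part of redoc): turn underlined headers into
# "#"-style headers. A line of "=" marks level 1, a line of "-" level 2.
setext_to_atx <- function(lines) {
  out <- character(0)
  i <- 1
  while (i <= length(lines)) {
    nxt <- if (i < length(lines)) lines[i + 1] else ""
    has_text <- grepl("\\S", lines[i])
    if (has_text && grepl("^=+\\s*$", nxt)) {
      out <- c(out, paste("#", trimws(lines[i])))
      i <- i + 2  # skip the underline
    } else if (has_text && grepl("^-+\\s*$", nxt)) {
      out <- c(out, paste("##", trimws(lines[i])))
      i <- i + 2  # skip the underline
    } else {
      out <- c(out, lines[i])
      i <- i + 1
    }
  }
  out
}

converted <- setext_to_atx(c("Header", "------", "", "Body text"))
```

In practice you would read the .Rmd with readLines(), pass the lines through the helper, and write them back with writeLines(), keeping in mind the limitations above.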

Because of this, I included the other add-in, "Render and Update", which knits your document and replaces the .Rmd with the output of dedoc(), so that from then on only edits to the Word document produce further changes.

Since you can't pass arguments to the RStudio add-in, it uses the defaults. But I think it would be a good idea to let wrap and similar arguments be set through options(), so users can choose their preferred formatting. I don't think we should try to preserve the exact line breaks or other parts of the document that carry no semantic meaning. It would be too much of a lift.
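
A minimal sketch of the options() idea, under stated assumptions: the option name `redoc.wrap` and the function `rewrap()` are hypothetical names chosen for illustration, and strwrap() again stands in for the real wrapping step. The point is only the pattern of an argument whose default is read from options(), so that code called without arguments (as an add-in would call it) still picks up the user's preference.

```r
# Hypothetical function: the default for `wrap` is looked up in options()
# at call time and falls back to 80 when the option is unset.
rewrap <- function(text, wrap = getOption("redoc.wrap", 80)) {
  strwrap(text, width = wrap)
}

text <- paste(rep("lorem ipsum dolor sit amet", 10), collapse = " ")

# With no option set, lines are wrapped at the fallback width of 80.
default_lines <- rewrap(text)

# A user sets a preference once, for example in .Rprofile.
old <- options(redoc.wrap = 40)
narrow_lines <- rewrap(text)

# An explicit argument still takes precedence over the option.
explicit_lines <- rewrap(text, wrap = 60)

# Restore the previous option value.
options(old)
```

Because the default is evaluated when the function is called rather than when it is defined, changing the option mid-session takes effect immediately.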

In addition, I could probably find better ways to explain this in the documentation. Any ideas on the best places to do so, or on other changes that would help deal with this?