ropensci / unconf18

http://unconf18.ropensci.org/

Incorporate word doc track changes back into R markdown #76

Open goldingn opened 6 years ago

goldingn commented 6 years ago

I know a lot of folks who are put off using R markdown for reports and scientific papers because they know it'll be hard to incorporate the inevitable Word doc tracked changes they'll get back from their collaborators/supervisors.

I think pandoc could be used to convert both the original and the edited doc back to markdown, and the two could then be diffed to find the changes in the text part of the markdown (if it's a Word doc, only text will have changed). It may then be possible to detect how those bits of text correspond to the R markdown document, and apply the changes there.
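A minimal sketch of that convert-then-diff idea, written in Python purely for illustration (the thread itself is about an R workflow). The function names `docx_to_markdown` and `text_changes` are hypothetical; the pandoc call assumes pandoc is installed and on the PATH, and uses its `--track-changes=accept` option so the edited doc is read with the collaborator's changes applied.

```python
import difflib
import subprocess


def docx_to_markdown(path):
    """Convert a .docx file to markdown text by shelling out to pandoc.

    Assumes pandoc is on PATH. --track-changes=accept applies any tracked
    edits, so the output reflects the collaborator's version of the text.
    """
    result = subprocess.run(
        ["pandoc", "--from=docx", "--to=markdown", "--track-changes=accept", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def text_changes(original_md, edited_md):
    """Return (old_text, new_text) pairs for every word-level difference."""
    old_words = original_md.split()
    new_words = edited_md.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # An empty old_text means a pure insertion, an empty new_text a deletion.
            changes.append((" ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes
```

Diffing on words rather than lines is a design choice here: pandoc may re-wrap paragraphs, so line-based diffs would probably report many changes that are only differences in line breaks.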

It would also be nice to automate rendering of the changes. That can be done for PDFs via latexdiff (http://timotheepoisot.fr/2014/07/10/markdown-track-changes/). HTML might be trickier.

goldingn commented 6 years ago

This is a whim and not something I've thought much about. If someone says "this has already been done" or "this is not possible" then I'll be very happy to have my choice of project narrowed :)

goldingn commented 6 years ago

Somewhat related to #73

goldingn commented 6 years ago

OK, so this definitely duplicates part of #42. In fact it probably serves as a half-decent summary of another project idea from that thread: automating @mmulvahill's workflow for dealing with track changes from Word.

goldingn commented 6 years ago

> It may then be possible to detect how those bits of text correspond to the R markdown document, and apply the changes there.

I'm thinking you could just grep() for text matching that which changed in the original document, grabbing some sentences on either side to disambiguate multiple matches.
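A rough sketch of that matching step, in Python for illustration rather than with R's `grep()`. The function name `locate_change` and its `context_before` / `context_after` parameters are hypothetical; the context strings stand in for the "sentences on either side", and are only consulted when the changed text matches more than once.

```python
def locate_change(source, old_text, context_before="", context_after=""):
    """Return the start offset of old_text in source, or None if it cannot
    be pinned to exactly one place.

    The context strings are compared after trimming whitespace, so a
    differing line wrap at the boundary does not block the match.
    """
    if not old_text:
        # Pure insertions have no old text to search for; they would need
        # to be anchored on context alone, which this sketch does not do.
        return None
    hits = []
    start = source.find(old_text)
    while start != -1:
        hits.append(start)
        start = source.find(old_text, start + 1)
    if len(hits) > 1:
        # Ambiguous: keep only the hits whose surroundings match the context.
        hits = [
            h for h in hits
            if source[:h].rstrip().endswith(context_before.strip())
            and source[h + len(old_text):].lstrip().startswith(context_after.strip())
        ]
    return hits[0] if len(hits) == 1 else None
```

Returning `None` for anything still ambiguous leaves those cases to be resolved by hand instead of guessing.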

goldingn commented 6 years ago

A stretch goal could be to incorporate Word comments into the R markdown using this handy little trick

goldingn commented 6 years ago

I talk to myself in GitHub issues quite a lot.

mmulvahill commented 6 years ago

So I started about here:

> This is a whim and not something I've thought much about.

But this sounds convincing the more I think about it:

> I'm thinking you could just grep() for text matching that which changed in the original document, grabbing some sentences on either side to disambiguate multiple matches.

Questions I'm thinking about: Would an accept/reject prompt in the REPL for changed text simplify the process enough to be beneficial? How do we handle inline chunks? How would this fit into most folks' overall workflow?

goldingn commented 6 years ago

Yeah exactly, I was imagining we could detect all the changes first, let the user know how many there were, and give the option to step through them all, show a nice diff for each and prompt the user to accept or reject. Providing an 'accept all' option would be useful too.
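One way the step-through could look, sketched in Python for illustration. The names `review_changes` and `ask_on_console` are hypothetical, and the sketch assumes each change is an `(old, new)` pair whose `old` text is non-empty and occurs once in the document. The decision function is passed in so that a console prompt, a viewer-based prompt, or a blanket "accept all" can be swapped freely.

```python
def review_changes(text, changes, decide):
    """Apply the accepted (old, new) changes to text and return the result.

    decide(index, total, old, new) must return "accept", "reject" or
    "accept all"; after "accept all" the remaining changes are applied
    without asking again.
    """
    accept_rest = False
    total = len(changes)
    for index, (old, new) in enumerate(changes, start=1):
        choice = "accept" if accept_rest else decide(index, total, old, new)
        if choice == "accept all":
            accept_rest = True
            choice = "accept"
        if choice == "accept":
            # Replace only the first occurrence; assumes old is unique in text.
            text = text.replace(old, new, 1)
    return text


def ask_on_console(index, total, old, new):
    """Prompt on the console; any answer other than 'a' or 'all' rejects."""
    print(f"Change {index} of {total}:\n  - {old}\n  + {new}")
    answer = input("[a]ccept, [r]eject, accept [all]? ").strip().lower()
    return {"a": "accept", "all": "accept all"}.get(answer, "reject")
```

Calling `review_changes(doc, changes, ask_on_console)` would walk through the changes interactively, reporting the total count with each prompt.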

Yeah inline code chunks could be tricky, and I imagine there will be plenty of other gotchas too!

We might be able to spot inline chunks in the original document and handle them separately. We could spot them either by searching for `r*`, or based on differences between the .Rmd and .md files (in a similar way as for tracked changes). The internals of knitr may have some of the tools mapped out; I've not looked inside it before.
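The first option, searching the source for inline chunks, could be sketched like this (Python for illustration; `INLINE_R` and `inline_chunks` are hypothetical names). The pattern assumes the usual R Markdown inline form of a backtick, the letter `r`, whitespace, an expression, and a closing backtick; it does not consult knitr and may miss unusual cases.

```python
import re

# A backtick, "r", one whitespace character, then everything up to the
# next backtick. Plain inline code such as `code` is not matched.
INLINE_R = re.compile(r"`r\s[^`]+`")


def inline_chunks(rmd_text):
    """Return (start, end, source) for each inline R chunk in rmd_text."""
    return [(m.start(), m.end(), m.group(0)) for m in INLINE_R.finditer(rmd_text)]
```

Knowing the offsets of the inline chunks would let a matching step skip over them, or flag any tracked change that overlaps a computed value so it can be handled by hand.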

goldingn commented 6 years ago

This is a really nice example of a non-REPL prompt: https://github.com/dreamRs/prefixer !

mmulvahill commented 6 years ago

Nice use of the viewer! It sounds like most or all of the pieces we would need are available. And like it would fit well into a collaborative statistician's workflow.

The way these issues are going I'll be convinced to use RStudio again by Wednesday. 🙂

zachary-foster commented 6 years ago

I like this idea. The crayon package could also be used to make attractive diffs on the R console for those that don't use RStudio.
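As a rough picture of what such a console diff might look like, here is a sketch in Python that uses raw ANSI escape codes in place of crayon (the function name `colored_diff` and the `[-...-]` / `{+...+}` markers are illustrative choices, not anything crayon provides). Deleted words are shown in red and inserted words in green.

```python
import difflib

RED, GREEN, RESET = "\033[31m", "\033[32m", "\033[0m"


def colored_diff(old, new):
    """Return one string showing the word-level differences between old and new.

    Deletions are wrapped in [-...-] and colored red, insertions in {+...+}
    and colored green, so the diff stays readable without color support.
    """
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.append(" ".join(old_words[i1:i2]))
            continue
        if i2 > i1:
            pieces.append(RED + "[-" + " ".join(old_words[i1:i2]) + "-]" + RESET)
        if j2 > j1:
            pieces.append(GREEN + "{+" + " ".join(new_words[j1:j2]) + "+}" + RESET)
    return " ".join(pieces)
```

Printing `colored_diff("a linear model", "a mixed model")` in a terminal that understands ANSI colors would show "linear" in red and "mixed" in green.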