MilesMcBain / markdrive

Edit Google docs in Markdown with a little help from #rstats

Retrieve comments from docs? #4

Open markdly opened 7 years ago

markdly commented 7 years ago

Following up on this thread: https://twitter.com/MilesMcBain/status/902677408532250624

It would be great to have a way to access the comments made in Google Docs. I can see this can be done using the existing Google Drive API (https://github.com/markdly/googledrive-comment), but I'm not sure how best to incorporate this information with a Google Doc that has been converted to markdown / rmarkdown.

I wonder if it is better to take advantage of the markdrive approach, where the Google Doc is converted to Word (.docx) first before it is converted to markdown.

To do this I thought the pandoc option --track-changes=all might be a workable solution. This is taken from http://pandoc.org/MANUAL.html:

--track-changes=accept|reject|all

Specifies what to do with insertions, deletions, and comments produced by the MS Word "Track Changes" feature. accept (the default), inserts all insertions, and ignores all deletions. reject inserts all deletions and ignores insertions. Both accept and reject ignore comments. all puts in insertions, deletions, and comments, wrapped in spans with insertion, deletion, comment-start, and comment-end classes, respectively. The author and time of change is included. all is useful for scripting: only accepting changes from a certain reviewer, say, or before a certain date. This option only affects the docx reader.

I think changing this:

https://github.com/MilesMcBain/markdrive/blob/4b8c33cb616e7104cb016371019246d7fd5b247b/R/gd_interface.R#L51-L52

to this will enable it:

    system(command = paste0("pandoc --track-changes=all -f docx -t markdown -o \"", remote_doc$name, ".md\"",
                            " \"", remote_doc$local_path, "\""))

MilesMcBain commented 7 years ago

Thanks for this Mark! So this does something pretty useful. With this change, if I gdoc_checkout() a file with comments, they make their way through to markdown like this:

    <span class="comment-start" id="1" author="Miles McBain"
    date="2017-09-05T10:19:51Z">A comment \#2</span>
    TEXT THAT WAS COMMENTED ON. BLAH BLAH.
    <span class="comment-end" id="1"></span>

Now the return trip with this comment syntax doesn't work, which is a problem. But perhaps there is an opportunity to do some pre/post-processing of the .md file to make something nice.

On the pre-processing side: I'm wondering whether the comments could be transformed to a more markdown-esque syntax.

On the post-processing side: The comments could be parsed and re-inserted via the API when pushing.
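
As a rough illustration of the parsing half of the post-processing idea, the comment spans can be pulled out of the markdown with a regular expression. This is only a sketch under assumptions: `extract_comments` is a hypothetical helper (not part of markdrive), it expects the attributes in the same order as the example output above (class, id, author, date), and it does not attempt the API push.

```r
# Hypothetical helper: extract pandoc comment spans from a markdown string.
# Assumes the attribute order shown in the example output in this thread.
extract_comments <- function(md) {
  pattern <- paste0(
    "(?s)<span class=\"comment-start\"\\s+id=\"([^\"]*)\"\\s+author=\"([^\"]*)\"",
    "\\s+date=\"([^\"]*)\">(.*?)</span>(.*?)",
    "<span class=\"comment-end\"\\s+id=\"\\1\"></span>"
  )
  # Full matches first, then capture groups for each match.
  hits  <- regmatches(md, gregexpr(pattern, md, perl = TRUE))[[1]]
  parts <- regmatches(hits, regexec(pattern, hits, perl = TRUE))
  data.frame(
    id      = vapply(parts, `[`, character(1), 2),
    author  = vapply(parts, `[`, character(1), 3),
    date    = vapply(parts, `[`, character(1), 4),
    comment = vapply(parts, `[`, character(1), 5),
    target  = trimws(vapply(parts, `[`, character(1), 6)),
    stringsAsFactors = FALSE
  )
}

# Example input mirroring the span output shown above.
md <- paste(
  "<span class=\"comment-start\" id=\"1\" author=\"Miles McBain\"",
  "date=\"2017-09-05T10:19:51Z\">A comment \\#2</span>",
  "TEXT THAT WAS COMMENTED ON. BLAH BLAH.",
  "<span class=\"comment-end\" id=\"1\"></span>",
  sep = "\n"
)
comments <- extract_comments(md)
```

The function takes the whole file as a single string, so a file read with `readLines()` would first need collapsing with `paste(lines, collapse = "\n")`. Each row of the result pairs a comment with the text it was attached to, which is the information a push step would need.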

I'll play around with some syntax ideas. If you know of any precedents, I'm all ears.

MilesMcBain commented 7 years ago

Also I've pushed the addition of the flag to the track_changes branch.

markdly commented 7 years ago

Hi Miles, apologies for taking so long to respond!

I haven't really seen anything robust in terms of handling comments and markdown except for things like this discussion https://stackoverflow.com/a/20885980/8475145

By 'more markdown-esque syntax', do you mean something easier on the eye than a heap of <span> tags? Something like this perhaps?

    [//]: # (Comment-1. A comment)
    TEXT THAT WAS COMMENTED ON. BLAH BLAH.
    [//]: # (End-1)
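
A syntax like this could be produced from the span output with two regular-expression substitutions. The following is a minimal sketch, assuming the spans look like the `comment-start` / `comment-end` example earlier in this thread; `to_link_comments` is a hypothetical name, not a markdrive function. It drops the author and date, and comment text containing a closing parenthesis may not parse as intended.

```r
# Hypothetical helper: rewrite pandoc comment spans as [//]: # (...) lines.
# Assumes "id" directly follows "class" in the span, as in the example output.
to_link_comments <- function(md) {
  start <- "(?s)<span class=\"comment-start\" id=\"([^\"]*)\"[^>]*>(.*?)</span>"
  end   <- "<span class=\"comment-end\" id=\"([^\"]*)\"></span>"
  # \\1 is the comment id, \\2 is the comment text.
  md <- gsub(start, "[//]: # (Comment-\\1. \\2)", md, perl = TRUE)
  gsub(end, "[//]: # (End-\\1)", md, perl = TRUE)
}

# Example input mirroring the span output from earlier in the thread.
md <- paste(
  "<span class=\"comment-start\" id=\"1\" author=\"Miles McBain\"",
  "date=\"2017-09-05T10:19:51Z\">A comment \\#2</span>",
  "TEXT THAT WAS COMMENTED ON. BLAH BLAH.",
  "<span class=\"comment-end\" id=\"1\"></span>",
  sep = "\n"
)
out <- to_link_comments(md)
cat(out, "\n")
```

Because the id is kept in both markers, a later step could still match each `Comment-N` line to its `End-N` line when pushing back.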

Interestingly, if I convert a native Word doc using a standalone install of pandoc, the comments in my markdown appear like this:

    I am a word document with [I am a comment]{.comment-start id="0"
    author="Mark Dulhunty" date="2017-09-01T11:20:00Z"}a comment
    []{.comment-end id="0"}

I'm not very familiar with using pandoc directly. When I get the chance I'll have a look to see why your comments are converted differently to mine...