ropensci-archive / wishlist

:no_entry: ARCHIVED :no_entry:
https://discuss.ropensci.org/c/wishlist/6

mergr: Tools for mixed-workflow collaboration #12

Closed noamross closed 2 years ago

noamross commented 9 years ago

This is on my to-do list but putting it here might help me move it along. I may try to make some progress during the next Mozilla sprint.

The idea of mergr would be to leverage the googlesheets, driver, rdrop2, and git2r packages to allow collaboration between people using git-based workflows and those using other services that have version control.

The essential functionality would be to import the full version history of a Google Doc/Dropbox file into a git repository, and update it as needed. This allows a git repository to contain the full project history, while allowing some collaborators to work on other preferred services more appropriate for single-file or concurrent workflows.
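A minimal sketch of that import step, assuming the revisions have already been downloaded into a list ordered oldest first. The function name `import_revisions` and the fields `id`, `author`, `modified`, and `content` are illustrative assumptions, not part of mergr or of any Drive/Dropbox API. It uses git2r to write each revision to one tracked file and commit it, recording the revision id in the commit message.

```r
library(git2r)

# Hypothetical helper: replay downloaded revisions as successive git commits.
# `revisions` is assumed to be ordered oldest first; each element is a list
# with the illustrative fields id, author, modified and content.
import_revisions <- function(repo_path, file_name, revisions) {
  dir.create(repo_path, recursive = TRUE, showWarnings = FALSE)
  repo <- if (dir.exists(file.path(repo_path, ".git"))) {
    git2r::repository(repo_path)
  } else {
    git2r::init(repo_path)
  }
  # Assumption: a placeholder committer identity, set for this repository only.
  git2r::config(repo, user.name = "mergr import",
                user.email = "mergr@example.org")

  for (rev in revisions) {
    writeLines(rev$content, file.path(repo_path, file_name))
    git2r::add(repo, file_name)
    # Skip revisions that leave the file unchanged, since there is
    # nothing staged to commit in that case.
    if (length(git2r::status(repo)$staged) == 0) next
    git2r::commit(repo, message = sprintf(
      "Import revision %s by %s (%s)", rev$id, rev$author, rev$modified))
  }
  invisible(repo)
}

# Usage with made-up revisions; the third repeats the second and is skipped.
revs <- list(
  list(id = "1", author = "alice", modified = "2015-04-01", content = "first draft"),
  list(id = "2", author = "bob", modified = "2015-04-02", content = "second draft"),
  list(id = "3", author = "bob", modified = "2015-04-03", content = "second draft")
)
repo <- import_revisions(tempfile("mergr-demo"), "doc.txt", revs)
```

Running the function again later with a longer revision list would need to skip revisions already imported; this sketch does not track that.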

For instance, a non-programmer may use a Google Doc or an MS Word file to write text that is processed via ArchieML, or provide data via a continually updated spreadsheet, and a collaborator could keep the version history of both in a git repository.

Other potential services to connect would be Etherpads or Dat. I don't know enough about Dat to know whether this would be an appropriate use-case, but it makes sense to start collaboration on an Etherpad and then move it into a git repository.

I have one proof of concept function here.

sckott commented 9 years ago

@noamross Sounds interesting. Do you have any real world use case at work to test this tool?

karthik commented 9 years ago

Hi @noamross I like this idea in principle but foresee a lot of challenges in practice. Correct me if I'm wrong, but in practice this seems like a one-way information transfer. Activity on Google Sheets, Dropbox and Google Drive could be captured in the git history easily. So this could serve as a way to archive a non-git workflow into a git workflow after a project is completed.

I fail to see how this could work in an active collaboration. If I work only in Git, how do recent changes (say I made 4 commits on a doc) go back into any of these cloud services? You can't rewrite the history of a Dropbox file (or if you cobbled together a rebase, it would be rather ugly).

But regardless of implementation challenges, I think it's a super cool idea for you to pursue.

noamross commented 9 years ago

@sckott The first use-case that comes to mind is doing collaborative writing on an etherpad or google doc and then moving to a more structured workflow, but wanting to keep history (and authorship info). Lots of writing projects come to mind (e.g., collaborative lesson writing, drafting a paper or paper section, a web graphic built with ArchieML). But they are mostly hypothetical at this point.

@karthik Yes, this is largely a one-way workflow, though you could upload successive commits to Google Docs (git commit ...; mergr upload history ...). I don't think this would be a one-time archive, though. A git user would commit gdoc history repeatedly over the course of a project because it's useful for the git user to be able to restore a project state, including both code and gdoc data. For instance, if your project broke because of a combination of changes in the code and your gdoc, you could back up to where it worked, make changes to the code, and/or restore your gdoc to a previous version. (Automated commit messages could store gdoc/dropbox version numbers for this purpose.)
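One way the automated commit messages might carry version numbers, as a rough sketch in base R. The message format and both function names are assumptions made for illustration; mergr does not define them.

```r
# Hypothetical message format that records which service revision a
# commit corresponds to.
revision_message <- function(service, file_id, revision_id) {
  sprintf("mergr: sync %s file %s at revision %s",
          service, file_id, revision_id)
}

# Recover the revision id from such a message, or NA if the message
# does not follow the format above.
revision_from_message <- function(message) {
  pattern <- "^mergr: sync ([^ ]+) file ([^ ]+) at revision ([^ ]+)$"
  m <- regmatches(message, regexec(pattern, message))[[1]]
  if (length(m) == 0) NA_character_ else m[4]
}

msg <- revision_message("gdoc", "abc123", "42")
rev_id <- revision_from_message(msg)
```

With a format like this, checking out an old commit tells you which gdoc/Dropbox revision to restore alongside the code.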

jennybc commented 9 years ago

I am interested in thinking about how Google Spreadsheets, for example, figure into workflows (googlesheets/issues/9). Sounds like you're way ahead of me, so will look forward to seeing what you do.

I have no info on how/if one can really programmatically retrieve historical versions of a Google Sheet. So I'll be interested to learn about that if it's indeed possible.

noamross commented 9 years ago

@jennybc You can definitely get historical versions of a google sheet as exported files with the Drive API, like so:

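# Install and load the driver package from GitHub
library(devtools)
install_github('Ironholds/driver')
library(driver)

# Key of the Google Sheet whose history we want
gap_key <- "1HT5B8SgkKqHdqHJmn5xiuaC04Ngb7dG9Tv94004vezA"

# List the revisions, then download the second listed one as CSV
revs <- list_revisions(gap_key)
download_revision(file_id = gap_key, version = revs$items[[2]]$id,
                  download_type = 'csv', destination = 'gapminder.csv')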

I also note that you'll get the same result if you just add &revision=NUMBER to the CSV export link stored in a googlesheets object.
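A small base R sketch of that URL tweak. The helper name and the example link are placeholders, and falling back to `?` when the link has no query string is an added assumption. The sketch only builds the string and does not check whether a given export link actually honors the parameter.

```r
# Hypothetical helper: append a revision parameter to a CSV export link.
revision_export_url <- function(export_link, revision) {
  # Use "&" when the link already has a query string, otherwise "?".
  sep <- if (grepl("?", export_link, fixed = TRUE)) "&" else "?"
  paste0(export_link, sep, "revision=", revision)
}

# Placeholder link, not a real Google Sheets export URL.
url <- revision_export_url("https://example.org/export?format=csv", 2)
```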

However, so far I haven't found a way to get the cell or row feeds from a previous version of a spreadsheet.

karthik commented 9 years ago

> though you could upload successive commits to Google Docs (git commit ...; mergr upload history ...).

This is exactly what I mean by after the fact (either direction) and not for an active collaboration. If I'm working on a Google doc and you on git, you can grab my latest copy, push 4 revisions, then put my copy back on top (a poor man's rebase of sorts). But this has the potential to introduce many conflicts that cannot be easily resolved.

karthik commented 9 years ago

PS: I'll add the rdrop2 revision feature shortly. Looks like it would be immediately useful to you.

maelle commented 2 years ago

Thank you!

Note that future ideas should go to our wishlist forum category.