git synchroniztion - Githubissues

colonelpanic8 commented 7 years ago

The possibility of new synchroniztion methods was mentioned way back in 2015 https://plus.google.com/+Orgzly/posts/S2nBbxVibsB

Did anything become of this? Are there plans to add git synchronization support to orgzly?

angrybacon commented 7 years ago

Google Drive synchronization is wanted too.

nevenz commented 7 years ago

You can open a separate issue for Google Drive sync, thanks!

ghost commented 7 years ago

@IvanMalison

When you say git synchronization, were you just talking about committing to a local repo, or also potentially pulling/pushing as part of the sync from that repo to a remote?

I'm just wondering what the full scope of this feature would be.

I like the idea of having the "git client" handle the conflicts during a replay as part of pull-before-push, perhaps adding a new commit with the conflict left in it verbatim, so that I can resolve it later. Like:

 <<<<<<< HEAD:mergetest
 This is my third line
 =======
 This is a fourth line I am adding
 >>>>>>> 4e2b407f501b68f8588aa645acafffa0224b9b78:mergetest

I think this would be a huge improvement from the limited options of force-load/force-save that Orgzly currently provides.

jappeace commented 7 years ago

Shouldn't this be done by a separate program? Or for example make a library from the core features of: https://github.com/maks/MGit and then use those in this project to provide nice integration.

mkaito commented 7 years ago

I use Resilio for phone-to-computer sync. Orgzly-to-sdcard is the kind of auto sync I want. Git might be a good backend for conflict resolution, but staying in sync automatically will outright prevent many conflicts from ever happening.

El 3 de mayo de 2017 7:15:32 GMT+01:00, Jappie Klooster notifications@github.com escribió:

Shouldn't this be done by a separate program? Or for example make a library from the core features of: https://github.com/maks/MGit and then use those in this project to provide nice integration.

-- Sent from my Android device with K-9 Mail. Please excuse my brevity.

angrybacon commented 7 years ago

@mkaito That would prevent the user to edit a file that is not up to date ie. no offline mode.

The thing is that different users want different ways of synchronizing Orgzly with their Emacs. Git is one backend of interest but I think it's more of something like Tasker's job to update the repository. Automatic synchronizing like you suggest makes a lot of sense with Google Drive on the other hand (like Google Drive documents), but then again you would need to handle the cases where the user edit a document while offline.

mkaito commented 7 years ago

I think the constant need to manually synchronize are a bigger concern than the occasional lack of connection.

I'm not saying there won't ever be conflicts, just that most happen when you forget to hit the sync button.

El 3 de mayo de 2017 16:04:03 GMT+01:00, Mathieu Marques notifications@github.com escribió:

@mkaito That would prevent the user to edit a file that is not up to date ie. no offline mode.

-- Sent from my Android device with K-9 Mail. Please excuse my brevity.

angrybacon commented 7 years ago

@mkaito Well then pick another sync backend than Git :-)

mkaito commented 7 years ago

I just said that using git would make merges a lot more painless when conflicts do happen.

El 3 de mayo de 2017 16:33:56 GMT+01:00, Mathieu Marques notifications@github.com escribió:

@mkaito Well then pick another sync backend than Git :-)

-- Sent from my Android device with K-9 Mail. Please excuse my brevity.

jappeace commented 7 years ago

So I set this up with MGit and then use the file system to synchronize, this works ok-ish although the workflow is a little cumbersome because you have to switch apps to commit and push and pull. What currently happens is I press synchronize and orgzly writes out to the MGit repo (I assume that before that the changes exist inside the app assigned memory). Then I do git stuff with MGit.

So what is expected of this issue precisely?

Do you want to auto detect we have a git repo and when you press synchronize automatically commit push and pull just like the github app on windows?
Do you want to make a MGit like UI inside orgzly?
Something else?

Personally I'd prefer an auto synchronize thing, and let the advanced stuff be done with an app like MGit.

You thoughts please.

mkaito commented 7 years ago

This issue is about automating what the "synchronize" button does. Nothing else. Git came up as a backend system to use for conflict resolution, not as an additional app that you manually use.

El 9 de mayo de 2017 16:30:20 GMT+01:00, Jappie Klooster notifications@github.com escribió:

So I set this up with MGit and then use the file system to synchronize, this works ok-ish although the workflow is a little cumbersome because you have to switch apps to commit and push and pull. What currently happens is I press synchronize and orgzly writes out to the MGit repo (I assume that before that the changes exist inside the app assigned memory). Then I do git stuff with MGit.

So what is expected of this issue precisely?

Do you want to auto detect we have a git repo and when you press synchronize automatically commit push and pull just like the github app on windows?

Do you want to make a MGit like UI inside orgzly?

Something else?

Personally I'd prefer an auto synchronize thing, and let the advanced stuff be done with an app like MGit.

You thoughts please.

-- Sent from my Android device with K-9 Mail. Please excuse my brevity.

jappeace commented 7 years ago

So when the user presses the synchronize button, git will commit local changes, pull in remote changes and push local changes. If everything goes according to plan, otherwise panic and let the user figure it out?

mkaito commented 7 years ago

We could figure out what specifics work best. Perhaps stash, pull, stash pop, commit and push? Or commit, fetch, rebase?

El 9 de mayo de 2017 17:20:42 GMT+01:00, Jappie Klooster notifications@github.com escribió:

So when the user presses the synchronize button, git will commit local changes, pull in remote changes and push local changes. If everything goes according to plan, otherwise panic and let the user figure it out?

-- Sent from my Android device with K-9 Mail. Please excuse my brevity.

jappeace commented 7 years ago

I wouldn't want to do the stash approach, because I think it will prevent conflict detection, and I do believe on a conflict the user should be consulted, rather than choosing for the user. I honestly don't know what happens if a conflict occurs during a stash apply, does it just overwrite?

Then I think automatic merges are better than rebases, because a merge is non-destructive. At least according to the tutorial I just read on rebases. We can also not know if upstream is a public branch or not, therefore committing + merging is the correct solution I believe.

colonelpanic8 commented 7 years ago

Then I think automatic merges are better than rebases

yes, definitely. automatic rebases are a bad idea

alphapapa commented 7 years ago

Then I think automatic merges are better than rebases, because a merge is non-destructive. At least according to the tutorial I just read on rebases.

Merges are non-destructive in the sense that they don't alter commit history, but if a merge doesn't apply cleanly, the merge commit is potentially destructive if the user doesn't resolve the conflicts correctly.

A rebase is destructive in the sense that it alters the commit history of the branch that's rebased, but a rebase that applies cleanly is non-destructive to the data within each commit.

colonelpanic8 commented 7 years ago

@alphapapa

Merges are non-destructive in the sense that they don't alter commit history, but if a merge doesn't apply cleanly, the merge commit is potentially destructive if the user doesn't resolve the conflicts correctly.

Nope, its never destructive, because EVERYTHING that was there before is still there (i.e. the commits exist in their original context). If the user doesn't resolve conflicts correctly, its easy to go back to the states before the commit and perform the merge again.

A rebase is destructive in the sense that it alters the commit history of the branch that's rebased, but a rebase that applies cleanly is non-destructive to the data within each commit.

Hmmm. define cleanly. If you take the above scenario I mentioned (i.e. a conflict resolution takes place, but it ends up being incorrect, it is actually impossible to recover the original commit in its original context). The conflict resolution isn't even recorded as conflict resolution (as it is with merges), but as part of the existing commits.

If by cleanly you mean "without dropping into conflict resolution", even that is not really true. Git IS NOT a patch oriented VC, which means the "contents" of a commit is actually the entire state of the repository at that time, not something like a patch file that you might apply to a repository. Git can sort of make it seem liek this is what it stores when you look at diffs, but this is not the case.

alphapapa commented 7 years ago

Nope, its never destructive, because EVERYTHING that was there before is still there (i.e. the commits exist in their original context). If the user doesn't resolve conflicts correctly, its easy to go back to the states before the commit and perform the merge again.

Yes, in the context of the entire commit history, it's non-destructive. But in the context of the merge commit itself, it is destructive. And especially in the context of Org files, it's easy to delete data and never notice it's missing.

Resolving conflicts on a smartphone screen is likely to be error-prone. Especially consider cases where the user may be in a low-visibility or unstable environment, like in a moving vehicle, in direct sunlight, etc: it's not difficult to imagine a scenario where a user accidentally taps the wrong hunk, resolves the conflict, and never notices that the wrong one was selected. In such a case, having the commit history isn't going to save him--at least, it's not guaranteed to, because first he has to notice the data is missing.

This has actually happened to me a few times: data seemed to be missing from an Org file, and I was thankfully able to dig it out of a git commit from months earlier. But who knows if that's happened without my noticing it. Even so, finding the needle commit in a haystack of thousands of commits over months or years is not an easy task, even with gitk.

The bottom line for me is that doing git merges on a smartphone or tablet is probably a bad idea in general; and it's an especially bad idea for anyone not already familiar with git merges. If someone wants to implement such a feature for "advanced users," cool, but I think it would be a mistake to promote it to "normal" users.

Hmmm. define cleanly.

A clean rebase is one that git processes automatically, without conflicts. Or, in git terms, a fast-forward.

If by cleanly you mean "without dropping into conflict resolution", even that is not really true. Git IS NOT a patch oriented VC, which means the "contents" of a commit is actually the entire state of the repository at that time, not something like a patch file that you might apply to a repository. Git can sort of make it seem liek this is what it stores when you look at diffs, but this is not the case.

Git is not darcs in that it fundamentally does not operate on patch theory, but it certainly is patch-oriented for some operations. Git was designed to work with patches from the beginning, as it was designed for the kernel and the LKML. It simply generates patches dynamically when needed, or uses them when they're given.

jappeace commented 7 years ago

But in the context of the merge commit itself, it is destructive

So the way I interpreted non-destructive is 'loss of information'. Sure you can fuckup the merge commit badly, but you didn't loose any information (can always go back with a checkout and try again). From the article I linked:

Merging is nice because it’s a non-destructive operation. The existing branches are not changed in any way.

Resolving conflicts on a smartphone screen is likely to be error-prone.

My idea was to just not even support this. If a conflict occurs, we panick and roll back. We only accept fast forwards and automerges.

The bottom line for me is that doing git merges on a smartphone or tablet is probably a bad idea

I guess you mean the manual stuff, with which I agree, you shouldn't do that on the phone.

Alternatively on a panick you could offer to push local changes to the remote on a branch. This allows user to merge manually on a better suited device.

colonelpanic8 commented 7 years ago

Git is not darcs in that it fundamentally does not operate on patch theory, but it certainly is patch-oriented for some operations. Git was designed to work with patches from the beginning, as it was designed for the kernel and the LKML. It simply generates patches dynamically when needed, or uses them when they're given.

Sure, that doesn't change my point, which is that when it comes down to it, EVEN cleanly applied rebases ARE destructive, and because of the way org works, I can imagine scenarios where a cleanly applied rebase could have undesireable results (think subheadings etc.), which would make it important to have the exact original context for a change.

The bottom line for me is that doing git merges on a smartphone or tablet is probably a bad idea in general; and it's an especially bad idea for anyone not already familiar with git merges. If someone wants to implement such a feature for "advanced users," cool, but I think it would be a mistake to promote it to "normal" users.

I don't think anyone was advocating this sort of thing. @jappeace and I were simply suggesting that going with merges here would be better than going with rebases.

alphapapa commented 7 years ago

We find ourselves in violent agreement. :)

aspiers commented 7 years ago

Great discussion all! I have one minor correction, and a suggestion. Firstly the correction:

A clean rebase is one that git processes automatically, without conflicts. Or, in git terms, a fast-forward.

Sorry, but that's not quite right. In the git world, fast-forwarding implies that new commits are not being created, but it is possible, and in fact more common, that rebases will happen cleanly without conflicts and yet still create new commits. Actually it doesn't really make much sense to talk about fast-forwards in the context of rebasing; it's usually more applicable to merges and cherry-picks.

Secondly, the suggestion:

I agree that automatic syncing would drastically reduce the frequency of conflicts. But automatic syncing is not mutually incompatible with using git - you can have your cake and eat it! Have any of you guys looked at the git-annex assistant? That's exactly what it does, and it even has an Android version ...

colonelpanic8 commented 7 years ago

the git-annex assistant?

I don't think git annex is a good solution since it doesnt keep the actual content of the files in the git repository. Since we ARE dealing with text files and not files that are too large to handle a simple automatically sychronized repository shoudl be good enough (basically https://github.com/hbons/SparkleShare)

colonelpanic8 commented 7 years ago

I'm going to take a swing at this.

One issue that I forsee from looking at the existing sync code is that its not going to be super easy to batch up changes to multiple files in a single commit because there is no "finalize synchronization action". I'm assunming this is probably fine.

nevenz commented 7 years ago

I'm going to take a swing at this.

Nice!

One issue that I forsee from looking at the existing sync code is that its not going to be super easy to batch up changes to multiple files in a single commit because there is no "finalize synchronization action".

Yeah, though doing a single push would be nice.

colonelpanic8 commented 7 years ago

@nevenz I'm thinking about maybe making a change to the repo api to allow synchronization to be done more safely with git. I want to ensure that synchronization is only ever performed on top of the appropriate revision of a previous version of the file, so I would like the storeBook method to also take as an argument the previous revision. Does this sound resonable?

colonelpanic8 commented 7 years ago

@nevenz Another issue I'm having is that I would like to allow the user to specify a location on disk in which to store the git repo IN ADDITION to the actual uri that will be used to indentify the repo (which will also be used as the uri that is stored on the book). This doesn't fit in well with the existing system of having exactly 1 uri for each book.

nevenz commented 7 years ago

I'm thinking about maybe making a change to the repo api to allow synchronization to be done more safely with git. I want to ensure that synchronization is only ever performed on top of the appropriate revision of a previous version of the file, so I would like the storeBook method to also take as an argument the previous revision. Does this sound resonable?

So what the plan in the end for conflict resolution? This suggest you'd always call storeBook, even on (what current repos types would call) a conflict?

Another issue I'm having is that I would like to allow the user to specify a location on disk in which to store the git repo IN ADDITION to the actual uri that will be used to indentify the repo

Why, does he care about that directory? I don't think he should, that sounds like "implementation detail" to me. It can be just a checkout in app's cache directory.

Depends on implementation I guess - if we can start with listing all the steps in different scenarios, that would be great.

tulth commented 7 years ago

I think a new repo class for git would be neat to add someday. It's unclear to me where the responsibility for merging master would live

colonelpanic8 commented 7 years ago

So what the plan in the end for conflict resolution? This suggest you'd always call storeBook, even on (what current repos types would call) a conflict?

@nevenz

No. Here is the rough idea:

Retrieving/syncing from the repository:

Every time we update/sync a "book" that is connected to a git repository, we store 2 things:
- the SHA1 of the blob of the associated file
- the SHA1 of the commit at HEAD
As we're doing the update, we ensure that the previous commit SHA1 is in the repository history that we are syncing with. If it is not, we will display a dialog warning the user about data loss, but allowing them to perform a "force upgrade" if they want to.

Writing data from orgzly to the repository:

Check if the SHA1 of the book's file blob is the same as it was when we last synced, and that there are no changes in the working tree/index. NOTE: We DO NOT do try to fetch any new changes first.
If these conditions are met (they always will be provided that no other program messes with the repository) we can simply apply the update that we want to apply and make a new commit. Note that this is much more permissive than using the SHA1 of the commit since it will allow changes to other files or meta data in the repository.
If the SHA1 of the book's file blob has changed. We make a new temporary branch at the commit at the time of last sync and apply our changes.
Then we attempt a merge with what was in the repository.
If there are conflicts in the merge, the user must manually resolve the conflicts. We will give them two options for how to proceed:
- a) Abort the merge, but keep the commit that was made with the new orgzly changes, and push it to a new branch of the users choosing on the remote. This will allow the user to resolve conflicts on a computer and then push them back to a branch as appropriate. We will need some type of branch selection interface to allow the user to return to their master branch at some point, as with this option orgzly will now be syncing with a different branch.
- b) Leave the branch in a conflicted state. It will be up to the user to use some program on their device (MGit) etc. to resolve the conflicts and do all the other git things that need to be done to restore orgzly's ability to sync.
Only now will we fetch from the remote repository (assuming that we are now in the happy case of synchornization) we attempt another merge, using the same procedure as above if there are merge conflicts.
Finally we push to the remote.

This whole procedure is designed to do three things:

a) Only ever safely apply updates from orgzly and vice versa, making data loss impossible b) reduce, as much as possible, the possibility of merge conflicts c) Make it as easy as possible to resolve merge conflicts when they do arise

Why, does he care about that directory? I don't think he should, that sounds like "implementation detail" to me. It can be just a checkout in app's cache directory.

Because, some people will want to perform git operations (like conflict resolution) using applications other than orgzly.

tulth commented 7 years ago

I am thinking along the same lines and I am working a (non git) hash based file sync check. See pull request #172

colonelpanic8 commented 7 years ago

I think a new repo class for git would be neat to add someday. It's unclear to me where the responsibility for merging master would live

@tulth @nevenz Do you have any idea/preferences about how I should go about storing both the SHA1 of the blob and the SHA1 of the commit, I can do some things like come up with some nasty way of encoding that in the revision string, but Ideally I'd like to do something else.

I feel like the repo/rook/versioned rook classes may be slightly overfitted to the existing repo implementations (Dropbox, Directory). Perhaps the best approach would be to not try to fit the details of all the different sync implementations with these classes? I'd love to hear your thoughts on this.

I am thinking along the same lines and I am working a (non git) hash based file sync check. See pull request #172

How do you plan on handling conflict resolution without git? There's not really a ton you can do there right?

tulth commented 7 years ago

How do you plan on handling conflict resolution without git? There's not really a ton you can do there right?

It would be conflict detection and reporting... not resolution. It detects conflicts by: test1) book newer than the last repo sync test2) current calculated hash mismatched the hash stored in db

if !test1 && !test2: nothing to do, already in sync if !test1 && test2: repo newer, need to read repo and write to db if test1 && !test2: book newer, need to write book to repo if test1 && test2: conflict

edit: fixed logic typos

colonelpanic8 commented 7 years ago

The thing is that different users want different ways of synchronizing Orgzly with their Emacs. Git is one backend of interest but I think it's more of something like Tasker's job to update the repository. Automatic synchronizing like you suggest makes a lot of sense with Google Drive on the other hand (like Google Drive documents), but then again you would need to handle the cases where the user edit a document while offline.

@angrybacon

I strongly disagree here. Having to a) buy tasker b) Set up a sync job that involves BOTH orgzly and tasker and mgit c) Configure the directories and repository, ssh keys appropriately across both

is always going to be more cumbersome and flaky than having in application synchronization that can sync at critical moments (i.e. after a file is touched, when orgzly is opened), display sync status in application (current SHA last time of update)

Personally I'd prefer an auto synchronize thing, and let the advanced stuff be done with an app like MGit.

@jappeace I really don't understand why you'd want this. By having the git In app, you avoid having to separately go in to mgit to push/pull etc., and you avoid all the cognitive load of the different levels of syncing (copy to working tree, commit, push/pull etc.). Theres just so many place where things can go wrong if you dont have it controlled in one place. Furthermore, by having the git synchorniztion in app, you can do things like record the sha of last synchronization, and make sure that you are only applying changes to the exact file state that you last synchronized with. Without this guarantee, there are a bunch of ways to accidentally clobber/lose information.

colonelpanic8 commented 7 years ago

@tulth Where exactly are you planning to implement these checks in the code base.

The current Repo methods do not receive the VersionedRook where it is needed. I actually have a WIP branch where I have added a VersionedRook paramter to storeBook, renameBook and retrieveBook. Perhaps I should make a separate pull request with just this change so that we can avoid making conflicting changes to the Repo interface?

EDIT: Hmm, it seems that you are using the CurrentRooksClient

nevenz commented 7 years ago

It would be good to start from existing Repo interface and see what we need for git. We can improve it or change it as much as we want. I'd just like to avoid a rogue implementation.

Currently, the logic for what needs to be synced is the same for all repo types. It's in updateStatus. It was done so storeBook and retrieveBook can be dumb and just do exactly that.

But if we need a repo-specific decision-making, we could perhaps create a new interface for that. A standard implementation could be created, which repos would return by default, but they'd also be free to create their own. Something along those lines.

I'd suggest trying to implement Repo for start and see how far we can go with that. It'll be easy to move the code around later.

I'm not sure I understand the need for all the things @IvanMalison mentioned when describing the steps (thanks for that BTW), though that could be just by Git ignorance. I have to go through all the comments in this issue one more time.

Personally I'd probably go for:

Pull master
Commit changed notebooks to local orgzly branch
Merge orgzly to master (complain if it fails)
Import all changed notebooks
Push master

Using blob hash a revision and trying to implement Repo. That would be a great start.

There'd be issues popping up all over the place probably, but it would be easier to see what we need exactly and how to keep it as simple as possible.

BTW, as for the additional configuration needed for Git (directory, credentials, etc.) - it's not really Git-specific, many repo types will have to have more things then just the URL. This data can be stored in DB or properties, for each repo.

One thing to have in mind (after seeing properties in #173) - there could be multiple git repositories defined by user. One private, one public, one for worg for example. ;) So this needs to be handled too.

I'd probably use DB, even though there's a lot of boilerplate code when doing so. But some tables could be shared between different repository types. For example table with user/pass, or for ssh keys etc.

colonelpanic8 commented 7 years ago

@nevenz

It would be good to start from existing Repo interface and see what we need for git. We can improve it or change it as much as we want. I'd just like to avoid a rogue implementation.

Yeah, that is what I'm in the process of doing. I started by writing the git synchronizer which will be used by this repo class.

But if we need a repo-specific decision-making, we could perhaps create a new interface for that. A standard implementation could be created, which repos would return by default, but they'd also be free to create their own. Something along those lines.

The main things we need for git are:

a) a way to store directory in addition to git uri b) a way to store revision hash in addition to blob hash c) the ability to sync back changes after uploading (since a merge could have occurred) d) A solution for pushes that avoids pushing after every single file change (although this could be delayed for now)

Personally I'd probably go for:

Pull master Commit changed notebooks to local orgzly branch Merge orgzly to master (complain if it fails) Import all changed notebooks Push master Using blob hash a revision and trying to implement Repo. That would be a great start.

This is basically what is being done, but we are also storing blob hashes to avoid merges in more cases.

Also, this sync scheme is MUCH more different than the existing one - notice how many things happen at the start of synchronization and then at the end of synchronization.

Repo simply would not work for the procedure you are describing here (how do you return the versioned rook for the first file you've imported when you havent made a commit yet for ex.)

One thing to have in mind (after seeing properties in #173) - there could be multiple git repositories defined by user. One private, one public, one for worg for example. ;) So this needs to be handled too.

The properties I added were for git author, and ssh key, not for directory location. Do you think people will want to add multiple ssh keys and have different git authors per repo. Perhaps that is something we can do eventually, but I don't think it's something we need for v1.

nevenz commented 7 years ago

a way to store revision hash

This is the part I didn't get - what do we need this again? And can we do without it for now?

Also, this sync scheme is MUCH more different than the existing one - notice how many things happen at the start of synchronization and then at the end of synchronization.

We can start with having onStart and onEnd kind of methods in Repo.

Repo simply would not work for the procedure you are describing here (how do you return the versioned rook for the first file you've imported when you havent made a commit yet for ex.)

But revision is available when you add file to index, which should be enough to create VersionRook. Unless I'm misunderstanding something.

Anyway, I do realize there will be need for some major changes, I'm just kinda trying to push this to its limit, then go back from there.

The properties I added were for git author, and ssh key, not for directory location. Do you think people will want to add multiple ssh keys and have different git authors per repo. Perhaps that is something we can do eventually, but I don't think it's something we need for v1.

Yeah, we'll definitely need everything stored separately per repo.

I'm fine with one step at the time of course, but the problem can be not breaking user's setup later. For example if we start by storing email as git_email in preferences, what do we do in v2 when we support multiple repos. It's why it might be necessary to leave some room for that right from the start. Which might end up being very close to completely supporting it.

We can have a SharedPreferences per repo, something like repo.ID.xml?

tulth commented 7 years ago

Right now I have the repo grab a global app context to get the content provider. I feel like this is ok because the sqldb and files are both global resources on the filesystem... no difference really.

colonelpanic8 commented 7 years ago

This is the part I didn't get - what do we need this again?

Lets say that the state of the repo has been changed such that the sha that we have stored for some book is not the sha of that file in the git repository. If we don't store the original revision that we got the file state from, we basically have no way to continue, because we need to apply our update on top of a comitt which has the appropriate state for the file.

You might argue that "The git repo should never get in to this state.", but I would like this repository sync to be as robust as possible.

Another reason to keep the repo string around is that when we are syncing back from the repository, we want to make sure that the last synced revision is a parent of the commit that we are syncing.

We could get by with ONLY the repository sha (i.e. without the file contents sha), as we could always just use this technique of restoring to an old state, but this would require almost every commit to be a merge commit (and would cause more issues with rebasing).

I'm fine with one step at the time of course, but the problem can be not breaking user's setup later. For example if we start by storing email as git_email in preferences, what do we do in v2 when we support multiple repos. It's why it might be necessary to leave some room for that right from the start. Which might end up being very close to completely supporting it.

We can have a SharedPreferences per repo, something like repo.ID.xml?

I'm not super familiar with android programming -- how do you create such templated xml files?

Would we have to make a new view to configure all of this for each git synced repository? Where should this live in the settings menu?

One of the sort of thorny issues is that these settings are not actually per book, but per repository, so they could be associated with multiple books.

colonelpanic8 commented 7 years ago

@nevenz Another issue we need to deal with is the fact that there is an assumption of uniqueness of uris in the CurrentRooksClient. This will not apply to git repositories as my plan is to store relative (to the root of the git repository) uris in the rook.

Would you prefer that absolute paths were stored here? That just ends up not being very tidy with the existing interfaces.

colonelpanic8 commented 7 years ago

@nevenz As far as storing the file/commit sha, I guess that perhaps the best thing to do is to store the commit sha, because the file sha can always be retrieved from that...

colonelpanic8 commented 7 years ago

@nevenz Can you explain why there is inconsistency in the interface for retrieveBook and storeBook?

Why does storeBook take a fileName, but retrieveBook takes a uri?

colonelpanic8 commented 7 years ago

@nevenz Don't we theoretically need to have rename work both ways? What happens when there is an upstream rename?

Ideally, I think there should actually be no rename at all...

nevenz commented 7 years ago

I'm not super familiar with android programming -- how do you create such templated xml files?

Check getStateSharedPreferences. It creates states.xml. dropboxToken is using it for example.

Would we have to make a new view to configure all of this for each git synced repository?

Yes, like DirectoryRepoFragment for example.

One of the sort of thorny issues is that these settings are not actually per book, but per repository, so they could be associated with multiple books.

Right, I was suggesting repo..xml. And we'd have gitAuthor(repoId, name).

It would be a generic storage for repo implementations which they could use.

@nevenz Another issue we need to deal with is the fact that there is an assumption of uniqueness of uris in the CurrentRooksClient. This will not apply to git repositories as my plan is to store relative (to the root of the git repository) uris in the rook.

I was planning to remove current_versioned_rooks. If you check where CurrentRooksClient is being used - it's in tests only. book_syncs is used though. Is that OK for this implementation?

Would you prefer that absolute paths were stored here? That just ends up not being very tidy with the existing interfaces.

uri in Rook (so in turn, VersionedRook) is absolute, so if we want to keep using the same interface, it will have to remain absolute I think.

@nevenz As far as storing the file/commit sha, I guess that perhaps the best thing to do is to store the commit sha, because the file sha can always be retrieved from that...

Well if we're keeping the common check for changes, we'll have to have file's revision. Also, revision is visible to the user currently, it can be useful for at least debugging.

@nevenz Can you explain why there is inconsistency in the interface for retrieveBook and storeBook?

Why does storeBook take a fileName, but retrieveBook takes a uri?

Yeah, it's messy. I remember changing that interface few times for various reasons.

I guess we should prefer relative names here, since we have repo's uri already.

@nevenz Don't we theoretically need to have rename work both ways? What happens when there is an upstream rename?

It's not detected currently. Renamed book would be imported, existing one in app would be stored, so you'd end up with two. It just hasn't been implemented yet. Same with deleting of remote book.

colonelpanic8 commented 7 years ago

Check getStateSharedPreferences. It creates states.xml. dropboxToken is using it for example.

@nevenz Yeah but dropboxToken is not per repo afaict, why have ssh keys/authors etc be per repo in that case?

colonelpanic8 commented 7 years ago

@nevenz We are going to need a way to dispatch repo creation on more than just uri scheme, since git repos can come in a bunch of different flavors.

How would you like to approach this?

nevenz commented 7 years ago

Yeah but dropboxToken is not per repo afaict, why have ssh keys/authors etc be per repo in that case?

It should haven been - Dropbox doesn't support more then one account easily, unless you're a Business user, so I didn't bother (I think). Not to mention is was the first repo I've implemented and was writing the app just for myself.

Do you think it'll be an issue, or you're just asking?

We are going to need a way to dispatch repo creation on more than just uri scheme, since git repos can come in a bunch of different flavors.

Hmm yeah. I guess type of the repo will have to be saved together with uri in repos table. Then (in worst case) queried in getFromUri. (And ideally got earlier together with uri.)

If you don't want to mess with that, just try to figure it out from the url, which should work in most cases -- check for .git prefix first, then do schema checks. And I'll see what can be done to avoid that guessing.

colonelpanic8 commented 7 years ago

@jgkamat @Endi1 @ebpa @a13 @pawel-zet @tbuschmann @jappeace @gheoan @arrcher @btimofeev @andrewsuzuki @realloc I'd love some help testing #173. Its still pretty raw atm, but it should mostly work.

nelljerram commented 7 years ago

I'm super-happy to have found that this discussion and work is happening here! Thanks to @IvanMalison for working on this, and to @nevenz and everyone else for contributing to the discussion.

I'm new to Android development, but will try to make my way round to being able to test #173 ... For now, I just wanted (1) to comment on:

Lets say that the state of the repo has been changed such that the sha that we have stored for some book is not the sha of that file in the git repository. If we don't store the original revision that we got the file state from, we basically have no way to continue, because we need to apply our update on top of a comitt which has the appropriate state for the file.

That sounds equivalent to orgzly having its own Git branch in the git repo on the phone. I'm inclined to doubt that it will be a good thing to reinvent this, or that there is a good reason to distrust the local repo state.

And (2) to provide the script that I'm currently using to do this kind of thing by hand, in case it is useful:

# -*- shell-script -*-

# Ensure that we're on the phone's branch, and get its SHA-1.
git checkout phone
local_phone=`git rev-parse HEAD`

# Commit any new changes on the phone.
git add -A .
git commit -m "Changes made on phone"
echo INFO: Committed phone changes to local Git

# Update tracking branches from the Git server.
git fetch --all

# Determine whether the phone branch on the server has been modified
# since it was last pushed from here.
remote_phone=`git rev-parse origin/phone`
if [ $local_phone != $remote_phone ]; then
    # Phone branch on the server has been modified.  We need to merge
    # that into our own phone branch.  This should always succeed,
    # because the phone branch should only be modified elsewhere (than
    # on the phone) when there has been a conflict, and in that case
    # the user should take care not to make any further changes on the
    # phone until the conflict has been resolved.  The conflict should
    # have been resolved by adding commits, on the phone branch, to
    # the last point that was pushed from this phone, and so this
    # merge now should be a simple fast forward.
    if git merge origin/phone; then
    echo INFO: Merged phone branch conflict resolution from Git server
    else
    echo ERROR: Could not merge phone branch conflict resolution from Git server
    git merge --abort
    exit 1
    fi
fi

# Push the current phone branch to the server.
git push
echo INFO: Synced phone branch to Git server

# Try to merge any changes that there may be in the master branch -
# which is where any other devices make their changes.
if git merge origin/master; then
    git push
    echo INFO: Merged non-phone changes into phone branch

    # That worked, so also merge the phone branch back into master.
    git branch -D master
    git checkout master
    if git merge phone; then
        git push
        echo INFO: Merged phone branch back into master
    else
        echo ERROR: could not merge back into master
        git merge --abort
    fi
    git checkout phone
    git branch -D master
else
    # The phone and master branches are in conflict.  This is a pretty
    # normal scenario when updates are being made on multiple devices.
    # We abort the merge and flag the situation, and the user should
    # resolve it by:
    # - probably switching to a more capable device (laptop), for
    #   convenience
    # - doing the same merge on that device (git checkout phone; git
    #   merge master)
    # - manually resolving the conflicts
    # - committing and pushing the merge with those resolutions
    # - coming back to this phone, and running this script again,
    #   before making any further changes on the phone.
    echo WARNING: Conflict between phone and master branches
    git merge --abort
fi

orgzly / orgzly-android

git synchroniztion #24