joyfullservice / msaccess-vcs-addin

Synchronize your Access Forms, Macros, Modules, Queries, Reports, and more with a version control system.
Other
203 stars 40 forks source link

Implement external diff viewer when comparing changes in conflict resolution #306

Closed joyfullservice closed 1 year ago

joyfullservice commented 2 years ago

I personally use WinMerge for most of my diff comparisons, but I plan to make the module extensible to add other diff tools that others might prefer in their environments.

hecon5 commented 2 years ago

Comment on use of registry: I'd personally prefer to avoid using the registry if at all possible, and keep all settings within either the global setting or the project settings. My environment has been clamping down rather hard on anything manipulating the registry as of late.

hecon5 commented 2 years ago

I've been thinking on the function of this and the intent.

Is the intent to allow the Addin to act as the difference tool, or to better facilitate git (or other VCS) to work better and a more proximal way like other IDEs with "big kid languages"? With git, the incentive is to handle all change control within it; overwriting a local file makes no difference to the repository, and can be undone with a click or command.

Allowing the Addin to bypass git and act as a difference engine reduces the incentive to use git (or other VCS), and is also something that must be managed by us (you), which is overhead we then wouldn't have to develop other features, bug fixing, optimization, etc. It will also cause headaches and false positives with #299 (case Changes thanks to VBA).

I think preferentially, one should export changes, deal with all Versioning outside of Access / Addin (via git), and then only import the changes, preferentially overwriting whatever's in Access.

I'm REALLY excited to not have to do a full build every time, it takes 10-20 minutes to build one of my databases that leverages a large number of linked tables, and that's a considerable time drain. But I'm concerned it will allow users to skip other good practices (like using git to manage development efforts) as a byproduct.

@A9G-Data-Droid mentioned something similar, in that we should use git/external tools to manage the code, but that the importing / building could be greatly accelerated by not importing unchanged bits (looking at linked tables for myself, especially).

We should lay out exactly how this new engine works so users know what to expect, and then know how to leverage it best into their workflow and encourage good code housekeeping methods.

My suggested workflow when Detect Changes is ON

Caveat: This will almost certainly always present false positive comparisons due to case changes.

  1. User has a git repo.
  2. User manages all code outside Access.
  3. Local Development folder code outside Access is the authoritative source (meaning code in the repo is considered what "should" exist).
  4. Merging in features from others into the repo via pull/merge/etc. will change that authoritative source and be the new correct.
  5. If a User then builds (or attempts to build) and the code in the repo isn't what's inside Access notify user if Option is set.
  6. If option is set (check before build), pause build, do an export to the repo (pausing build on screen).
  7. User then commits / changes / updates code in repo to be correct.
  8. Once user clicks "continue build" whatever is in the repo is considered correct, and the build continues like it did before, only building altered components.

Suggested Flow when Detect Changes is OFF

  1. User has a git repo.
  2. User manages all code outside Access.
  3. Local Development folder code outside Access is the authoritative source (meaning code in the repo is considered what "should" exist).
  4. Merging in features from others into the repo via pull/merge/etc. will change that authoritative source and be the new correct.
  5. User exports to repo like before (3.x), manages changes and commits relevant component changes in git outside Access.
  6. When User builds whatever is in the repo is considered correct, and the build continues like it did before, only building altered components.

One thing that will throw errors EVERY TIME is #299. If we don't handle this, the user will ALWAYS be presented with changes, and we don't want that. We DO want seamless. We do want it to be quick. And if we present user with a laundry list of changes every time to be managed, it'll slow down development considerably.

joyfullservice commented 2 years ago

@hecon5 - Thank you for taking the time to think through and flesh out some of these concepts! I really appreciate the thoughtful consideration that you and several others have put into this project to make sure we are not just solving the immediate crisis, but making strategic decisions that we are going to be thankful for years down the road.

I absolutely agree with you that this add-in does not need to replace the functionality of actual VCS systems like git. Those programs have their own dedicated development teams and already do a fantastic job managing a file-based development process with branches, commits, merges, etc...

Most software development projects are managed at the source level where all the changes are happening directly on the source files, and the application is built or compiled for testing and distribution. For these kinds of projects, version control systems and the typical array of developer tools does a fantastic job of managing the development lifecycle.

Microsoft Access is a bit of a different animal in that (in a typical scenario) the actual development is happening on the compiled application level, not the source file level. The developer is visually designing forms, adding VBA code, testing on the fly, etc... This is great from a rapid application development standpoint, and something I love about Access.

What I am attempting to do with this add-in is to help bridge that gap between the compiled application and the source files. In a sense, it is like having two sets of source files. The physical files in the source folder, and the "virtual" source files that represent the actual database objects in their present state. A big part of the performance of the add-in is that it can "know" which objects have changed and need to be exported to bring the source files up to date.

Let's start with the export. In recent versions, the add-in scans the database objects, comparing them with the entries in the index file to see what has changed since the last export. After building a list of changed items, it calls the .Export function of each modified object. In a normal development scenario, this works great.

Now, let's introduce some complexity. The user had recently switched from the main branch to a feature branch in git to review something in the source code, and forgot about that when they started the export. Not knowing any different, the add-in exported six changed files to the feature branch. To make matters worse, those six files actually depend on 10 other files that didn't export because they didn't have any changes from the last export (to the main branch). On top of that, the six files that did export just wiped out several uncommitted changes on the feature branch. Now the user has lost hours of work with no way to undo the damage. That's the situation I am trying to protect against.

Obviously, the user should have been paying attention and not exported into the wrong branch, but the end result is that they lost valuable work with no warning. When they discover the loss, they will probably not have warm and fuzzy feelings about the add-in that just deleted their work.

On the import/build/merge side, it would be even easier to make a mistake. Let's say you just finished building out the layout of the main form, then you get an email that your coworker just finished the widget class they were working on. You merge their changes into your branch on git, and click to merge the source into your database. For some reason the file dates were off, and the main form you were just working on gets replaced by the source file. Poof. All that work is gone with no undo button.

We could look at it and say, well, that's too bad. People just need to be more careful. As developers and admins we work with powerful tools and little mistakes can cause irreversible damage. I get that, but where possible, I would like to provide the safety net that helps avoid the painful mistakes, especially by new users that maybe didn't completely understand how the add-in was designed to work.

The approach I am currently pursuing with this (but welcome feedback and input if there is a better approach) is to leverage the index file to a greater extent in version 4. Looking at the index, the add-in can determine that the file in the source folder is materially different from the one that was last exported. Git isn't going to tell us that. It only knows whether the source file is different from the last commit. We could compare the git head commit hash from the last export, (as we were pursuing with the git integration) but I want to keep the flexibility of supporting other systems like SVN and Mercurial.

The index can also tell us whether the database object has changed since the last export or merge. Again, we aren't going to see that in git because we have not yet exported anything. In version 4, the add-in dynamically builds out the export object and creates a hash from it to see if that matches the hash from the last export/import. For many objects this can be done intelligently by reviewing modified dates or generating JSON content in memory without actually having to perform expensive export operations.

Having an index file that is neither a part of the source files, nor a part of the compiled database is what allows us the two-way change detection that is foundational to providing the user a way to identify and resolve conflicts before inadvertently losing work. It also gives us the information needed to make this as minimally disruptive as possible.

I totally agree that we don't need to slow down the development workflow by popping up extra confirmation dialogs that really aren't needed. This is just a bad idea in general as it trains people to blindly click through confirmations without reading them, and causing them to miss the ones that really are important. My plan is that once a user confirms to overwrite a file or an object, the index is updated and they are not prompted again till something changes.

This is still a work in progress, which is why it is on the dev branch, not in the general release. As I work through the actual scenarios of building out what conflicts actually require user intervention and human decisions, we should see less noise and more actual decisions that require human confirmation. I am happy with the idea of adding an option to turn off the conflict detection, but in my opinion, this should definitely be on by default.

To summarize, here is what I am working to achieve:

Export (Changed files only)

DB Object Source File Export Conflict?
unchanged unchanged no no
changed unchanged yes no
unchanged changed no no
changed changed yes yes
deleted unchanged yes (delete) no
deleted changed yes (delete) yes

Merge (Changed files only)

DB Object Source File Import Conflict?
unchanged unchanged no no
changed unchanged no no
unchanged changed yes no
changed changed yes yes
unchanged deleted yes (delete) no
changed deleted yes (delete) yes

As you can see the above tables, only four of the twelve scenarios triggers the conflicts dialog, and those are the types of conflicts that I would want to know about and make an intentional decision before continuing. This is where the Diff tool comes in. Okay, so the form is modified in both the source file and the database object. What is actually different? The diff link gives you a one-click option to compare the two items so you know what you are about to overwrite.

In the normal development scenario, you really wouldn't see the conflict dialog at all, since the last export should match the index. If you export changes, then switch branches, then merge from the new branch, you shouldn't see the dialog since there are no unsaved (unexported) changes present in the database. In a multi-user scenario, you would see the dialog if you tried to merge changes into the database before exporting your changes on those same objects. Or, if you tried to export on top of the other user's changes before merging them into your database. In both of those cases, I would want to know about that and make an intentional decision about what I wanted to do. (I.e. Overwrite their changes, or cancel the export and merge their changes first.)

This was a very long explanation here, but hopefully it provides a little insight into why I am developing the conflict review functionality to compare between the source files and database objects, and how I envision it working. 😄

hecon5 commented 2 years ago

It does! I have also experienced that "oh bugger, that was not what I wanted to do" as the database is wiped out.

Thankfully, I haven't had that happen too many times, but yes, that is one of the moments you need to step away and get a cup of coffee before you rage quit for the day. Practice, and a self-imposed moratorium of doing work outside the code base except for a specially designated "merge worktree" has reduced that incidence rate, thankfully.

I do think if this is going to be successful, we'll need to figure out how to parse case changes properly, or (at least in my case), false positive change detection will nearly always show collisions. I have seen a couple of ideas, but haven't tackled them in our case yet (so I can't write up a good "how to accomplish this in practice" vice the "white paper proof of concept" writeup I linked in #299. But perhaps someone else has, and that may help everything.

A9G-Data-Droid commented 2 years ago

There are clearly two workflows here.

  1. On Export the goal is generally to make a new commit using the code from the DB project. I don't think I need any checking at this point.
  2. On a build operation you would want to merge the work in the project with the external changes from other developers. This is where things get complicated. The workflow should be to export your code from the DB, commit, perform a pull of any other branches and resolve merge conflicts outside of Access. Then build based on the new code files.

Any situation where there are conflicts in the same branch would come from people working on the same branch. This is what you want to avoid. Each developer should be working in their own branch. The maintainer should be merging all those branches in to a dev or next branch where the next version lives. This is a discipline that needs to be enforced by the lead developer or maintainer. If a conflicting pull request comes in, it needs to be broken out in to a separate branch to allow for the merge to be worked outside of Access. Alternatively, the maintainer could also move their changes in to a new branch, accept the pull and then merge their changes back in. This avoids a situation where you are exporting over changed files.

A9G-Data-Droid commented 2 years ago

Either way you slice it, you should never be working on master\main. Each developer should be working on their own branch and then the final product is build from the merged master. Nobody has a gold copy. Your Access file is just your IDE. This isn't a problem you have to solve with this plugin. It can be solved with a FAQ or Wiki on how to use git with Access.

joyfullservice commented 1 year ago

This has been completed in the latest dev versions.

image