openwebwork / webwork2

Course management front end for WeBWorK
http://webwork.maa.org/wiki/Main_Page
Other
145 stars 165 forks source link

Keeping release/2.9 and develop up to date with master? #304

Closed mgage closed 10 years ago

mgage commented 10 years ago

Notice that develop and release/2.9 are somewhat behind master. It's possible that discrepancy is only non-essential merge commits. How do we handle that?

https://github.com/openwebwork/webwork2/branches

dpvc commented 10 years ago

This is one of those times when hot fixes to master are also supposed to go to other branches. They should go to develop and to release/2.9 in addition to master. If we don't do that, this kind of discrepancy is the result.

It looks like this is the set of changes to master that are not in release/2.9. There are some significant changes.

The changes to master that aren't in develop are here. This is less far behind because Geoff's hotfix branch from master on January 8th was merged into his bigfix branch on the 21st, which was just now merged into develop as pull request #297. So that means develop is up-to-date with masteras of the 8th (whereasrelease/2.9is only up-to-date withmaster` as of December 19th).

I'm not sure what the best solution is for this. One could rebase the changes from master to the other branches, but it might just be easier to merge master back into the other two this once, and try not to get into this situation again in the future. I suppose the "correct" solution is to rebase the changes to a new hotfix branch, then remove them from master, and then merge them into master, develop, and release/2.9. But it is tricky, especially now that there are other branchings. It would need to be handled carefully. I've recently had to do something like this for MathJax, but not on this scale. I could give it a try if you want me to.

mgage commented 10 years ago

rebasing hides the history -- so while the result is prettier it's also hiding information. Just merging allows the information to remain so you can untangle it if you absolutely need to. Opinion seems to be pretty divided on what is the correct approach -- at least the last time I checked on this.

Part of the problem with submitting the hotfixes to both master and develop is this: I pull master and develop a hot fix for that. Then I issue a pull request on my fix to master. After that I issue a pull request on this same software repo to develop. That has the effect of pulling any changes that are in master into develop. How is this different from merging master into develop (which current wisdom says is to be avoided)?

One could ask the bug fixer to pull a fresh copy of develop, fix the bug there, and then submit but that is (1) asking a lot and (2) allows them to fix the bug two slightly different ways which will cause puzzling merge conflicts.

We need a more detailed procedure of how to handle this on github.com

I've used cherry picking on my own machine to get this done -- but that won't work on github and is not particularly transparent in any case so I try not to do it too often.

dpvc commented 10 years ago

You are correct that a hotfix branched from master and merged to both master and develop effectively merges master into develop, but I think the reason for the "no merging of master into develop" edict is to prevent changes to master that are not also included in develop; that is to avoid the problem we are currently having. As long as there are never changes to master that aren't in develop, the side-effect of merging master into develop is not a problem. I guess it is partly a mater of aesthetics; having a hotfix from a branch of master back to master and to develop makes the purpose and effect of the hotfix branch clearer.

--- A --- C  master
     \   /
       B     hotfix
        \
--------- D  develop

rather than

--- A --- B      master
           \
------------ D   develop

I think the first looks more like B is a fix for A, and also it needs to be incorporated into the develop branch than the second does.

dpvc commented 10 years ago

rebasing hides the history

Can you be more explicit about what you mean? My experience with rebasing is that the commit messages, userids and so on are retained. What part of the history is it that you are seeing being lost? Just curious.

mgage commented 10 years ago

My understanding is that rebasing is often used in the following way. I create the branch feature1 from branch1. I work on feature1. Other people have made changes to branch1. I can merge branch1 into my ongoing feature branch1 periodically and eventually merge it back into branch1. If I've understood it correctly my feature1 loop will then have my commits on my project interspersed with updates coming from other modifications to branch1. I think it's this interleaving of relevant commits and irrelevant commits that rebase method is trying to avoid.

An alternative is not to pay any attention to the ongoing changes in branch1 until my feature is complete -- but for longer projects I think this is not a good idea.

Another alternative is to periodically rebase feature1 off of branch1 (instead of mergin) -- this creates a false history in that it looks like my feature branch started at a time much later than the actual date. That's what I mean by changing history. You might be able to figure true history out by looking at the time stamps of the actual commits. I'm not sure. If I rebase just before merging my feature1 into branch1 then it looks like I branched, created a feature and merged back in a short period of time. This is quite clean in that you can look at the loop and see exactly what commits are relevant to that feature -- and yet I haven't been developing in a vacuum -- I'm aware of potential conflicts with ongoing development as I work rather than at the very end. This seems to me very nice story and a good reason to use rebase on your own laptop while developing. Apparently there are also downsides so rebase needs to be used carefully. In particular if anyone has a connection to your feature1 branch it will be completely messed up when you rebase. Not sure I understand all the details or ramifications to this.

dpvc commented 10 years ago

OK, I understand your point now. The "pro" is that the commits are all together in the log, but the "con" is that this doesn't reflect the time sequence of the actual development.

I think that if we do the branching and merging properly, we can get the best of both: all the commits for a particular feature are on the feature branch, so are "separate" in that sense, and appear in the network diagram along the line of the branch. If branches are merged either via pull requests in GitHub or with the git merge --no-ff <branch> command, then those lines of commits should be maintained as separate lines that indicate the different branches. Granted, that's not currently the case, but I think it can be for the future.

For example, here is a section of MathJax's network diagram:

mj-git copy

These actually represent bug fixes more than features, but you can see that each "loop" represents a collection of related commits. And even when changes on different branches are interspersed in time, they are still grouped by the branch. That is, you can see that the seven commits on the green branch are all related, even though other changes are merged in between (and would be interleaved with them in the log). Also, the node at the tip of the green arrow (as it is merged into the blue branch) should list the commit history for the green branch only (and would not include the orange or red commit message). So it seems that there are a number of ways to get the log along a particular branch without needing to use rebase.

So my recommendation would be not to rebase, but rather to use good branching and merging (without fast-forward).

mgage commented 10 years ago

That works fine for short loops, but there is another use case for long features -- such as Peter's.
These are rarer so perhaps we can just set different rules for them.

First question. Is this is the kind of workflow that you envision? I've seen it recommended as best practices. Would you recommend not pulling in changes from develop until your feature is done? This does not seem to be a good idea to me. Is there some third alternative?

Second question. Is the loop for this feature going to be confusing because it contains the merge commit g (that pulls in all of the file changes from B, C, D -- does it pull in the commits as well?) Will the commits to B, C, D appear as if they are part of the feature you are building? will they appear commit f and commit h? I think the network on github will probably be clear but will the logs be confusing?

We should think now about how we are going to merge Peter's ui features and Geoff's mathpet branch. Those branch points were some time ago.

dpvc commented 10 years ago

Is this is the kind of workflow that you envision?

Absolutely. There is no problem pulling develop into your feature branch as often as you like. It may also be necessary to do that at the end in order to make a clean pull request.

Is the loop for this feature going to be confusing because it contains the merge commit g

No, the loop will remain separate (you may need to use --no-ff in some cases when there has been no activity on one branch). Here is an example from MathJax:

mj-merge

Note that the yellow loop makes several pulls from the blue branch before eventually being merged back into it. There are eight commits along the yellow (including the two pulls), and there are intervening commits (via other pulls) to the blue branch. The commits for the yellow branch are not intermingled with the others.

(that pulls in all of the file changes from B, C, D -- does it pull in the commits as well?)

Yes, the yellow branch does include all the commits along the blue branch after a pull from it, so it would include B, C, and D, provided other changes in the yellow branch don't supersede them.

Will the commits to B, C, D appear as if they are part of the feature you are building? will they appear [between] commit f and commit h?

They certainly are part of the feature branch once merged in. But the commits are on a different branch, and so can be distinguished (provided --no-ff is used to prevent the loss of loops). I haven't done a lot of cross branch intermingling, like has been done in the past in WW and PG, but my understanding of --no-ff is that it will keep the branches separate in this way, and that has been my experience so far.

One question is where are you thinking about these "appearing"? There are several places you can look at commit messages, and the answer depends on which one you mean. For example, they will be visually distinct in the network diagram, as you can see above. In the history for the yellow branch, however, the commits will be intermingled. In the pull request for the yellow branch into the blue, only the eight yellow nodes will be listed (and none of the blue).

For example, here is the pull request where the yellow is pulled into the blue:

mj-merge-2

You can see it live here. Note that only the eight yellow nodes are listed, so this gives a separate listing of the changes due to the yellow branch.

You can also get a similar listing using GitHub's compare feature (even without a pull request). Here is one for the yellow branch shown above.

You can get this same list from the command line using git log <yellow-branch>...<blue-branch>. From the MathJax repository, you could do

git log 28afef...4400d4

to get the listing.

If you just do git log <yellow-branch>, you will get all the commits, even those from the blue branch that were merged in (so this would include the orange, green and pink changes). This corresponds to the GitHub history feature. Here is the corresponding history for the yellow branch, and you can see that it includes the commits from the other branches of the network.

So the nice thing is you can have it either way: you can get all the commits involved in a branch, or just the ones along a loop.

Now, I should point out that this is for when the feature is branched from develop and eventually merged back into develop, possibly with pulls form develop along the way. I'm not sure of what happens to the logs if there are pulls to the feature branch from other feature branches. If the other branch isn't merged with develop when the main feature is, that will bring part of the second feature into develop, and those commits will show up in the pull request and the comparison listings, I'm sure.

But I expect that it is not common to need to merge one unfinished feature into another unfinished one before they are merged into develop. I would expect most pulling into a feature branch to be from develop (to get the new code that have been added since the feature wash branched, not to get other unfinished code).

So I think that the branching and merging is enough to maintain the different kinds of histories that we are interested in without needing to rebase, provided we are careful about branching from the right places and to the right places.

mgage commented 10 years ago

On Jan 28, 2014, at 8:58 AM, Davide P. Cervone notifications@github.com wrote:

So the nice thing is you can have it either way: you can get all the commits involved in a branch, or just the ones along a loop.

Thanks for the detailed explanation. I think some version of what you posted should be included as an appendix to to our "github" rules. It will be very helpful as we try to get more people into the development and maintainer process for WW.

Now, I should point out that this is for when the feature is branched from develop and eventually merged back into develop, possibly with pulls form develop along the way. I'm not sure of what happens to the logs if there are pulls to the feature branch from other feature branches. If the other branch isn't merged with develop when the main feature is, that will bring part of the second feature into develop, and those commits will show up in the pull request and the comparison listings, I'm sure.

But I expect that it is not common to need to merge one unfinished feature into another unfinished one before they are merged into develop. I would expect most pulling into a feature branch to be from develop (to get the new code that have been added since the feature wash branched, not to get other unfinished code).

I think you are right about that. We'll concentrate on features built by branching off of develop. The exceptions will be rare.

Take care,

Mike

dpvc commented 10 years ago

One other useful command:

git log --graph --decorate --oneline

This produces an ASCII-art diagram of the branching structure (for the current branch). It differs from the GitHub network diagram in that it is in "topological order" rather than in date order. That can sometimes back the branching and commit structure clearer. (You can add --date-order to get something more like the GitHub network.)

Here is the (top of the) result for the release/2.8.1 branch:

*   e6de2b1 (HEAD, origin/release/2.8.1, release/2.8.1) Merge pull request #91 from mgage/develop
|\  
| * bbc7531 Remove compoundProblem4.pl and replace with compountProblem5.pl
| *   1086581 Merge branch 'develop' of https://github.com/mgage/pg into develop
| |\  
* | \   1cb8b26 Merge pull request #89 from dpvc/patch-MathObjects
|\ \ \  
| * | | fdd2f90 Make Formula and Compute return ImplicitPlane objects when appropriate.  This allows ImplicitPlanes to be 
| * | | fcac75d Use pg->debug_message for diagnostic output
| * | | f35668e Fix complex output so that parentheses are used when the complex number is being subtracted.
* | | |   24cf693 Merge pull request #90 from dpvc/patch-PGML
|\ \ \ \  
| |/ / /  
|/| | |   
| * | | 2ca5512 Fix problem with answer rules in answer arrays.  Fix several variable name warnings.
|/ / /  
* | |   b861740 Merge pull request #85 from openwebwork/develop
|\ \ \  
| |/ /  
| | /   
| |/    
|/|     
* |   1312506 Merge branch 'master' into develop
|\ \  
| * | af91595 Fix typo in PG/Translator.pm replacing END_PGML_SOLUTION  with END_PGML_HINT
| * |   32505b1 (origin/release/2.8.1a, origin/master, origin/HEAD, master) Merge pull request #83 from jasongrout/patch-1
| |\ \  
| | * | ae18b2b Make inline matrices use the CSS style for making inline tables
| |/ /  
| * | f98a925 (tag: PG-2.8) update version number(s)
| * |   5de9e97 Merge pull request #40 from openwebwork/release/2.8
| |\ \  
* | | | 7ab0798 Fix typo in PG/Translator.pm replacing END_PGML_SOLUTION  with END_PGML_HINT
* | | |   1f96861 Merge https://github.com/mgage/pg into develop_mg
|\ \ \ \  
| * \ \ \   9006d58 Merge pull request #2 from openwebwork/master
| |\ \ \ \  
| | |/ / /  
* | | | | 5fd302c Removing eval {} commands which are caught on hosted2 (but not on my laptop) ??
* | | | |   ab71bbc Merge branch 'develop' of https://github.com/mgage/pg into develop_mg
|\ \ \ \ \  
| * | | | | fd3c487 Make displaying top labels work with MathJax and with printing matrices

Comparing this to the GitHub network can sometimes be useful.

goehle commented 10 years ago

I believe we have a reasonable plan to deal with this now. We can reopen the issue if we find that we can't keep up.