Closed Hugovdberg closed 7 years ago
I think our workflow is based around git flow in practice, even if we don't use the tools. Each feature is developed in a separate branch (usually on the developer's own repo), and pull requests are the equivalent tool for publishing back to the `master` branch. Each release is tagged in a similar way to git flow.
I think the GitHub PR system is pretty powerful for commenting, reviewing, continuous testing, etc. I wouldn't want to move away from that. There are quite a few casual `ProjectTemplate` developers, and you don't want to dissuade them by requiring another toolset to be downloaded before contributions can be made.
There is a good question of whether there should be a `master` and `develop` branch on this repo. However, if you view CRAN as being the main release version, then this `master` is effectively the development branch (or bleeding edge). I personally don't think there's enough development activity to warrant two branches in this repo.
I disagree that our `master` is treated like the `develop` branch in `git-flow`. Yes, we do use branches off `master` which are merged in again later, but then again we are trying to keep `master` as clean as `master` in `git-flow` by squashing commits (even after they have been published). If we do consider this `master` branch a development branch, there should be no need to squash commits.
Also, `git-flow` doesn't replace the normal git tools, so as long as the casual developers branch off `develop` instead of `master` it will work just fine. As I said, the `git-flow` tools are essentially a bunch of scripts that perform normal git commands we already use, albeit on a slightly more advanced branching convention. There is no need to move away from GitHub PRs; they can still be used, and I really think they are a great tool as well. The major difference would be that all PRs would be made against the `develop` branch instead of the `master` branch.
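As a rough sketch of the "normal git commands" point above, the feature workflow can be written out by hand in a throwaway repository. The branch name `feature/my-feature`, the file names, and the placeholder identity are made up for illustration, and the exact commands a given `git-flow` implementation runs may differ in detail (for example in when it fast-forwards).

```shell
#!/bin/sh
set -e

# Throwaway repository so the sketch does not touch a real checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from this thread
git config user.name "Example Dev"        # placeholder identity (assumption)
git config user.email "dev@example.com"

echo "v1" > README
git add README
git commit -q -m "Initial release"

# The default layout after `git flow init`: a develop branch next to master.
git branch develop master

# Roughly what `git flow feature start my-feature` does:
git checkout -q -b feature/my-feature develop

echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add my feature"

# Roughly what `git flow feature finish my-feature` does:
git checkout -q develop
git merge -q --no-ff -m "Merge feature/my-feature into develop" feature/my-feature
git branch -q -d feature/my-feature

# develop now holds the feature commit plus a merge commit; master is untouched.
git log --oneline --graph develop
```

A contributor who never installs `git-flow` can follow the same steps by hand; only the base branch of the PR changes.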
It's not that I think we should use a completely different tool set, just enable some tools that make life a little easier. And as you said, informally we already use `git-flow`, but by actually initialising it in the repository we unleash the full potential without adding any burden to the maintainers (except checking that people open PRs against the correct branch).
Oh, I see what you mean. There is no development history on `master`. In fact I was concerned about that too for my own changes, so I created a branch called `archive` in my own repo fork and I push all my development commits to that so I have my own history. Of course I don't get everyone else's.
I also delete all other branches in my own repo so I can see what's going on. So, apart from `master` and `archive`, all the other branches are stuff I'm working on. And when they are merged, I delete those after merging into `archive`.
So yes, in that case it would be nicer to have a full history branch.
I'd still be concerned that people would submit PRs off `master` rather than `develop` and there wouldn't be an easy way to fix that apart from asking them to redo their work over in a new branch.
It's not too hard: they can use `git rebase` to move their changes onto another branch. Any conflicts that arise because of that are theirs to fix, and `git rebase` is really powerful in helping you along. And if we don't know how to do it, those magicians over at StackOverflow.com can sure help out.
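For anyone wondering what that rebase would look like, here is a minimal sketch in a scratch repository. It assumes a contributor branched a hypothetical `my-fix` off `master` while `develop` had already moved ahead; the file names and identity are placeholders too.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"        # placeholder identity (assumption)
git config user.email "dev@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
git branch develop

# develop moves ahead of master.
git checkout -q develop
echo "dev" > dev.txt
git add dev.txt
git commit -q -m "Work only on develop"

# The contributor branches off master by mistake.
git checkout -q -b my-fix master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix a bug"

# Replay the commits in master..my-fix on top of develop instead.
git rebase -q --onto develop master my-fix

git log --oneline my-fix
```

If the branch was already pushed for a PR, it would need a force push afterwards, and any conflicts stop the rebase until the contributor resolves them.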
Yes, I'm all for casual developers joining in (I'm not sure I'm much better than that ;-)), but I think we can ask some effort from them to make everyone else's life a little better (if they want to just share some snippet of code please use a Gist ;-)).
A good practice for Open Source Software is to have bleeding edge work in master with tags indicating stable releases. This prevents features from accidentally being left out. If you look at the branch history we tried a development branch some time ago. These were the lessons we learned:
Offering two branches is confusing. Many people expect `master` to have the most up-to-date commits. When the most up-to-date commits are in `develop`, it is not obvious which is the bleeding edge. Is `master` at the bleeding edge today or behind `develop`? To avoid confusion, every time a merge is made into `develop` it must be mirrored in `master`. Even with git flow this is error-prone. A single branch reduces confusion.
Features get lost. Manually rebasing a full development history from `develop` into `master` is error-prone. Often this is done after the feature has been added to `develop`. Rebasing after the fact sometimes leads to a whole feature being rebased with another. Or worse, the rebasing accidentally overwrites a previous feature. Squashing when the feature is merged is much less error-prone.
GitHub is a pre-CRAN release platform. The primary purpose of the GitHub repo is to facilitate distributed development and aggregate many features together for a submission to CRAN. CRAN submission is very time-consuming and requires that every new feature be documented in a change log. The git history is not for development but to create the required change log prior to CRAN submission. With every feature squashed in `master` it is fairly straightforward to copy the git history into the changelog. When full development history is stored in the git history this process becomes very difficult.
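To illustrate the squash-then-changelog idea in a scratch repository (a sketch only, not necessarily how the maintainers actually do it): the tag `v0.1`, the branch `feature/cache`, and the commit messages are all invented for the example.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"        # placeholder identity (assumption)
git config user.email "dev@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial release"
git tag v0.1                              # hypothetical last release tag

# A feature branch with a messy development history.
git checkout -q -b feature/cache
echo "a" > cache.txt
git add cache.txt
git commit -q -m "WIP: start cache"
echo "b" >> cache.txt
git commit -q -a -m "WIP: fix typo"

# Squash the whole feature into a single commit on master.
git checkout -q master
git merge -q --squash feature/cache
git commit -q -m "Add caching of loaded data"

# One commit per feature since the last tag becomes one changelog line each.
changelog=$(git log --reverse --pretty='format:* %s' v0.1..master)
printf '%s\n' "$changelog"
```

The WIP commits stay on the feature branch (and in the contributor's local repository), while `master` gains exactly one entry per feature.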
Thanks for the suggestion -- I'm always eager to consider new ideas to improve the process. This falls under the category of "tried it and didn't work as well as we hoped"! If keeping a history of all commits is important to you there is always a full history in your local git repository!
Some time ago I came across this blog post that lays out a branching model that I think might be advantageous to us. By splitting into a stable `master` and an unstable `develop` branch, the `master` branch is kept clean while all information on previous branches is retained, which removes the need to squash commits. There are `git-flow` implementations for all operating systems, and on Mac OS and Windows there are also visual tools for those who prefer that over the command line. Introducing `git-flow` into this repository is straightforward and only requires someone with write permissions on this repo to run `git flow init` and push the `master` and `develop` branches. Also, when people decide to install the GitHub version instead of the CRAN version, a stable version is installed by default, even if it is ahead of CRAN. Working with `git-flow` is perhaps even easier than with plain `git`, as the commands are all simply shortcuts to combinations of three or four commands. A cheatsheet on `git flow` is available as well: http://danielkummer.github.io/git-flow-cheatsheet/index.html
What do you think about adopting this structure?