mozart / mozart2

Mozart Programming System v2
http://mozart.github.io/
BSD 2-Clause "Simplified" License

Merge back all the submodules in a single repo? #20

Closed sjrd closed 11 years ago

sjrd commented 11 years ago

I've been wondering if having all those submodules is actually a good idea... It certainly hampers contributing fixes or improvements that are spread over several submodules (which happens quite often, actually - just consider the link between Oz functors and the C++ builtins).

Do you think it would be a wise move to merge back all the repositories, or a subset of them, into a single mozart2.git?

I would be particularly keen on merging vm, lib/main, lib/compiler, and boosthost. Then maybe fork back boostenv+boosthost in one other repo. Or, similarly (but maybe with a better history), fork out boostenv from vm and merge it with boosthost; and then merge vm, lib/main and lib/compiler.

What do you think?

sjmackenzie commented 11 years ago

i understand your need for it.

there is, i believe, an untapped feature in this distributed layout: embedding oz in small applications, in text editors à la emacs, in embedded devices etc., facilitating 'the internet of things' ... pause

for example a programming language could include the vm and immediately get access to dataflow concurrency, or at least create a library that abstracts it out, they have no need for stdlib in many cases.

making oz as mixable as possible is probably a better way forward. a project i'm working on indeed requires that the vm is separate.

although - i would love the fork out of boostenv from vm and its merge with boosthost, but NOT the merge of vm, lib/main and lib/compiler. i don't mean to rain on the parade, but i have a few reservations about stdlib. i believe it can be better and evolve more (reading input from a keyboard comes to mind). take a look at python... well, not all of it! but there are some pretty succinct APIs there. by merging, how much do we lose? changing stdlib means we break backward compatibility. but what about the guys who do want to break it?

i certainly don't want a D programming language debacle with two stdlibs, but keeping them separate at least allows platforms to experiment.

also the compiler is being rewritten, it might be easier to just keep them separate

i vote let's sit on it a while (except the fork out of boostenv!)

sjrd commented 11 years ago

OK I think I get your point. Anyway I'll wait for input from other people on this subject. I'll fork out boostenv after we get alpha1 released.

sjrd commented 11 years ago

I happened on this blog post today: Why your company shouldn’t use Git submodules

sjmackenzie commented 11 years ago

Excellent find. This has pushed me over the edge. For the next release shall we merge repos?

sjmackenzie commented 11 years ago

I would say this issue is probably the most annoying one. I'm looking forward to when we can merge...

sjrd commented 11 years ago

OK I'll give it a shot this weekend. I have a couple of merging schemes in mind. I will prepare some, and then ask which you think is best.

yangsx commented 11 years ago

This tutorial might be useful to you: http://blogs.atlassian.com/2013/03/git-submodules-workflows-tips/

sjmackenzie commented 11 years ago

thanks mate!

sjrd commented 11 years ago

Well, I figured I would never have the courage to perform the very elaborate merge strategies I had come up with to keep and reconstruct a clean history. So I settled for the fast way of doing it, figuring that progress is more important than an impeccable history.

The result is there: https://github.com/sjrd/mozart2/tree/merged

I have not merged stdlib/, because this has been a separate module of Mozart since its CVS times!

What I did: I first removed the submodules, then made subtree merges, as is explained here (not including Pulling changes). The main drawback of this strategy is that, although commits from the submodules do appear in the new global history, history of individual files is "cut" at the merge point. I.e., a git blame won't show the right thing, git log stops at the merge commit, etc. You can still look up manually the history before the merge by referencing the last commit of the "old" history explicitly. It also screws up any statistics about contributors, as I now appear as the author of pretty much everything :-s
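A minimal sketch of what one such subtree merge might look like for a single former submodule. It only builds the git command lines and does not run them. The remote name, URL, prefix and commit message are placeholders, not the values actually used for mozart2, and the `--allow-unrelated-histories` flag is an assumption for newer Git versions (2.9 and later), which refuse to merge unrelated histories without it.

```python
import shlex


def subtree_merge_commands(name, url, prefix, branch="master",
                           allow_unrelated=True):
    """Return the git command lines for one subtree merge (nothing is run).

    name, url and prefix are hypothetical placeholders chosen by the caller.
    """
    ref = f"{name}/{branch}"
    merge = ["git", "merge", "-s", "ours", "--no-commit"]
    if allow_unrelated:
        # Needed on Git >= 2.9; older versions do not know this flag.
        merge.append("--allow-unrelated-histories")
    merge.append(ref)
    return [
        # Fetch the old submodule repository as an extra remote.
        ["git", "remote", "add", "-f", name, url],
        # Record the merge without touching the working tree yet.
        merge,
        # Read the other tree into the index under prefix/ and check it out.
        ["git", "read-tree", f"--prefix={prefix}/", "-u", ref],
        # Conclude the merge.
        ["git", "commit", "-m", f"Subtree merge of {name} into {prefix}/"],
    ]


if __name__ == "__main__":
    url = "https://example.invalid/mozart2-vm.git"  # placeholder URL
    for argv in subtree_merge_commands("vm", url, "vm"):
        print(shlex.join(argv))
```

Because `read-tree --prefix` places the files at new paths in a single commit, path-limited commands such as `git log -- vm/somefile` have nothing to follow before that commit, which is the "cut" described above.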

The main alternative is to instead: in each repo, filter-branch it to move all files to their appropriate subdirectory; then merge. I'll try that next time. This would break commit hashes, but it would probably yield a much more useful resulting history. If anyone wants to give it a shot, feel free to do it. I won't try it today, and when I do, I'll post another comment before I begin.

Edit: just in case it is not clear from the above: Do NOT bring this into master! This was the WRONG way! The good way is the filter-branch alternative, which is yet to come.

sjmackenzie commented 11 years ago

Forgive me, I haven't slept nor am able to. But I have to test this.

For some reason boost_program_options isn't playing well.


Unable to find the requested Boost libraries.
Boost version: 1.53.0
Boost include path: /usr/include
The following Boost libraries could not be found:
      boost_program_options
Some (but not all) of the required Boost libraries were found. You may
need to install these additional Boost libraries. Alternatively, set
BOOST_LIBRARYDIR to the directory containing Boost libraries or BOOST_ROOT
to the location of Boost.
Call Stack (most recent call first):

Just to be sure, you only did a merge, no code changes?

sjrd commented 11 years ago

There is zero line change. It is really just a merge. Maybe you have not included all the options you used to include with cmake?

sjrd commented 11 years ago

OK, here is another, much better, attempt: https://github.com/sjrd/mozart2/tree/filtered-merged

This one was made by filter-branch'ing all the submodules into their respective subdirectories, and then merging everything into one branch using an octopus merge. This is much better than the previous attempt, as it keeps all the histories with the appropriate author information, etc.
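To illustrate why this approach keeps author information but changes commit hashes, here is a toy model in plain Python, not real git. The `Commit` class, the helper names and the sample authors and files are all invented for illustration. Each commit id is a hash over its author, message, tree and parents, loosely mimicking how git ids depend on content.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    author: str
    message: str
    files: tuple          # sorted (path, content) pairs: the full tree snapshot
    parents: tuple = ()   # ids of the parent commits

    @property
    def id(self):
        # The id covers the tree and the parents, so changing either changes it.
        payload = repr((self.author, self.message, self.files, self.parents))
        return hashlib.sha1(payload.encode()).hexdigest()[:10]


def make_history(entries):
    """Build a linear history from (author, message, {path: content}) entries."""
    history, parent = [], ()
    for author, message, files in entries:
        commit = Commit(author, message, tuple(sorted(files.items())), parent)
        history.append(commit)
        parent = (commit.id,)
    return history


def move_into_subdir(history, subdir):
    """Rewrite every commit so that all its paths live under subdir/."""
    rewritten, parent = [], ()
    for old in history:
        files = tuple((f"{subdir}/{path}", content) for path, content in old.files)
        new = Commit(old.author, old.message, files, parent)
        rewritten.append(new)
        parent = (new.id,)
    return rewritten


def octopus_merge(tips, author, message):
    """One merge commit with every tip as a parent and the union of their trees."""
    files = tuple(sorted(pair for tip in tips for pair in tip.files))
    return Commit(author, message, files, tuple(tip.id for tip in tips))


# Two made-up submodule histories.
vm = make_history([("alice", "add core", {"core.cc": "v1"}),
                   ("bob", "fix core", {"core.cc": "v2"})])
lib = make_history([("carol", "add Base", {"Base.oz": "v1"})])

vm_moved = move_into_subdir(vm, "vm")
lib_moved = move_into_subdir(lib, "lib/main")
merged = octopus_merge([vm_moved[-1], lib_moved[-1]], "dave", "Merge submodules")

print([c.author for c in vm_moved])       # authors survive the rewrite
print(vm[-1].id != vm_moved[-1].id)       # ids do not
print([path for path, _ in merged.files])
```

Since the subdirectories do not overlap, the octopus merge in this model has no conflicts to resolve: its tree is simply the union of the rewritten trees.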

The only thing that is lost here is the grouping of related changes in different repositories that commits such as this one brought. This too can be improved, but it would probably be hours' worth of work: instead of making one big octopus merge at the tip of our current history, we should do one octopus merge for each such commit (I mean each commit in mozart2.git in which at least one submodule ref is added or updated). Then the rest of the history of the other branches should be rebased off that octopus merge.

E.g., if commit A in mozart2.git updates submodules vm/ and lib/main/ with new commits B and C, and the next (child) commits of B and C are X and Y, respectively, then: A' would be an octopus merge of B and C on top of A's parent (plus any additional change made by A itself), and X and Y would be rebased off A'.
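The example above can be sketched as a small graph transformation. This is a toy model under simplifying assumptions: every history is linear, commits are plain labels, and rebased commits keep their label even though a real rebase would give them new hashes. The function name `regroup` and the label P for A's parent are made up for the illustration.

```python
def regroup(super_history, sub_histories):
    """Return {commit: [parents]} after regrouping.

    super_history: list of (name, {submodule: pinned_commit}), oldest first.
    sub_histories: {submodule: [commit, ...]}, each a linear chain, oldest first.
    """
    parents = {}

    # Start from the submodule chains exactly as they are.
    for chain in sub_histories.values():
        previous = None
        for commit in chain:
            parents[commit] = [previous] if previous else []
            previous = commit

    previous_super = None
    for name, refs in super_history:
        merged = name + "'"
        # A' sits on top of the rewritten parent of A and merges every
        # submodule commit that A pinned (an octopus merge if several).
        parents[merged] = ([previous_super] if previous_super else []) \
            + list(refs.values())
        # The child of each pinned commit is rebased off A'.
        for submodule, pinned in refs.items():
            chain = sub_histories[submodule]
            position = chain.index(pinned)
            if position + 1 < len(chain):
                parents[chain[position + 1]] = [merged]
        previous_super = merged
    return parents


# The situation from the example: A (child of P) pins B in vm/ and C in lib/main/.
super_history = [("P", {}), ("A", {"vm": "B", "lib/main": "C"})]
sub_histories = {"vm": ["B", "X"], "lib/main": ["C", "Y"]}

graph = regroup(super_history, sub_histories)
print(graph["A'"])
print(graph["X"], graph["Y"])
```

In this model A' ends up with P', B and C as parents, and both X and Y have A' as their only parent, which is the shape described in the example.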

Although this is theoretically very appealing ^^, I am not sure it is worth the effort. Anyways, that rewriting could be done later, if we accept the rewriting I am suggesting now.

WDYT? Shall we make https://github.com/sjrd/mozart2/tree/filtered-merged the new mozart2 master? (Leaving for another day a possible refinement of the rewriting with intermediate octopus merges.)

Review by @sjmackenzie @kennytm @ggutierrez please.

sjmackenzie commented 11 years ago

Sébastien, this solution looks much better!

I, for one, am more than accepting towards it.

sjrd commented 11 years ago

Done!

The master branch of mozart2 now contains all the merged things. I have also put a redirection message on the GitHub repositories of the former submodules.

The old history of mozart2 can be found in this branch: https://github.com/mozart/mozart2/tree/backups/pre-submodule-merge-master

For the submodules it is still in branch master.

If you have any branch somewhere, please rebase it on top of this master. I want to have a single point of merging. If you have any branch of the submodules somewhere, please also make sure to do whatever's necessary so that it can be rebased on top of this new merged master. If there are not many commits, probably the easiest way is to replay them more or less manually. If you have a long history, you'll have to filter-branch + rebase. If in doubt, do not hesitate to contact me.

sjmackenzie commented 11 years ago

Nice one mate! Much appreciated!