Support git submodule dependencies between changes

GoogleCodeExporter commented 9 years ago

One of the projects I work on uses git submodules for different parts of
the application, each submodule is added in gerrit as a separate project.

Sometimes code committed in one of the submodules is dependent on code
commited in either the main project or another one of the submodules, it
would be nice if we could manually specify this in the commit. Something
like Depends-On: Iefb00f7a27b39d664c8cee4923c01409431bc3df

Original issue reported on code.google.com by Shane...@gmail.com on 6 Jan 2010 at 8:10

GoogleCodeExporter commented 9 years ago

Do you really want to do this manually?

What we actually want to do is have Gerrit understand the submodule repository
structure and automatically setup these dependencies based on the actual commit
SHA-1s.

In your example, the supermodule change wouldn't be able to be submitted until
the submodule commit(s) it points to is submitted to the submodule project.

By doing this we also hope to support automatically updating the supermodule if
a submodule changes and the supermodule is configured to automatically track the
submodule.  This would allow engineers to submit changes into a subproject and
not worry about every single update in the supermodule.

Original comment by sop@google.com on 11 Jan 2010 at 8:17

Changed title: Support git submodule dependencies between changes
Changed state: Accepted
Added labels: Type-Feature
Removed labels: Type-Bug

GoogleCodeExporter commented 9 years ago

I'd be much happier with it being automatic :)

I didn't think of that approach because on the project I am on we use the 
cherry-pick
method of merging (We wanted the Reviewed-On/Reviewied-By/Tested-By bits in the
commits), so the SHA1 that actually gets committed to the central repo doesn't 
match
the SHA1 I committed locally, (This means that after the change in the 
submodule has
been committed, someone (in our case, this someone crontab) has to go update the
submodule pointer.) so I didn't think it would be possible.

Original comment by Shane...@gmail.com on 11 Jan 2010 at 9:04

GoogleCodeExporter commented 9 years ago

Automatic submodule dependency tracking as you describe it may solve the whole 
problem.

However, being able to manually specify "Depends-On: xxxxx" in a commit message 
would
be very useful for people using repo manifests. This would allow you to create
inter-project dependencies when they are warranted. Currently there is no way 
to let
Gerrit know that a change in one repository should be submitted only after a 
change
in another.

Usually this doesn't affect that many changes so I think manual creation is 
fine.
Also, to make it easy on the developer being able to put either the Ixxx change 
id or
the original git SHA-1 would be nice (the SHA-1 can be grabbed before 
everything has
been uploaded to gerrit).

Original comment by OdinG...@gmail.com on 9 Apr 2010 at 11:20

GoogleCodeExporter commented 9 years ago

This is being worked on by some folks.

https://review.source.android.com/14033

Original comment by sop@google.com on 25 Apr 2010 at 2:57

Changed state: Started

GoogleCodeExporter commented 9 years ago

Shawn, do you envision this as replacing the current repo manifest?  I'm not 
sure if this change encapsulates your long-term plans [1].

I'm curious because we are starting lots of scripted testing using repo, and 
are running into problems such as Issue 69 [2].  I'm not sure if we should 
spend time fixing issues with the current manifest format, or help convert to 
the new format.

Are you at a point where you could use help with [1]?

[1] 
http://groups.google.com/group/repo-discuss/browse_thread/thread/a337001e3495bb2
3/3123cdd6186ee7ca
[2] http://code.google.com/p/git-repo/issues/detail?id=69#c0

Original comment by bklarson@gmail.com on 16 Aug 2010 at 8:07

GoogleCodeExporter commented 9 years ago

Ugh - that is issue 69 in git-repo, not the closed issue in gerrit.

Original comment by bklarson@gmail.com on 16 Aug 2010 at 8:14

GoogleCodeExporter commented 9 years ago

Why limit this to submodules. I haven't worked on a project that uses 
submodules yet, but I do use repo. The same could apply to repo uploads. The 
ability to manually add a dependency (via web or git comment) would be awesome. 

I currently have a change submitted to AOSP, which touches several 
repositories. Currently they are only linked by the title of the commit - 
someone could potentially try to review the change to just one of the 
repositories. 

+1

Original comment by mar...@longhome.co.uk on 1 Mar 2011 at 11:04

GoogleCodeExporter commented 9 years ago

Isn't this pretty much accomplished with submodule registration in 2.3?

Original comment by choro...@wikimedia.org on 16 May 2012 at 4:32

GoogleCodeExporter commented 9 years ago

[deleted comment]

GoogleCodeExporter commented 9 years ago

I'm currently working on implementing this, as submodule registration isn't the 
solution for me. However, I could do with some pointers from a friendly dev who 
understands the gerrit codebase....

Specifically, I'm starting by modifying GetChange.java to walk through 
Depends-On: footers. I believe I need to use ChangeControl.Factory to get the 
Change for the given Change-Id, however adding ChangeControl.Factory as a 
constructor argument to GetChange isn't enough, and I run into the error at the 
end of this message.

I'm fairly proficient at standalone java, however Guice and servlets are 
relatively unchartered territory for me, so be gentle :)

[2014-03-16 15:49:47,240] ERROR com.google.gerrit.pgm.Daemon : Unable to start 
daemon
com.google.inject.CreationException: Guice creation errors:

1) No scope is bound to com.google.inject.servlet.RequestScoped.
  at com.google.gerrit.server.project.PerRequestProjectControlCache.class(PerRequestProjectControlCache.java:34)
  while locating com.google.inject.Provider<com.google.gerrit.server.project.PerRequestProjectControlCache>
    for parameter 0 at com.google.gerrit.server.project.ProjectControl$Factory.<init>(ProjectControl.java:105)
  while locating com.google.gerrit.server.project.ProjectControl$Factory
    for parameter 0 at com.google.gerrit.server.project.ChangeControl$Factory.<init>(ChangeControl.java:125)
  while locating com.google.gerrit.server.project.ChangeControl$Factory
    for parameter 2 at com.google.gerrit.server.change.GetRelated.<init>(GetRelated.java:74)
  at com.google.gerrit.extensions.restapi.RestApiModule.view(RestApiModule.java:82)

1 error
    at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:448)
    at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:155)
    at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:107)
    at com.google.inject.internal.InjectorImpl.createChildInjector(InjectorImpl.java:230)
    at com.google.gerrit.pgm.Daemon.createSysInjector(Daemon.java:356)
    at com.google.gerrit.pgm.Daemon.start(Daemon.java:274)
    at com.google.gerrit.pgm.Daemon.run(Daemon.java:200)
    at com.google.gerrit.pgm.util.AbstractProgram.main(AbstractProgram.java:63)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.google.gerrit.launcher.GerritLauncher.invokeProgram(GerritLauncher.java:167)
    at com.google.gerrit.launcher.GerritLauncher.mainImpl(GerritLauncher.java:94)
    at com.google.gerrit.launcher.GerritLauncher.main(GerritLauncher.java:51)
    at Main.main(Main.java:25)

Original comment by phil.le...@googlemail.com on 16 Mar 2014 at 3:53

GoogleCodeExporter commented 9 years ago

Should have read GetRelated.java, not GetChange.java

Original comment by phil.le...@googlemail.com on 16 Mar 2014 at 4:55

GoogleCodeExporter commented 9 years ago

OK, I've made progress and am now battling my way through the gwtui portion of 
the code. For anyone ending up here who wants to know the solution, it's to use 
ReviewDb's methods to create objects. e.g. :

List<Change> changes = dbProvider.get().changes().byKey(changeKey).toList();
PatchSet patchSet = dbProvider.get().patchSets().get(currentPatchSetId);

...and so on.

Original comment by phil.le...@googlemail.com on 16 Mar 2014 at 8:09

GoogleCodeExporter commented 9 years ago

Prototype patch available at https://gerrit-review.googlesource.com/55243

Original comment by phil.le...@googlemail.com on 18 Mar 2014 at 8:55

GoogleCodeExporter commented 9 years ago

As a follow up on this:
If you want to have something similar, checkout the `change.submitWholeTopic` 
config in the master branch of gerrit. That works across different projects.

Also another thing to be handy in this situation is the submodule subscription
https://review.openstack.org/Documentation/user-submodules.html which sounds 
more
like the original issue. 

I'll be working on combining these two things (submit different submodules at 
the same time in the same topic, such that you only get one superproject 
commit. That's my current vision at least).

Original comment by sbel...@google.com on 18 Feb 2015 at 10:56

GoogleCodeExporter commented 9 years ago

submitWholeTopic does what ShaneMcC was asking for. Submodule subscription is a 
different thing.

Original comment by jrn@google.com on 23 Jun 2015 at 12:03

Changed state: Released

GoogleCodeExporter commented 9 years ago

submitWholeTopic does not do precisely what ShaneMcC was asking for.  What he 
wanted (5 years ago) was a gate to hold some commit from being merged if other 
related topics are unmerged; however, submitWholeTopic is a tool to allow 
someone to cause all related topics to be submitted with one simple action.

Consider I have two changes, A and B, both in the same topic (if that's to be 
our control mechanism).

With submitWholeTopic, I get this feature:

   When viewing change A, there is a SubmitWholeTopic button.  
   If I press it, A and B are both submitted.

What Shane wanted (and I want) is this:

   When viewing change A, there is a Submit button.  
   If I press it, then if
       B is already merged  => A is merged ;
       B is "merge pending" => A and B are both merged ;
       B is not submitted   => A is "merge pending"

Quite simply:
  A is treated as if B is its ancestor.
  B is treated as if A is its ancestor.
  One cannot be merged without the other also being merged.

Original comment by phil.hord on 23 Jun 2015 at 9:36

GoogleCodeExporter commented 9 years ago

Phil,

So here is how I understand the underlying message, correct me if I'm wrong:

What you describe are atomic guarantees between A and B. So you actually care 
about
"Either both of A and B are not merged, or both of A and B are merged, there 
should be no state in which A or B is merged, but not the other." Mind that 
there is no
assertions about SUBMITTED vs NEW.

The submitWholeTopic tries to minimize the time of the forbidden state being 
observable, but doesn't eliminate it. This is because we started to tackle this 
"two changes are set to the MERGED state atomically" problem in a way which has 
most impact first. And having minimized the time of the undesired state is 
doing what you want mostly.

Currently I am working on getting rid of the MergeQueue (and eventually the 
SUBMITTED state), because it doesn't make sense to keep that around. You always 
had the uncertainty of SUBMITTED, Merge Pending and alike, which is not what 
the user wants, but the user wants a clear feedback of when a change is not in 
and when it is.

When getting rid of the MergeQueue, we will not send emails any more about 
failed merges, but rather the UI will tell you if the merge (of both A and B 
grouped together by the topic) succeeded or not. That way we will minimize the 
undesired state of one of A,B being merged while the other isn't, a bit 
further. 

Eventually later we want to have both A and B to be merged within a single data 
base
transaction.

Original comment by sbel...@google.com on 23 Jun 2015 at 10:17

GoogleCodeExporter commented 9 years ago

I've moved jobs since this was an issue for me, but from what I recall,
that's essentially correct.

The more common scenario is where the superproject S has dependencies on A
& B, and change B shouldn't be merged until change A has been merged.
However, sometimes cross-dependencies between A&B were unavoidable due to
the architecture of the project being modified (and since it's a big
open-source project, restructuring wasn't a viable option).

This is all to do with plugins for S and introduction library functions in
A to be used by B (and possibly C).

Original comment by phil.le...@googlemail.com on 24 Jun 2015 at 5:08

GoogleCodeExporter commented 9 years ago

I appreciate the efforts you are taking to understand my issues here.  Thanks.

I think you are still missing my own motivation.  Since my read of the OP's 
wording matches my own needs exactly, I tend to think he thinks the same as I 
do.  But I cannot say for sure.

I am not so concerned about the atomicity of the commit action on A & B. I am 
more concerned with policy.  At the risk of confusing you even more, let me 
change the example premise a little bit;  let's say "B depends-on A" with the 
understanding that this does not mean automatically that "A depends-on B".

For my project, B-depends-on-A means that the changes in B will not work 
without the changes in A.  For example, A may introduce a new API and B uses 
this new API.  If this were a single repo I would merely ensure that A is an 
ancestor of B and Gerrit (and Git) would take care of the rest, so long as I do 
not use "Submit type: Cherry Pick".  

In fact, this is a good example to understand the desired behavior.  In a 
single repo, if Y depends-on X, then I may CR=+2 on Y and CR=-1 on X.  And I 
can safely "submit" Y; it will not get merged right away because it depends-on 
X.  Some day X may get merged, and then Y can get merged too.  But it is not 
important to me that Y gets merged right away, atomically or close to it.  It 
is only important to me that Y is prohibited from merging _by policy_ until X 
is also merged.

So, let's look at A and B again.

  B depends-on A  ==>  B may not be merged until A is also merged.

For this issue (#377) I do not care if B is merged immediately, or atomically, 
or "with all due speed".  I only care that it is not merged before A. The API 
which A introduces may have other customers or none at all.  The merit of 
commit A does not extend to B, nor vice versa.  So I may safely merge A without 
needing to merge B, and I can do so far in advance of merging B.

Now, another example:

  B depends-on A  ==>  B may not be merged until A is also merged.
  C depends-on A  ==>  C may not be merged until A is also merged.

This does not mean that C depends-on B.  C is merely another customer of A's 
new API.  

  C may not be merged before A.
  C may     be merged before B.
  C and B are unrelated, except that both depends-on A.

Here is the example I think you are trying to solve with submitWholeTopic:

For some commits D and E, 

  D depends-on E
  E depends-on D

This is a reasonable situation.  For example, perhaps D changes an existing API 
which is used only in two repositories.  D and E contain the required changes 
to each repo.  Neither is acceptable without the other one.

This is a specialization of this issue and it should be considered, but it is 
not the general issue.  It can be solved by solving the general issue, though.  
Suppose D is reviewed and submitted.  Then we should end up here:

  D: Submitted; merge-pending  (Cannot be merged because depends-on E)
  E: New change; Needs review  (Cannot be merged because needs-review)

Later when E is reviewed and also submitted, we briefly are in this state:

  D: Submitted; merge-pending  (depends-on E)
  E: Submitted; merge-pending  (depends-on D)

But you can see that both commits conditions' are met, both of these commits 
may be safely merged.

  D: merged
  E: merged

1. You *should* merge both of these atomically (as closely as possible) but 
this still is not the point of this issue #377.

2. You *could* merge both of these to some candidate-branch in each repo and 
only move the branch-heads after you ensure that all dependent changes were 
merged successfully.  I think this might be nice, but it is outside the scope 
of this issue #377.

3. You *could* safely merge each of these commits and then add a single commit 
to the superprojects (which are subscribed to them) with updates to both 
submodule gitlinks atomically, if the projects belong to some 
registered_submodules.  But this also is not the point of issue #377.  
(However, it does nicely handle the atomicity for you in the case of submodule 
subscriptions, except that if there is some merge-failure you do still need to 
undo _something_ so your errant repo is still a valid merge target for other 
commits.)

So, to recap, what I want is this:

  B depends-on A  ==>  
     Policy: B may not be merged until A is also merged.

Contrarily, submitWholeTopic gives me this:

  A belongs to foo; B belongs to foo  ==> 
     Convenience: There is now a button to submit *all* 
                  'foo-related' commits approximately at once.

These are two very different things.

Original comment by phil.hord on 24 Jun 2015 at 9:25

GoogleCodeExporter commented 9 years ago

When submitWholeTopic is enabled, there is no way in the UI to submit a single 
change without submitting the rest of the topic.

Original comment by jrn@google.com on 24 Jun 2015 at 9:35

GoogleCodeExporter commented 9 years ago

Ok, now I understand your position very well with the API provider and 
consumer. And as A and B are in different projects we cannot do a 
ancestor/parent relationship as Git would provide within a single repository.

Looking at the proposed solution in the first post, I think it doesn't cut it 
yet.

So say you have repoA and repoB which you develop changes A and B for. A 
"depends-on" B. You can have a change (with the same change id, but different 
legacy change numbers), targeted at different branches, so what you would 
really need to write down policy-wise is:

A "depends-on" (B is integrated in repoB/master) 

[B could also be just cherry-picked to repoB/proposedIntegration for testing or 
such]

Regarding examples D & E:
1) Assuming we were to have such a policy enforcement in place, a non-atomic 
merge
would violate this policy, which would be bad design (why would we allow Gerrit 
to violate a policy which it is told to enforce?)

3) I thought a similar idea is very a neat, so I had code for it[1].
Say change A in repoA and change B in repoB are in the same topic (the special 
case with circular dependency),
and both would be submitted and we have submodule subscriptions enabled such 
that
changes in repo{A,B} will feed into a common superproject, then both changes A 
and B would come into the superproject via a single commit updating both 
submodules in that commit.
I want to revive that topic, once I am done with the MergeQueue refactoring 
though.

[1] 
https://gerrit-review.googlesource.com/#/q/topic:towards-atomic-submodule-update
s

Original comment by sbel...@google.com on 24 Jun 2015 at 9:48

GoogleCodeExporter commented 9 years ago

jrn: Yeah but this single change submission may be desired when building out a 
dependency graph.

Say we have new API change A, B and C (each in its own repository repo{A,B,C}) 
and then we also have changes bringing in new features depending on these new 
APIs. (D depends on {A,B}), (E depends on {B,C}), (F depends on B)

Because the APIs are wonderfully documented and you know how they will work 
eventually, you can write the code and test with mocks and submit the features.
They would be merge pending until the depending APIs are actually merged.

So currently you could not model this.

Original comment by sbel...@google.com on 24 Jun 2015 at 9:53

GoogleCodeExporter commented 9 years ago

jrn> When submitWholeTopic is enabled, there is no way in the UI to submit
jrn> a single change without submitting the rest of the topic.

Yeah, and that's a problem for my use case.

sbel> 1) Assuming we were to have such a policy enforcement in place, a 
non-atomic 
sbel> merge would violate this policy, which would be bad design (why would we 
sbel> allow Gerrit to violate a policy which it is told to enforce?)

For the sake of simplicity, I overstated the dependency a bit.  I only want 
Gerrit to refuse to merge E until it agrees to merge D.  After that, I only 
hope Gerrit will do a best-effort attempt at carrying out the merge goals.

I take your point about the change-id since they can (now) be re-used on 
different branches.  I actually have an implementation of this "Depends-on" 
relationship in Jenkins, but there I am on the other side of the fence.  When 
Jenkins wants to test a changeset, I first have it search the "new" patches for 
any "Depends-on:" directives in the commit log.  All the found directives are 
queried in each submodule of my project using something like this:

   gerrit query --current-patchset is:open "$@"

Any matching commits are fetched and merged, and the newly arriving commits are 
again searched for new "Depends-on:" directives.  (That is, it is recursive.)  
This continues until no _new_ directives are found.

Notice my query searches only for the latest patchset of is:open changes. This 
means that an abandoned change becomes a non-dependency.  This is fine for 
Jenkins since the point of the dependency system here is to help Jenkins 
SUCCEED. If there is a real dependency here and it gets ignored, Jenkins will 
presumably fail during 'make' and the hapless author will need to investigate 
why.

If the query does not match anything, I ignore it. I could complain instead, 
but for Jenkins' needs, I am willing to fail during the build instead (VRFY:-1 
gets people's attention), or to pass if the dependency is already met somehow 
else.

A query may match multiple items in the same and/or different submodules; this 
is allowed so long as they all merge cleanly.

A query which matches the same change-id on multiple branches is likely to 
cause pandemonium and will fail to merge.  This would be a false-negative for 
Jenkins, but it is an acceptable fallout for such irresponsible construction.  
In practice we do not have many of these; less than 5% of our changes are ever 
cherry-picked to another branch, and something less than 1% ever have a 
"Depends-on" directive; the intersection of these sets is miniscule.

But I can see how this could be a problem for Gerrit.  I would expect that 
Gerrit's policy might be to refuse to merge E unless ALL(D) are also 
merge(able).  This would be overly restrictive for some, but it's better than 
being too permissive.  In particular, I don't think you can say E(master) 
depends on D(master) or similar branch-matching heuristics, because my 
superproject may lean on different branch names in different projects.  Gerrit 
should not presume to know my wishes.  But if it were really a problem, I could 
do this (though it does not scale):

  Depends-on: I7a6d7ff0ce95f3cfe2787eef5959428664bc0d1d branch:master

I say it does not "scale", but I mean it does not automatically get updated 
when you cherry-pick this commit to some other branch.  I personally do not 
think this is a specialization worth solving, but others may be more interested 
in it.

In practice my directives look like this:

   Depends-on: I7a6d7ff0ce95f3cfe2787eef5959428664bc0d1d
   Depends-on: I68b5e642186869c0c4e484ad12ba57b1f86e75e9

I use change-id's because I know them before I have to push, so I can set up my 
dependencies ahead of time.  But if I know the change number of some commit 
already, I can use that instead:

   Depends-on: 18532

And if I want whole-topic dependency (even if _this_ change is not in that 
topic), I can do this:

   Depends-on: topic:FooBar

Or this:

   Depends-on: topic:FooBar branch:master

Original comment by phil.hord on 24 Jun 2015 at 10:43

evanchueng / gerrit

Support git submodule dependencies between changes #377