smarkets / marge-bot

A merge-bot for GitLab
BSD 3-Clause "New" or "Revised" License
693 stars 136 forks source link

job.py: don't hard-code branch in error-handling #284

Open Hi-Angel opened 3 years ago

Hi-Angel commented 3 years ago

When MR was created against a branch different from master and ended up with conflicts, bot goes into infinite loop due to getting inside error-handling path a:

fatal: invalid reference: master

Fix this by not hard-coding the master name.

Signed-off-by: Konstantin Kharlamov Hi-Angel@yandex.ru

Hi-Angel commented 3 years ago

upd: fixed description

snim2 commented 3 years ago

BTW should this be the target branch of the MR?

Hi-Angel commented 3 years ago

BTW should this be the target branch of the MR?

I'm not sure I get it… Should what be the target branch?

snim2 commented 3 years ago

Ah, sorry! In an MR, the source branch is the code you want to merge in, the target branch is the branch you want to merge into. So the MR merges source->target. I was just wondering whether it would be more explicit to use the branch name, which is defined just above your change.

So that check would be:

if source_branch != 'master':
    repo.checkout_branch(target_branch)
    repo.remove_branch(source_branch)

However, that still hard-codes the branch name master, and there's no reason why the default branch should have that particular name.

In this case, it's not clear whether we are checking that the source branch is not master just because Marge needs to checkout another branch before removing the source branch, in which case this could just be:

repo.checkout_branch(target_branch)
repo.remove_branch(source_branch)

because the source and target are different, by definition. Or, it may be that there's something else going on here.

Hi-Angel commented 3 years ago

Ah, I see, thank you for explanations!

the MR merges source->target. I was just wondering whether it would be more explicit to use the branch name, which is defined just above your change.

Well, I don't know the code, but from looking at the snippet, I am not sure why couldn't someone add a code to remove the target branch too, unless the target branch is the default one (e.g. usually it is "master", sometimes "main", etc.). In such case we couldn't rely on the target branch name.

I don't see any downsides in checking out HEAD^. And given the idea mentioned above, I think it would be future-proof too.

In this case, it's not clear whether we are checking that the source branch is not master just because Marge needs to checkout another branch before removing the source branch, in which case this could just be:

repo.checkout_branch(target_branch)
repo.remove_branch(source_branch)

because the source and target are different, by definition. Or, it may be that there's something else going on here.

I concur with this thought, me neither knows why the check is in place.

I found the commit where this line was added, it is 321a867d463f37703af426c226ae6ddce5691c7c "Refactor jobs", and it has rewritten a bunch of code. So, I guess git-history in this case wouldn't tell us. But we could ask @JaimeLennox, the author of the commit: did you happen to know, could there be any side-effects if we'd just remove the check for master?

snim2 commented 3 years ago

Well, the target branch is the one that should receive the changes from the merge request, so probably the user will not want to remove it, and if the target is master or main, then it is likely that branch will be protected and cannot be removed.

Hi-Angel commented 3 years ago

Well, the target branch is the one that should receive the changes from the merge request, so probably the user will not want to remove it,

I'd argue for the same reason a user wouldn't want to remove the source branch, and yet marge-bot does it ☺

and if the target is master or main, then it is likely that branch will be protected and cannot be removed.

AFAIK protected branch is not a feature of git (it's a feature of gitlab/github), so removing a branch should not be a problem. In this case it's just a branch that resides locally on a computer, and hopefully reflects the actual branch on the remote server.

Hi-Angel commented 3 years ago

I'd argue for the same reason a user wouldn't want to remove the source branch, and yet marge-bot does it

Sorry, I figured it needs elaboration: I mean, the code in question removes source branch on failure to merge it to the target branch. But a user probably will want to merge it eventually, so they wouldn't benefit from removing the branch.

But then of course there might be corner cases, such as that a user abandoned the branch and it will be left forever… But a similar thing could be said about the target non-default branch as well ¯\(ツ)

Hi-Angel commented 3 years ago

Although, now that I think of that, maybe the reason marge-bot even removes the branch is not to free space the branch name takes, but it might be because the way it brings local code up-to-date with the remote is simply by git checkout branchname. So, if branchname is left from previous error, it would have an obsolete version of code. But if that's true, then trying to remove it on fail is error-prone too IMO. I'd say that more correct solution would be to always use same branch name, let's say tmp, and do git reset --hard of that branch whenever we want to update it.

Though, at this point I feel like all that design discussion might be getting offtopic ☺ The PR here is just a fix for a particular problem, no more, no less ☺

Hi-Angel commented 3 years ago

So, any updates here? I can apply suggested changes if maintainers want so, all I care for now is for the fix to get merged, so we don't have to carry a local patch for the problem.