reconquest / atlassian-external-hooks

External Hooks plugin for Atlassian Bitbucket
https://external-hooks.reconquest.io
Other
44 stars 37 forks source link

Update to work with git 2.11+ #50

Closed dakotahawkins closed 4 years ago

dakotahawkins commented 7 years ago

Possible duplicate of #49

See Atlassian notes here: https://jira.atlassian.com/browse/BSERV-9388

Our server recently updated git to 2.11, and pre-receive hooks started failing with "bad object" errors on new branches.

This extension needs to be updated to work with git's new handling of new objects with pre-receive hooks.

Edit: It looks like the fix may be to pass some GIT_ environment variables to the pre-receive hook:

Sources:

Is there any reason not to allow GIT_* variables to pass through to the hooks?

seletskiy commented 7 years ago

@dakotahawkins: Hmmm, thank you for detailed report. However, I need to think how to fix that issue in graceful way, so it will not break previous versions of Bitbucket & Git. If you have a proposal, I can review and merge PR.

dakotahawkins commented 7 years ago

@seletskiy This issue caught me by surprise today, and I have just started looking into hook development, and so far know very little.

Is it possible that passing those vars through (even if they're unset or don't exist) would work for new and old git/bitbucket versions? In other words, would the new versions use them and the old versions ignore them?

It seems like it is probably just going to be broken if somebody has git 2.11+ but an old version of bitbucket; I think the bitbucket fix was probably passing those through to the hooks in the first place, and now perhaps it's just that git 2.11+ operations run from your hooks (say from a shell script) may just want them in the environment to work.

From the gitlab diff it looks like there weren't any special checks required, and they only needed to pass two of those variables along. Do you think the best thing to do would be to follow their example?

Presumably here you could also check the git version and only pass along the extra variables if it's 2.11+ for the time being to at least make sure older versions of git don't break.

Edit:

From this comment thread, linked in the atlassian details above, it may be out of your hands anyway:

The new quarantining behavior that was added for pushes creates a random temporary directory where the newly-received objects are written, and exports GIT_OBJECT_DIRECTORY, GIT_ALTERNATE_OBJECT_DIRECTORIES and GIT_QUARANTINE_PATH. However, the PreReceiveHook and PreReceiveRepositoryHook SPIs in Bitbucket Server have no mechanism for providing environment variables to implementors. That means it's not possible for you to get those values and use them yourself to wire up JGit's objects.

:'(

seletskiy commented 7 years ago

@dakotahawkins: That sucks. According to similar patch in yacc (https://github.com/sford/yet-another-commit-checker/commit/525e6fcc2306a18721179038add8879b10576ee3), there is ScmBuilder, provided by SPI, maybe, there is some workaround to obtain paths to newly pushed objects.

dakotahawkins commented 7 years ago

@seletskiy Or maybe there is something you can do with git through the ScmBuilder to force it to output the values of those variables. I kind-of doubt it though, I'd imagine it is the way it is for "security" purposes, to make sure you can only do "safe" whitelisted things with the repo.

I'm trying to get our work to downgrade to 2.10.2 but even that is going to be like pulling teeth. I don't have shell (or root, for that matter) access to the server and the only way to do that on linux is I think to build 2.10.2 from source, since those binaries are long gone from the git-core PPA repo.

itay commented 7 years ago

It seems like you will have to wait for 5.0 when the new SPI becomes available that exposes those variables (so you can pass them to the external process), or tell people to not upgrade to 2.11.

dakotahawkins commented 7 years ago

@seletskiy @itay Do you guys think it would be possible to make this plugin use CommandBuilder instead of ProcessBuilder?

According to this YACC issue it will pass along these variables to its child process. In that example it's forking git, but for this plugin the command would be whatever is supplied by the user.

Starting from Bitbucket Server 4.13, whether you're using the "free-form" builder to assemble your own commands from scratch, or you're using the "type-safe" builder (for example, GitRevListBuilder), all Commands created by the builder will automatically apply the quarantine environment variables necessary when run inside a pre-receive hook (the only place the quarantine environment needs to be applied).

Looks like it supports setting your own environment variables, setting the working directory, and supplying arbitrary parameters.

https://developer.atlassian.com/static/javadoc/bitbucket-server/4.3.1/api/reference/com/atlassian/bitbucket/scm/CommandBuilder.html https://developer.atlassian.com/static/javadoc/bitbucket-server/4.6.0/api/reference/com/atlassian/bitbucket/scm/CommandBuilderSupport.html

itay commented 7 years ago

My understanding is that it is not possible - the plugin is not trying to execute Git commands, it is trying to execute your custom process (which in turn might execute some Git commands). Given that, it has to have access to those environment variables, which today it cannot.

dakotahawkins commented 7 years ago

@itay CommandBuilder is presumably different from the ScmCommandBuilder in that it executes arbitrary commands.

Since the YACC issue mentions using it to fork a git process presumably it could execute the hook executable supplied to this plugin.

itay commented 7 years ago

I thought CommandBuilder was an interface and it's only known implementation is ScmCommandBuilder? I may be missing something though.

Re-reading the docs, it seems possible that you could use ScmCommandBuilder to actually execute non-git binaries as well, though it would definitely be pushing the API in a weird way to do that. I think @seletskiy would have to give it a shot though and see if it's possible.

The other option is that the 5.0 API will allow this to be solved with access the environment variables.

dakotahawkins commented 7 years ago

Even the ScmCommandBuilder is listed as an interface in the docs, so IDK. It even sounds like that can execute arbitrary commands though, and the interfaces provided by the CommandBuilder and CommandBuilderSupport sound like they might be enough to run the hook executable.

itay commented 7 years ago

Definitely something to try. Re-reading that comment thread though, I think that it is possible the only concrete implementation of CommandBuilder may be the GitScmCommandBuilder (referencing the comment by Bryan), and that may be hardcoded for the git binary.

dakotahawkins commented 7 years ago

GitScmCommandBuilder is an interface in the docs as well, so who knows.

bturner commented 7 years ago

For what it's worth, I know. As @itay thought might be the case, GitScmCommandBuilders are indeed restricted to use the git binary; it's a builder specifically for building git commands, so it's only logical it would only work with that binary.

In 4.x, there's no easy way to fix this. In 5.x, the ExternalPreReceiveHook can be converted from implementing PreReceiveRepositoryHook to PreRepositoryHook. When run as part of handling a push, the RepositoryHookRequest provided will include ScmHookDetails which provide access to the quarantine environment variables.

4.x, unfortunately, can't do anything that simple. But the problem's not entirely intractable; it's just going to be very ugly. Given you can't use CommandBuilder and friends to build your commands (since the plugin is most likely forking a script, rather than git), you don't get any of the magic I added to setup the quarantine environment. However, you can still use the builders to access it...indirectly. I want to be clear that this code isn't ideal, but it's reasonably straightforward. It's probably the "simplest" way to make this all work, despite the ugly.

With all that said, here's what I'd do:

(Edit: Note that you don't need to call() the Command you built. Just build it, access the environment and let the command go out of scope. You don't actually need to run a git process.)

I've left out error handling and other code; I'm just illustrating an approach. I'd probably gate all of this behind a check to new Version(ApplicationPropertiesService.getBuildVersion()).compareTo(new Version(4, 13)) >= 0 so you don't bother with it on older versions of Bitbucket Server, which won't setup the quarantine environment. I'd also get the Field once, not every time you're going to fork a process. Etc. There's cleanup to be done.

As I said, this is pretty ugly. But it should work, and it should be reliable for 4.13.x and 4.14.x. As part of issue #51 you should be able to remove it all and just access the environment directly. So another option (which, to be clear, I'm not advocating for or encouraging; I'm just saying) would be to close this issue and document that Git 2.11+ is only supported when paired with Bitbucket Server 5.0. That's totally your call, of course. I'm just trying to help provide approaches you might use to solve this.

dakotahawkins commented 7 years ago

@bturner Thanks for taking the time to help, and for the awesome info!

Does the environment field include a ton of stuff, or just a few things? In other words, would it be worth copying every variable or do you think it would be better to rifle through them for the few that are related to the quarantine?

bturner commented 7 years ago

@dakotahawkins,

Since you'd be creating the CommandBuilder yourself, and not configuring any environment variables of your own, the only variables in the map will be the ones the system applies to all processes. While I can't offer a definitive list, I can say in modern Bitbucket Server versions (recent 4.x, 5.x), it's going to include:

(* Windows only) (** Git 2.11+ only)

Again, that list isn't necessarily definitive, but it's the "standard" set you're most likely to encounter. Of those, the one I think seems most likely to cause issues if the plugin applies it to the scripts it runs would be LC_ALL=C. It might be harmless, but it might not be, too. I'd trust your own judgment, based on your experience with how this plugin is used, more than mine!

It might be worth noting that Bitbucket Server's internal hook handling doesn't check Git versions at all, when handling quarantining. It looks for the presence of the GIT_QUARANTINE_PATH environment variable and, if it's present, it expects to find the full complement of quarantine variables. Otherwise, it no-ops out. I built it that way because I'm not sure how quarantining might change going forward. I didn't want to just assume that "2.11+ means quarantining." If you're going to copy specific environment variables, something similar is probably how I'd do it. Don't rely on a Git version check.

adfernandes commented 7 years ago

If it helps here is a list of the most common POSIX environment variables.

I would recommend passing LANG and the LC_* variables, as well as the TMP, TEMP, and TMPDIR at a minimum.

It might also be worth passing PATH as ORIGINAL_PATH or somesuch so that properly written applications can find alternate binaries, if required.

Major kudos and thanks, @bturner!

itay commented 7 years ago

@seletskiy any thoughts on whether this will be fixed?

seletskiy commented 7 years ago

@itay: Unfortunately, I no longer work with BitBucket Server and can't provide qualified response. Actually, only thing that I can do in that situation is to merge PRs and upload new version to marketplace.

@bturner, @dakotahawkins: Can you provide appropriate changes to fix that long-standing issue?

itay commented 7 years ago

I've created a PR which should fix this issue: https://github.com/ngsru/atlassian-external-hooks/pull/56

dploeger commented 5 years ago

Isn't this fixed by merging #56?