deis / builder

Git server and application builder for Deis Workflow
https://deis.com
MIT License
40 stars 41 forks source link

fix(pkg/sshd): wrap git receive functionality in a locking utility fu… #339

Closed arschles closed 8 years ago

arschles commented 8 years ago

Summary of Changes

This patch wraps the logic that runs the git receive hook in a utility function which calls defer pushLock.Unlock() so that, even if a git push fails, the lock gets unlocked.

Associated End to End Tests

Created https://github.com/deis/workflow-e2e/issues/207 to track the work required to test this new behavior

Issue(s) that this PR Closes

Closes #309 Note that this patch doesn't attempt to fix https://github.com/deis/builder/issues/220

Testing Instructions

  1. Create a Deis Cluster, specifying a dockerbuilder and slugbuilder image that doesn't exist (see here)
  2. Register an app
  3. Add a key to your app
  4. git push deis master from a repository
  5. Cancel the git push while it's running by pressing ctrl+c

After step 5, you should be able to do another git push without getting an error, but you may have to wait a few seconds.

Pull Request Hygiene TODOs

Please make sure the below checklist is complete.

arschles commented 8 years ago

Tested with the above testing instructions and verified that git pushes can be interrupted and subsequent git pushes succeed

bacongobbler commented 8 years ago

excellent cleanup. Has anyone else manually tested this? considering that it's covered by unit tests I'd be okay with merging this, but just wondering if anyone else has tested this. :)

arschles commented 8 years ago

@bacongobbler I'd love for more testing, so if you or anyone else has a chance to, I'm happy to wait before merging this PR

bacongobbler commented 8 years ago

I have a cluster here so I'll manually test this.

bacongobbler commented 8 years ago

:(

><> git push deis master
Counting objects: 80, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (45/45), done.
Writing objects: 100% (80/80), 19.02 KiB | 0 bytes/s, done.
Total 80 (delta 30), reused 80 (delta 30)
Starting build... but first, coffee!
...
^C
><> git push deis master
exec request failed on channel 0
fatal: remote error: Another git push is ongoing
><> git push deis master
exec request failed on channel 0
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
bacongobbler commented 8 years ago

Hold up... if I wait a little longer pushes work again.

arschles commented 8 years ago

@bacongobbler how long did you wait until pushes worked again?

bacongobbler commented 8 years ago

about 20 seconds or so. Not that long.

arschles commented 8 years ago

Hm, ok. I think what happened was that the lock didn't release until after the builder pod had finished. Since the git-receive hook runs in a new process, we have no mechanism right now for notifying it that the SSH channel was closed. I've created https://github.com/deis/builder/issues/345 to track that work.

Given my failure to describe properly what this PR fixes, my testing instructions were incorrect - they should have described a way to make the slugbuilder pod fail to complete. I will update them now

bacongobbler commented 8 years ago

they should have described a way to make the slugbuilder pod fail to complete.

I have a buildpack just for that. :)

arschles commented 8 years ago

Perfect - I actually tried with an invalid slugbuilder image name and the builder was able to recover and allow new git pushes after a short period of time. However, I noticed that the git-receive hook has issues reliably detecting when a builder pod fails to run to completion (for example, when it goes into ErrImagePull) - possibly related to #298.

The hook appears to eventually fail out, and the delay might be because it polls the state of the builder pod on an interval. Anyway, the fact that this PR allows the builder to heal itself is progress at least ;)