ibotty / openshift-letsencrypt

MIT License

having a few issues #9

Closed jameseck closed 7 years ago

jameseck commented 7 years ago

Hi,

I've been playing around with this and it's really very impressive, but I've noticed a few problems.

With dehydrated 0.3.0, the watcher pod was hanging while running dehydrated to perform the challenge response. I've forked the repo (https://github.com/jameseck/openshift-letsencrypt) and updated the dehydrated version in the Dockerfile to 0.4.0, which resolved the problem. That did lead to a trivial error where dehydrated tries to call exit_hook() in the hook script, so I added it as an empty function for now. Let me know if you would like a PR for these changes.
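For reference, a minimal sketch of that hook-script change: dehydrated 0.4.0 started invoking the hook with an `exit_hook` handler name, so an empty stub satisfies it. The dispatch logic and file layout here are illustrative, not copied from the repo.

```shell
#!/usr/bin/env bash
# Illustrative hook script: dehydrated passes the handler name as $1.

exit_hook() {
  # dehydrated 0.4.0 calls this at the end of every run;
  # there is nothing to clean up here, so it is a no-op.
  :
}

# Dispatch to the handler dehydrated asked for, if it is defined.
HANDLER="$1"; shift || true
if declare -f "$HANDLER" > /dev/null; then
  "$HANDLER" "$@"
fi
```

With this stub in place, a run that previously failed on the missing `exit_hook` callback completes cleanly.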

Also, when the deployment is first rolled out, the cron container crashes a couple of times complaining about being unable to get a lock. Not really a big issue, but it looks a little messy when you are monitoring the pod status.

ibotty commented 7 years ago

Yes, please send a pull request.

You're right, the lock problem should be fixed. Catching and ignoring the error seems wrong, though, so what's the best approach? I'll have to think about it. If you'd like to give input, please do!

jameseck commented 7 years ago

Perhaps the right approach is just to give cron a short start delay so the watcher can do its thing. I've only seen the container error during the first few seconds, though I don't know what happens further down the line. If a route is added or updated, will the cron container crash again? If so, perhaps cron should ignore transient locks: tolerate a short-lived lock, but error out on a stuck one.
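One way to sketch that "tolerate transient, fail on stuck" idea is a bounded wait with flock(1). The lock path and timeout below are illustrative, not taken from the repo:

```shell
#!/usr/bin/env bash
# Illustrative: wait up to LOCK_TIMEOUT seconds for the lock instead of
# crashing immediately, but still fail if the lock looks stuck.

LOCKFILE="${LOCKFILE:-/tmp/openshift-letsencrypt.lock}"
LOCK_TIMEOUT="${LOCK_TIMEOUT:-30}"   # seconds before the lock counts as stuck

exec 9> "$LOCKFILE"
if flock -w "$LOCK_TIMEOUT" 9; then
  echo "lock acquired, running renewal"
  # ... renewal work would go here ...
  flock -u 9
else
  echo "lock held for more than ${LOCK_TIMEOUT}s, giving up" >&2
  exit 1
fi
```

A short-lived lock (e.g. the watcher finishing its initial run) is simply waited out, while a genuinely stuck lock still surfaces as an error after the timeout.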

ibotty commented 7 years ago

I implemented flock-based locking around get_certificate. That way, cron (or the watcher) should just wait while any other get_certificate run is in progress. It might make sense to do finer-grained locking later.
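The shape of that serialization might look roughly like the following; the function body and lock path are placeholders, not the repo's actual code:

```shell
#!/usr/bin/env bash
# Illustrative sketch: serialize concurrent get_certificate runs with flock.

CERT_LOCK="${CERT_LOCK:-/tmp/get-certificate.lock}"

get_certificate() {
  # Placeholder for the real ACME issuance logic.
  local domain="$1"
  echo "fetching certificate for $domain"
}

locked_get_certificate() {
  # Hold an exclusive lock for the duration of get_certificate, so any
  # other caller (cron or watcher) blocks here until this run finishes.
  (
    flock 9
    get_certificate "$@"
  ) 9> "$CERT_LOCK"
}
```

Because `flock` is called without a timeout, a second caller simply waits rather than erroring, which matches the behavior described above.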

jameseck commented 7 years ago

I forgot about this issue. I'm closing it since you've resolved it. Thanks :)