Closed matthewvalimaki closed 8 years ago
I've updated user-consul-nginx-nodejs
to show how consul-template is used as it was the only one where we actually have two services in one container.
Forgot to mention:
exit
did not do it./etc/nginx/conf.d/default.conf
. I would prefer having a group which then is given permission to specific files. Perhaps use existing consul user and group?@matthewvalimaki, to make this super complete I think it might need a consul watch
implementation, https://www.consul.io/docs/commands/watch.html. Having the benefit that configuration could be re-run when the nodejs.app IP changes? What do you think? Are you interested in adding that?
@matthewvalimaki, btw, thanks for this. It's going to be a great addition. I'm happy to help add the extra bits and pieces if you like.
@smebberson No need for consul watch
as consul-template
already handles this. So if IP, port changes or new nodes with same service and tag become available /etc/nginx/conf.d/default.conf
will be automatically rewritten and nginx through s6 restarted.
And no problem, I like these Docker images and use them myself and want to improve them so that I have less to worry about downstream :)
@smebberson Improvements pushed, see https://github.com/matthewvalimaki/docker-alpine/commit/ae7469196628501ad295e1ed14ae3ddd80328744.
@matthewvalimaki, these changes look great. Given the scenario we're faced with, the loop you have implemented should work fine. To implement notifications, well, I think it will require some effort from Consul's part. But I think we should raise it as a bug. Either consul-template
should not throw an error but repeat connection attempts for a specified amount of time, or consul
should repeat when it is not only up, but ready to start receiving requests.
I wasn't aware that consul-template
already does the job of consul watch
also. That is great.
Thanks again. I'll look to merge this in ASAP.
@smebberson actually I believe by using consul watch
, something you brought up, and its Type: nodes
we probably could remove the loop I introduced and clean up consul-template run
. Consul Watches documentation for Type: nodes
states the following:
The "nodes" watch type is used to watch the list of available nodes. It has no parameters.
Then in combination with consul handler executing a shell script we should be able to know when consul agent
is in usable state.
alpine-consul
and alpine-consul-base
would see the following new file introduced in /etc/consul.d/_watch-node.conf
and /etc/consul.d/bootstrap/_watch-node.conf
. Actually the file should be in /etc/consul.d/watches/_watch-node.conf
and then symlinked to mentioned paths:
{
"watches": [
{
"type": "nodes",
"handler": "/usr/sbin/consul-nodes-handler.sh"
}
]
}
consul-nodes-handler.sh would then be responsible for:
1) Checks if templates exist.
2) Remove /etc/services.d/consul-template/down
if templates did exist.
3) Executes s6-svc -u /var/run/s6/services.d/consul-template
to start consul-template.
4) rm symlink
5) consul reload
to reload configuration with now removed watch.
Unfortunately as it is https://github.com/smebberson/docker-alpine/blob/master/alpine-consul-base/root/etc/services.d/consul/run is recursively set to go through /etc/consul.d/
so it would load up /etc/consul.d/watches
every time. @smebberson what if we rename /etc/consul.d
to /etc/consul
and then have all default configurations in /etc/consul/conf.d
? That way we could have /etc/consul/watches
and do the symlink I mentioned above.
/etc/cont-finish.d/01-consul-template
would re-create the symlink and down
removed by handler.
consul reload
in the handler we ensure that the handler gets executed only once.@matthewvalimaki, yep, I like the sound of that. I think that would be a great addition and tidy things up nicely! Great suggestion.
@smebberson I've updated code to reflect what was discussed earlier with one exception:
Instead of /etc/cont-finish.d/01-consul-template
I opted to use /etc/cont-init.d/00-consul-template
to guarantee that necessary down
and symlink are created in case services are not finished properly.
@matthewvalimaki, sorry, I didn't realise this was ready for review again. I assume it is?
No problem and yes it is ready again. A side note: would it be easy to implement tests for these? Haven't looked into that yet at all but I would like to work on those. On Dec 12, 2015 1:06 PM, "Scott Mebberson" notifications@github.com wrote:
@matthewvalimaki https://github.com/matthewvalimaki, sorry, I didn't realise this was ready for review again. I assume it is?
— Reply to this email directly or view it on GitHub https://github.com/smebberson/docker-alpine/pull/13#issuecomment-164191735 .
@smebberson apologies for delayed response. My only reasoning for having consul-template
in alpine-consul
was the fact that consul
itself was available there even though it it is in server mode. I do not have a use case right now but something that I've been looking at is https://github.com/hashicorp/consul-replicate, which perhaps should be part of the alpine-base
. If consul-replicate
is introduced then potentially consul-template
might have a use case :)
I do not have a strong opinion where consul-template
should be, so if you want I can move it to alpine-consul-base
it's not a big deal really.
@matthewvalimaki, I think it would be great if consul-template
lived in alpine-consul-base
at this stage. I'd like to keep alpine-consul
as lightweight and simple as possible. Are you able to move it there? The only reason this hasn't been merged into master yet (and then onto your other PRs) is because I've been struggling to find the time to do it myself.
@smebberson no problem, I've now updated the PR accordingly. Also updated consul-template
to latest release.
Note that this PR introduces probably noteworthy change: root/etc/consul.d
was renamed to root/etc/consul
. consul
configurations are now expected to be in root/etc/consul/conf.d/
. This was done to allow introduction of consul watch via json config.
@smebberson I've noticed issues with the sh scripts, trying to track down the issue so do not merge or try to fix it on your own. Noticed these issues only after trying to implement consul-template
with Nomad.
So one issue is that the script that consul watch
is executing wants to s6-svc -u /var/run/s6/services/consul-template
however as consul
is running as user consul
it has no permissions to command services, see
Problem seems to be that /var/run/s6/services/consul-template
is owned root:root
and doing chmod -R 777 /var/run/s6/services/consul-template
does seem to solve the issue.
Having consul
watches control consul-template
service like this I think should be ok. So perhaps we just change /var/run/s6/services/consul-template
permissions?
@smebberson your thoughts on this?
In alpine-consul-base
we could modify /etc/services.d/consul/run
to include chown -R consul:consul /var/run/s6/services/consul-template
. I tested this and it would seem to work.
Thanks for putting this together! I was just looking at doing something similar with these images :grinning:
I've been looking into s6 a bit and it looks like everything can be left read-only and owned by root, except for the supervise
subdirectory, which needs to be writable by the user that's running s6-svc
.
@reidab Thanks for the comment, I didn't even think about the subdirs.
So perhaps this is how the PR should be modified:
/var/run/s6/services/*/supervise
to be owned by whatever is defined with s6-setuidgid
in run
OR add a new group for just for this purpose. New group might provide additional security in terms of not allowing service to self manage.consul-template
user which belongs to group that owns whatever is going to be overwritten. For example in user-consul-nginx-nodejs
user consul-template
should be part of group that owns /etc/nginx/default.conf
.@matthewvalimaki, I agree with your comments. We should create a group that has necessary permissions for writing files in the necessary spots.
The place to alter directory permissions is in the fix-attrs.d
directory. Although, this only allows changing owner. With that, it might be good to create a file in cont-init.d
to change permissions as required. I think this is cleaner than running it in run scripts.
I do wonder what is the best practice here however, as far as s6 is concerned. I might ask the s6-overlay guys and see what I get back...
@smebberson I'll do those group changes.
@smebberson according to init-stages stage 2 i. /etc/fix-attrs.d
happens before stage 2. iii which is where copying happens. I have not tested this but I would think the problematic dir in question /var/run/s6/services/*/supervise
would still be root:root
.
@smebberson to me it feels like s6-overlay should have 4th step in stage 2 after copying files.
@smebberson consul-template
user and group added. I've updated user-consul-nginx-nodejs
example to make use of this change, see the Dockerfile.
Running consul-template
as non-root does cause another problem: reloading of services.
Having consul-template
belong to the same group as nginx
so that consul-template
can rewrite configuration works. But in order for that configuration to be loaded up one would need to tell s6
to reload nginx
leading us back to the problem of /var/run/s6/services/*/supervise
belonging to root:root
.
Right now this problem is solved for consul-template
in /etc/services.d/consul/run
where I made it chown -R consul:consul /var/run/s6/services/consul-template
. Unless s6-overlay
adds something to manage this I think the best place is indeed in services.d/run
.
To make things cleaner we could introduce a new group which owns, along with root, all /var/run/s6/services/*/supervise
so that users such as consul-template
by belonging to that group could easily manage services. This way in all run
scripts where we use s6-setuidgid
we should also modify the /var/run/s6/services/*/supervise
.
@matthewvalimaki, have you seen the response in https://github.com/just-containers/s6-overlay/issues/130 ? This goes along with our thinking. Let's set it up as you've stated:
Introduce a new group which owns, along with root, all /var/run/s6/services/*/supervise
so that users such as consul-template
by belonging to that group could easily manage services. This way in all run scripts where we use s6-setuidgid
we should also modify the /var/run/s6/services/*/supervise
.
I would like to some how keep this out of the individual run scripts if possible, do you agree? I agree with your sentiment about a 4th stage. I'll follow up in the same issue on s6-overlay and see what happens.
@matthewvalimaki, it seems as though we should be able to set everything up in some cont-init.d
scripts however? Reading http://skarnet.org/software/s6-portable-utils/s6-hiercopy.html, it says owner and group are preserved. Although, it feels like this setup stuff should be in alpine-base
?
@matthewvalimaki, are you still keen on pushing this one through?
Yes. I am currently on vacationing causing the delay. On Jan 27, 2016 12:49 AM, "Scott Mebberson" notifications@github.com wrote:
@matthewvalimaki https://github.com/matthewvalimaki, are you still keen on pushing this one through?
— Reply to this email directly or view it on GitHub https://github.com/smebberson/docker-alpine/pull/13#issuecomment-175276684 .
@smebberson I tested the idea of setting permissions on /etc/services.d/consul
to root:s6
(s6
being the new group) and s6-hiercopy
does seem to retain ownership. Unfortunately s6-svc -h /var/run/s6/services/consul/
threw permission denied
and code 111
. Once I changed ownership to consul:s6
the service restart worked. Problem seems to be that s6
requires user to be defined (in this case consul
) and group is irrelevant. Adding group write permission seems to do the trick. I followed up on your comment at https://github.com/just-containers/s6-overlay/issues/130#issuecomment-181554343.
Unfortunately /var/run/s6/services/consul/supervise/
is filled after cont-init.d
therefore any permission fixes we might want to do would have to be done from run
.
Okay, cool. No worries, thanks for following that up. So, is everything ready in this PR to be merged?
@matthewvalimaki, ping! Let me know if this is ready to go. I want to get this merged and then resolve the other issues on this repo. Thanks!
@smebberson pong! Sorry for the delay. This PR has existed far too long. Not only has it held back new releases but it's hard to capture all that has been discussed even when those are written above.
Please review if Dockerfile
could be improved, the first RUN seems a bit heavy to me.
All changes were made to alpine-consul-base
:
s6
usergroup.
consul
and consul-template
users belong to this group./etc/services.d/consul-template/supervise/control
is created in Dockerfile
.chmod g+w /etc/services.d/consul-template/supervise/control
is done in Dockerfile
.alpine-consul-base/README.md
. Any and all services that consul-template
is expected to manage through s6-svc
need to have s6
group as the owner of this particular file.chmod o-w /etc/services.d/consul-template/down
is done in Dockerfile
to prevent potential security issue as anybody could remove that file.@matthewvalimaki, this looks good. I'm going to pull this down, work it up a little more and then merge it into master. We can always improve on things along the way. My gut feel is this is a little heavy straight in alpine-consul-base
but I'm more than happy with the overall capability this brings and I'm happy to see how it goes.
I'll get onto this first thing tomorrow morning (don't want to start this so late in the day here).
@smebberson sounds good. Please do whatever changes you want. I just want
consul-template
to be available :) Once we have this merged I will
provide PR for nomad
.
On Tue, Mar 1, 2016 at 1:29 AM, Scott Mebberson notifications@github.com wrote:
@matthewvalimaki https://github.com/matthewvalimaki, this looks good. I'm going to pull this down, work it up a little more and then merge it into master. We can always improve on things along the way. My gut feel is this is a little heavy straight in alpine-consul-base but I'm more than happy with the overall capability this brings and I'm happy to see how it goes.
I'll get onto this first thing tomorrow morning (don't want to start this so late in the day here).
— Reply to this email directly or view it on GitHub https://github.com/smebberson/docker-alpine/pull/13#issuecomment-190631945 .
@smebberson understanding that this + other issues cause quite a backlog but I was wondering how is it going? Do you need help with anything? I rather help you with this project than start maintaining my fork so let me know if there's anything I could do. I have quite a few things depending on these images :)
@matthewvalimaki. Sorry! Working on it now ;)
@matthewvalimaki, I'm getting some weird errors and I can't get it to run. I guess you weren't experience anything like the below?
Consul Template returned errors:
open /etc/nginx/conf.d/186949828: permission denied
I don't have heaps of experience with consul-template so I'm not sure what it would be doing. Or even why it would be looking in that directory?
Any ideas?
@matthewvalimaki, I managed to get past that. Running into a few other issues now. But I'm working through them.
Sorry, sounds like I made a broken or :(
A side note: remember that we can use pre-built consul binary as of 0.6. On Mar 17, 2016 6:35 PM, "Scott Mebberson" notifications@github.com wrote:
@matthewvalimaki https://github.com/matthewvalimaki, I managed to get past that. Running into a few other issues now. But I'm working through them.
— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/smebberson/docker-alpine/pull/13#issuecomment-198154647
@matthewvalimaki, take a look at https://github.com/smebberson/docker-alpine/tree/matthewvalimaki-add-consul-template. That is my integration branch. Everything seems to be running smoothly now. There was a number of issues due to permissions not allowing certain things to occur (such as removing the consul nodes watch symlink - you didn't experience anything like that?).
The only thing I'm not happy with is the weird permissions I had to have on the /etc/nginx
directory. I had to set them to 674 (??) so that consul-template could write to the directory. When I run with 664 I get:
Consul Template returned errors:
open /etc/nginx/conf.d/default.conf: permission denied
Despite the fact that consul-template
has been added to the nginx
group? Can you see why that would happen?
If you want to run this locally, you'll have to build (in this order):
alpine-consul-base
alpine-consul-nginx
alpine-consul-nginx-nodejs
examples/user-consul-nginx-nodejs
If you could give me your input on that - that'd be great. I'll wait to merge into master until you get back to me.
Also, before I merge into master, I'll release v1.0.0 of alpine-consul-nginx
, and alpine-consul-nginx-nodejs
and then merge those into my integration branch so that people wanting to keep the old versions can.
@smebberson please merge #12 while you're at working on nodejs :) I'll think about your questions now and see if I can come up with something. Thanks for the cleanup.
@smebberson going back to your last questions:
There was a number of issues due to permissions not allowing certain things to occur (such as removing the consul nodes watch symlink - you didn't experience anything like that?).
I did! And I thought I pushed up those fixes, very similar to what you did. I'm sorry, I should've made a clean clone to test it again. The symlink was one of the very first permissions issues I encountered.
Do you still have similar issues elsewhere?
The only thing I'm not happy with is the weird permissions I had to have on the /etc/nginx directory. I had to set them to 674 (??) so that consul-template could write to the directory.
I did not see that. Probably having Windows as host causes some issues for permissions for me. As the file you mentioned, /etc/nginx/conf.d/default.conf
, is added when building I suspect the ownership or permissions being incorrect.
@matthewvalimaki this is finally done! I've version bumped and Git tagged, and Docker hub tagged alpine-consul-base
, alpine-consul-nginx
and alpine-consul-nginx-nodejs
. All seem to be working well. Thanks for your PR!
@smebberson awesome! Good job! Now I get to provide Nomad and other improvements. These images are by far the best with Hashi stuff :) On Mar 22, 2016 4:41 PM, "Scott Mebberson" notifications@github.com wrote:
@matthewvalimaki https://github.com/matthewvalimaki this is finally done! I've version bumped and Git tagged, and Docker hub tagged alpine-consul-base, alpine-consul-nginx and alpine-consul-nginx-nodejs. All seem to be working well. Thanks for your PR!
— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/smebberson/docker-alpine/pull/13#issuecomment-200081079
Signed-off-by: Matthew Valimaki matthew.valimaki@gmail.com