Closed deitch closed 7 years ago
We've added the service name to what we write to Consul via ContainerPilot here, but the management code also has to check the health. So I think we're going to need to inject this environment variable into that as well. Admittedly this is because of some close binding between the config and the code, so it'd be nice to clean this up anyways. It looks like this is predominantly a concern with the constant PRIMARY_KEY
:
libconsul.py
we use PRIMARY_KEY
where we try to determine if there's a healthy primary.libconsul.py
we use PRIMARY_KEY
to set a lock to mark which node is primary.manage.py
we use PRIMARY_KEY
to override that same lock. (Also here in that same function.)That variable is set via the environment's PRIMARY_KEY
variable in utils.py
. We can probably just swap that out for the SERVICE_NAME
right there, but I'd like to see that verified in the tests.
I think you are saying that, in addition to the service registration in the catalog, which this PR lets you change via SERVICE_NAME
, we also store which node of the (possibly multiple) nodes for the service is the primary using a simple key-value.
As long as there was just one service ("mysql"
) then we could have one key ("mysql-primary"
). But if we can have multiple services, determined by the SERVICE_NAME
env var, then we need to have multiple matching indications of primary, or else they all will try to use "mysql-primary"
and trounce each other.
Is that it?
If so, why KV instead of adding a "primary"
tag to the service when registering it in consul? If I recall correctly, the consul docs and examples for databases use tags that way explicitly. Mind you, that doesn't make it correct, but worth considering.
Is that it?
Yes!
If so, why KV instead of adding a "primary" tag to the service when registering it in consul?
Because we need an exclusive lock, and the tags don't allow for that as far as I understand.
Because we need an exclusive lock, and the tags don't allow for that as far as I understand. Yes, tags are not unique. No guarantees.
OK, so let's add this in as well.
Did a slight variant. Users may still want to override PRIMARY_KEY
, so:
PRIMARY_KEY = env('PRIMARY_KEY', env('SERVICE_NAME','mysql')+'-primary')
Or if you prefer from the README:
- `PRIMARY_KEY`: The key used to record a lock on what node is primary. (Defaults to `${SERVICE_NAME}-primary`.)
Added. The unit tests do fail, but they appear to be failures unrelated to this change.
The unit tests do fail, but they appear to be failures unrelated to this change.
If you rebase on master I'm pretty sure you'll be good-to-go with the unit tests. It would be a good idea to add a unit test that covers this code path to make sure we're not missing anything.
If you rebase on master I'm pretty sure you'll be good-to-go with the unit tests.
OK, trying now.
good idea to add a unit test that covers this code path
Indeed. I will look at it.
If you rebase on master I'm pretty sure you'll be good-to-go with the unit tests.
One error gone, one still there:
======================================================================
FAIL: test_update (__main__.TestContainerPilotConfig)
----------------------------------------------------------------------
Traceback (most recent call last):
File "test.py", line 870, in test_update
self.assertEqual(updated.read(), '')
AssertionError: '{"preStart": "python /usr/local/bin/manage.py", "tasks": [{"frequency": "5m", "command": "python /usr/local/bin/manage.py snapshot_task", "name": "snapshot_check", "timeout": "10m"}], "logging": {"level": "{{ if .LOG_LEVEL }}{{ .LOG_LEVEL }}{{ else }}INFO{{ end }}"}, "coprocesses": [{"restarts": "unlimited", "command": ["/usr/local/bin/consul", "agent", "-data-dir=/data", "-config-dir=/config", "-rejoin", "-retry-join", "my.consul.example.com", "-retry-max", "10", "-retry-interval", "10s"], "name": "consul-agent (host:{{ .CONSUL }})"}], "consul": "localhost:8500", "backends": [{"onChange": "python /usr/local/bin/manage.py on_change", "poll": 10, "name": "{{ if .SERVICE_NAME }}{{ .SERVICE_NAME }}{{ else }}mysql{{ end }}-primary"}], "services": [{"ttl": 25, "poll": 5, "health": "python /usr/local/bin/manage.py health", "name": "mysql", "port": 3306}]}' != ''
----------------------------------------------------------------------
I looked at the test.py file, I wasn't sure what it was doing:
def test_update(self):
self.environ['CONSUL_AGENT'] = '1'
cp = ContainerPilot()
cp.state = REPLICA
cp.load(envs=self.environ)
temp_file = tempfile.NamedTemporaryFile()
cp.path = temp_file.name
# no update expected
cp.update()
with open(temp_file.name, 'r') as updated:
self.assertEqual(updated.read(), '')
What exactly is it doing here? It is expecting the config file to be equal to '' ?
What exactly is it doing here? It is expecting the config file to be equal to '' ?
Right, because it's not expecting the file to be written-to if there was no update (as alluded to in the # no update expected
comment above. In the next step of the test we # force an update
and expect the file to be now populated.
I think I am beginning to see it. It looks like bin/manager/containerpilot.py
is a "poor man's processor". It does some template interpolation - just {{ .CONSUL_AGENT }}
- but not the rest. Since it compares self.config['services'][0]['name'] != self.state
to see if it should update, and since we set self.state
to REPLICA
or PRIMARY
(which bin/manager/utils.py
sets to mysql
or mysql-primary
), it is comparing them not to the correctly interpolated mysql
and mysql-primary
but instead to {{ if .SERVICE_NAME }}{{ .SERVICE_NAME }}{{ else }}mysql{{ end }}
, which obviously does not match.
Is the only path then to add more variable interpolation into containerpilot.py
? Is there a native library that does complete interpolation (short of rewriting in go)?
I think I am beginning to see it. It looks like bin/manager/containerpilot.py is a "poor man's processor". It does some template interpolation - just {{ .CONSUL_AGENT }} - but not the rest.
Right. This is intentional because we don't want to interfere with ContainerPilot's own rendering of the template, so we leave things like LOG_LEVEL
alone.
Is the only path then to add more variable interpolation into containerpilot.py? Is there a native library that does complete interpolation (short of rewriting in go)?
There are plenty of template libraries but none of them are going to figure out what the values of the variables are, which as you can see is the biggest part of that code section. Adding the dependency on jinja or whatever doesn't get us anything.
In that case, will add .SERVICE_NAME
interpolation to containerpilot.py
.
I actually started looking into this code base over the weekend. Why not do a PR against ContainerPilot to have a “containerpilot-template” render binary similar to “consul-template” without the daemon?
That way, we know the templates are being rendered exactly like the Containerpilot code expects.
Best, Patrick
On Jan 23, 2017, at 9:21 AM, Avi Deitcher notifications@github.com wrote:
In that case, will add .SERVICE_NAME interpolation to containerpilot.py.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/autopilotpattern/mysql/pull/77#issuecomment-274499789, or mute the thread https://github.com/notifications/unsubscribe-auth/AA_rVcS2Q6VmE7vK4-JG8kH2vatWoh-Aks5rVLdlgaJpZM4LaYtr.
I actually started looking into this code base over the weekend. Why not do a PR against ContainerPilot to have a “containerpilot-template” render binary similar to “consul-template” without the daemon?
I like that idea, but I think I would keep the same binary with a --template
mode, so you don't need to pull out the code, etc. In any case, every place you install a theoretical "containerpilot-template" binary you would install the "containerpilot" binary, so might as well.
This is really strange. I added the change to my own branch and tried to push out (so it would update this PR).
Got an "out of sync" message:
$ g push origin variable-service-name
To github.com:deitch/autopilotpattern-mysql.git
! [rejected] variable-service-name -> variable-service-name (non-fast-forward)
error: failed to push some refs to 'git@github.com:deitch/autopilotpattern-mysql.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
That is even stranger, as the changes and the git log are identical... except for the sha hashes!
$ g remote -v
origin git@github.com:deitch/autopilotpattern-mysql.git (fetch)
origin git@github.com:deitch/autopilotpattern-mysql.git (push)
upstream git@github.com:autopilotpattern/mysql.git (fetch)
upstream git@github.com:autopilotpattern/mysql.git (push)
$ g log
commit 4475b2871d9cb938f27d15d81893eaf375263e7f
Author: Avi Deitcher <avi@deitcher.net>
Date: Tue Jan 24 12:06:12 2017 +0200
Add templating support for SERVICE_NAME
commit 3e0d972759e3f729dbb512eff2e8be3f7f6ea343
Author: Avi Deitcher <avi@deitcher.net>
Date: Thu Jan 19 14:52:19 2017 +0200
Make backend-name variable as well
commit 18ce1a157b642612a52da532f42456a85bb080fc
Author: Avi Deitcher <avi@deitcher.net>
Date: Thu Jan 5 19:45:07 2017 +0200
Update PRIMARY_KEY to default to SERVICE_NAME-primary
(truncated)
And in my branch:
Look at the commit hashes, they are different.
You might need to rebase onto the changes from upstream.
That's what I thought at first, but it didn't make sense. Rebasing changes which commits you branch off from. But here I have the same commits locally and on GitHub at origin
, with different hashes. How is that even possible?
But here I have the same commits locally and on GitHub at origin, with different hashes. How is that even possible?
Does seem weird. Particularly given that it's your fork, right? Do you have more than one remote configured maybe?
I may just need to copy the change from the last commit, wipe out and re-clone my local repo.
And so I did. OK, unit tests now pass. @tgross ?
LGTM!
Hey @tgross : about @sodre's comment:
I actually started looking into this code base over the weekend. Why not do a PR against ContainerPilot to have a “containerpilot-template” render binary similar to “consul-template” without the daemon?
That way, we know the templates are being rendered exactly like the Containerpilot code expects.
I was thinking more giving the containerpilot binary an option to just convert templates. Since it already has to exist in the image, why not be able to do:
containerpilot --template --in /etc/containerpilot.json --out /some/out/path
(probably with a default --out
to stdout and --in
from /etc/containerpilot.json
)
I don't mind putting in a PR for it... but not troubling myself unless there is desire for it.
Hi @deitch, I started the PR yesterday here joyent/containerpilot#271
Per the discussion in https://github.com/joyent/containerpilot/issues/260 and @tgross request at https://github.com/joyent/containerpilot/issues/260#issuecomment-270137432
Still defaults to a service name of
"mysql"
, but can be overridden by the env varSERVICE_NAME
.This allows one to use the exact same image in the same environment for different services, e.g. having three mysql instances for
authdb
, 2 more forordersdb
, etc.