Closed: jreadey closed this issue 4 years ago
Changes to support config files are checked into jreadey-master. Posix, docker, and Kubernetes should all be supported.
If anyone can try out this branch before I merge into master, that would be appreciated.
Commit looks reasonable to me, I'll try to get some burn in time for this on DCOS.
@jreadey, I didn't expect that this would be exclusive. I.e., environment variables don't work at all as a configuration option now? I had thought it would be either one, or maybe both, with env variables taking precedence.
Oh, I see now... The config.yml must exist in the config directory. It doesn't look like the paths align to find admin/config/config.yml in the docker build at the moment.
@s004pmg - there are 4 levels of config overrides (from lowest precedence to highest):
The problem with environment variables under Docker or Kubernetes is that they need to be explicitly passed in the docker-compose script or k8s yaml config. As the number of config keys increased, this got to be a bit tedious. So I've put most of the config options in config.yml and removed most of them from the yaml files.
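The layering being discussed (file-based settings with environment variables taking precedence) can be sketched roughly like this. This is an illustrative loader, not the actual HSDS code, and the key names in the comments are hypothetical; it parses simple `key: value` lines rather than full YAML to stay self-contained:

```python
import os

# Minimal sketch (not the real HSDS loader): read simple "key: value"
# pairs from a config file, then let environment variables override them.
def load_config(path):
    config = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or ":" not in line:
                continue
            key, _, value = line.partition(":")
            config[key.strip()] = value.strip()
    # An environment variable with the upper-cased key name wins
    # over the value from the file.
    for key in list(config):
        env_value = os.environ.get(key.upper())
        if env_value is not None:
            config[key] = env_value
    return config
```

With this scheme only the keys you actually want to override need to appear in the docker-compose or k8s yaml; everything else rides along in config.yml.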
For Kubernetes the config.yml is passed to the pods in a ConfigMap. I'm not exactly sure what the equivalent would be for DCOS.
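For reference, a ConfigMap of roughly this shape could carry config.yml into the pods. The resource name, container name, and example key below are hypothetical, not taken from the HSDS repo:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: hsds-config        # hypothetical name
data:
  config.yml: |
    # example key only; see the real config.yml for the full set
    log_level: INFO
---
# Pod spec fragment: mount the ConfigMap so config.yml appears under /config
volumes:
  - name: config
    configMap:
      name: hsds-config
containers:
  - name: dn
    volumeMounts:
      - name: config
        mountPath: /config
```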
Could you take a look at the changes in basenode.py? I made some changes in DCOS related code here, but don't have the ability to test it.
I'll check on the docker build now. This is for docker-compose.posix.yml?
I think that environment variables can be more common in DCOS, but either way works there.
I think I was too terse in my last message, here's the stack running a Docker container built off of master:
```
Traceback (most recent call last):
  File "/usr/local/bin/hsds-datanode", line 8, in
```
So that's a new stack I got by swapping in a new build. My point is that if we're going to still allow folks to primarily configure via environment variables, then the docker build should produce a stock config of defaults at that location. We shouldn't force them to mount in a blank config if they prefer to configure via environment variables.
Ok - got it. Try out with this change: https://github.com/HDFGroup/hsds/commit/5b7a0f1fa1cd67d1fa789099f6bde89a183ca1ba. If /config/config.yml is not found, the server will pull from /etc/config/config.yml (part of the docker image)
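That fallback can be sketched as follows. This is a minimal illustration of the lookup order described in the linked commit; the function name and the candidate-list parameter are mine, not HSDS's:

```python
import os

# Sketch: prefer a mounted /config/config.yml, else fall back to the
# stock copy baked into the docker image at /etc/config/config.yml.
def find_config_file(candidates=("/config/config.yml", "/etc/config/config.yml")):
    for path in candidates:
        if os.path.isfile(path):
            return path
    raise FileNotFoundError("no config.yml found in: %s" % (candidates,))
```

This way a container started without a mounted config still comes up with the stock defaults, and users who prefer environment variables aren't forced to mount a blank file.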
That gets the nodes starting, but now on to a new problem, looks like the node["host"] isn't getting set, so the cluster flails and doesn't self-organize.
I.e., this code in the headnode healthcheck fires:
```python
if node["host"] is None:
    fail_count += 1
    log.warn("Node found with missing host information.")
    continue
```
When I dump the node JSON, I get this:

```
{'node_number': 0, 'node_type': 'dn', 'host': None, 'port': None, 'id': None}
```

It's been a while since I've been in there, but I don't remember that being a valid node definition.
We've been testing successfully using commit 532565f7bed8a5c5a966d8419d48d25acebc1363 for a few days now; I have no further concerns.
Are you still getting the strange state with node JSON?
Well, yes, though it gets past it. I get these in the head node for a while at start up:
WARN> Node found with missing host information.
Then it seems to go away after several minutes and the cluster finally turns ready (probably after several nodes turn over and restart). It's probably still worth debugging more because it may be delaying cluster startup, however it does settle in now.
I'll close this issue now. If anyone has questions or bugs with the config file usage, feel free to re-open.
Rather than relying on environment variables, use config file for settings. This would be mounted (for docker) or loaded as a secret (Kubernetes).