Dependency resolution is part of Docker-Map and has always been recursive. However, the consistency check between containers has been improved over time. The changes from 0.6.6 to 0.7.0 fixed some bugs there that were hard to trace, but mostly affected already-running containers.
I have had issues similar to what you are describing before. They were actually race conditions and did not show up every time. I had the impression this tended to occur more often on dependencies with a particularly long startup time.
A fairly simple solution would be adding some startup delay to the dependency; I can add that as a feature to Docker-Map. Checking whether the container is running and idle would be more complicated, and potentially not even reliable.
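For illustration, here is a minimal sketch of what such a running-state check could look like against a plain docker-py client. The wait_for_running helper, its parameters, and the container name are all hypothetical, not part of Docker-Map:

import time

import docker

def wait_for_running(client, container_name, timeout=30, interval=1):
    # Hypothetical helper: poll the daemon until the container reports
    # State.Running, or give up after `timeout` seconds.
    deadline = time.time() + timeout
    while time.time() < deadline:
        state = client.inspect_container(container_name)['State']
        if state.get('Running'):
            return True
        time.sleep(interval)
    return False

client = docker.Client()  # named docker.APIClient in docker-py >= 2.0
if not wait_for_running(client, 'graylog_map.elasticsearch'):
    raise RuntimeError('Dependency did not start in time.')

Note that "running" still says nothing about "idle": a container can report Running long before the service inside it accepts connections, which is why such a check may not be reliable.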
I definitely need to check again once I resolve the 0.7.0 connection issue. In 0.6.6 I'm not seeing the right DEBUG messages. Specifically, all it's trying to do is:
There's no effort to start elasticsearch or mongo (the dependencies) first, so it doesn't seem to be a race condition. However, I'll close this until I can replicate it on 0.7.0.
Even if you cannot reproduce this in 0.7.0, could you post the contents of the ContainerMap (removing potentially sensitive parts)? I could then check whether something is wrong with the dependency resolution or just with the container status check, and improve tests as necessary.
I was travelling all day so I didn't get a chance to work out all the kinks, but I assume the following would be a minimal example:
from dockermap.map.container import ContainerMap
from dockermap.map.input import ContainerLink
from dockerfabric.apiclient import docker_fabric, container_fabric
maps = ContainerMap('test_map', {
    'host_root': '~',
    'u1': {
        'image': 'ubuntu:latest',
    },
    'u2': {
        'image': 'ubuntu:latest',
        'links': [
            ContainerLink("u1", "u1"),
        ],
    },
    'u3': {
        'image': 'ubuntu:latest',
        'links': [
            ContainerLink("u2", "u2"),
        ],
    },
})
docker_client = docker_fabric()
container_fabric_inst = container_fabric(docker_client=docker_client, container_maps=maps)
# Explicitly start u3 first to test multiple dependencies
container_fabric_inst.startup('u3')
When I run this on 0.3.10/0.6.6, I get the following exception:
docker.errors.APIError: 500 Server Error: Internal Server Error ("{"message":"invalid hostname format: test_map.u2"}")
When I run it on 0.4.0/0.7.0, I get the error:
docker.errors.DockerException: Invalid bind address protocol: http+docker://localunixsocket
I'm offline most of tomorrow too. Obviously, I'll figure out how to clean this up if I don't hear that you've gotten past these hiccups.
Good call in #12. I inherited this code, so I didn't realize docker_client was optional. I'm now getting the same invalid hostname error on both.
I'm moving invalid hostname to a separate issue. Obviously, it's blocking progress on replicating the recursive dependency problem.
Here's the latest test code resolving #12 and #14:
from dockermap.map.container import ContainerMap
from dockermap.map.input import ContainerLink
from dockerfabric.apiclient import docker_fabric, container_fabric
maps = ContainerMap('test', {
    'host_root': '~',
    'u1': {
        'image': 'ubuntu:latest',
    },
    'u2': {
        'image': 'ubuntu:latest',
        'links': [
            ContainerLink("u1", "u1"),
        ],
    },
    'u3': {
        'image': 'ubuntu:latest',
        'links': [
            ContainerLink("u2", "u2"),
        ],
    },
})
container_fabric_inst = container_fabric(container_maps=maps)
# Explicitly start u3 first to test multiple dependencies
container_fabric_inst.startup('u3')
Now I'm able to reproduce the originally reported issue (in a python shell):
>>> container_fabric_inst.startup('u3')
docker: Creating container 'test.u2' from image 'ubuntu:latest'.
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/client.py", line 227, in startup
return self.run_actions('startup', container, instances=instances, map_name=map_name, **kwargs)
File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/client.py", line 113, in run_actions
results.extend(runner.run_actions(*actions))
File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/runner/__init__.py", line 94, in run_actions
res = c_method(action_config, container_name, **action.extra_data)
File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/runner/base.py", line 60, in create_instance
return config.client.create_container(**c_kwargs)
File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockerfabric/apiclient.py", line 159, in create_container
return super(DockerFabricClient, self).create_container(image, name=name, **kwargs)
File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/docker/api/container.py", line 135, in create_container
return self.create_container_from_config(config, name)
File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/docker/api/container.py", line 146, in create_container_from_config
return self._result(res, True)
File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/docker/client.py", line 178, in _result
self._raise_for_status(response)
File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/docker/client.py", line 174, in _raise_for_status
raise errors.APIError(e, response, explanation=explanation)
docker.errors.APIError: 500 Server Error: Internal Server Error ("{"message":"Could not get container for test.u1"}")
... so it's correctly building u2 before u3, but it's not recursively checking the dependencies of u2 (to start u1). I'm working around the issue by providing an explicit start order, so it's not urgent for me.
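For reference, the workaround is just explicit startup calls in dependency order, a sketch using the same container_fabric_inst as above:

# Workaround: start each container only after its link target is up,
# instead of relying on recursive dependency resolution.
container_fabric_inst.startup('u1')
container_fabric_inst.startup('u2')
container_fabric_inst.startup('u3')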
I have found the problem now. It was not that dependencies were not followed recursively; rather, they were processed in the wrong order. At some point (probably a while before Docker-Map 0.6.6) the code between the dependency resolution and the code using its result was changed, introducing this error. As there were only tests for the dependency resolution itself, this went unnoticed for a while, and the tests added in Docker-Map 0.7.0 also ignored the order. This will be fixed in the next release of Docker-Map, and I have also added tests reproducing this setup.
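For illustration only (this is not Docker-Map's actual code): resolving a link graph so that each container comes after everything it depends on is essentially a depth-first topological sort:

# Illustrative sketch, not Docker-Map's implementation: a depth-first walk
# that appends each container only after all of its dependencies.
def dependency_order(links, name, seen=None, order=None):
    seen = set() if seen is None else seen
    order = [] if order is None else order
    if name not in seen:
        seen.add(name)
        for dep in links.get(name, ()):
            dependency_order(links, dep, seen, order)
        order.append(name)
    return order

links = {'u3': ['u2'], 'u2': ['u1'], 'u1': []}
print(dependency_order(links, 'u3'))  # ['u1', 'u2', 'u3']

An order mix-up at this step would produce the symptom above: u2 is created while u1 has never been started.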
Fixed with Docker-Map release 0.7.2.
While debugging #12, I rolled back to docker-fabric==0.3.10 and docker-map==0.6.6. I ran into the following issue:
docker.errors.APIError: 500 Server Error: Internal Server Error ("{"message":"Cannot link to a non running container: /graylog_map.elasticsearch AS /graylog_map.graylog/elasticsearch"}")
In this case, I have a two-layer dependency: nginx => graylog => elasticsearch & mongo. When I didn't have nginx, it worked correctly. When I manually coded the launch order, it worked correctly. I assume, therefore, that 0.3.10 is not recursively checking dependencies in the start logic.
NOTE: I can't replicate the issue in 0.4.0 until #12 is resolved, so it could (at least theoretically) be fixed there.
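For context, a minimal sketch of a ContainerMap with that dependency shape, written in the same style as the test code above; the map name matches the error message, but the images and link aliases are assumptions, not the actual map:

from dockermap.map.container import ContainerMap
from dockermap.map.input import ContainerLink

# Sketch of the reported chain (nginx => graylog => elasticsearch & mongo);
# image names and link aliases are assumed, not taken from the real map.
maps = ContainerMap('graylog_map', {
    'host_root': '~',
    'elasticsearch': {'image': 'elasticsearch:latest'},
    'mongo': {'image': 'mongo:latest'},
    'graylog': {
        'image': 'graylog/server:latest',  # placeholder image
        'links': [
            ContainerLink('elasticsearch', 'elasticsearch'),
            ContainerLink('mongo', 'mongo'),
        ],
    },
    'nginx': {
        'image': 'nginx:latest',
        'links': [
            ContainerLink('graylog', 'graylog'),
        ],
    },
})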