merll / docker-fabric

Integration of Docker deployments into Fabric.
MIT License

CLI Client Error Parsing #8

Closed ambsw-technology closed 7 years ago

ambsw-technology commented 7 years ago

+1 on the experimental CLI client

I'm testing it and ran into the following issue.

Traceback (most recent call last):
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/fabric/main.py", line 756, in main
    *args, **kwargs
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/fabric/tasks.py", line 426, in execute
    results['<local-only>'] = task.run(*args, **new_kwargs)
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/fabric/tasks.py", line 173, in run
    return self.wrapped(*args, **kwargs)
  File "/mnt/data/apex-server/build_deploy/src/fabfile.py", line 897, in setup_service_graylog
    startup_required_containers(db_client)
  File "/mnt/data/apex-server/build_deploy/src/fabfile.py", line 1908, in startup_required_containers
    container_fabric_inst.startup(container)
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/client.py", line 227, in startup
    return self.run_actions('startup', container, instances=instances, map_name=map_name, **kwargs)
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/client.py", line 110, in run_actions
    for states in state_generator.get_states(map_name or self._default_map, config_name, instances=instances):
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/state/base.py", line 221, in get_dependency_states
    client_names=client_names, is_dependency=True):
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/state/base.py", line 118, in generate_config_states
    instance_states = [i_state for i_state in _get_state(config_flags, instances)]
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/state/base.py", line 112, in _get_state
    client, item, c_flags)
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/state/base.py", line 61, in get_container_state
    if container_name in self._policy.container_names[client_name]:
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/policy/cache.py", line 124, in __getitem__
    return self.refresh(item)
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/policy/cache.py", line 137, in refresh
    val = self.item_class(client)
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/policy/cache.py", line 15, in __init__
    self.refresh()
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/map/policy/cache.py", line 93, in refresh
    current_containers = self._client.containers(all=True)
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockerfabric/cli.py", line 77, in containers
    return parse_containers_output(res)
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/client/cli.py", line 166, in parse_containers_output
    _container_info(line) for line in out.splitlines() or ()
  File "/mnt/data/.virtualenvs/build_deploy/local/lib/python2.7/site-packages/dockermap/client/cli.py", line 72, in _container_info
    'Image': items[1],
IndexError: list index out of range

The problem is that (the repr of) the "resource" being returned is the following:

'invalid value "http://127.0.0.1:22026" for flag -H: Invalid bind address format: http://127.0.0.1:22026\r\nSee \'docker --help\'.'

Obviously, I'm going to do some digging to see if I can resolve, but I wanted to get the report out there ASAP in case you could shortcut the search.

ambsw-technology commented 7 years ago

I debugged (repr again) the command being run:

docker -H http://127.0.0.1:22026 ps --no-trunc --format="{{.ID}}||{{.Image}}||{{.CreatedAt}}||{{.Status}}||{{.Names}}||{{.Command}}||{{.Ports}}" --all=true

When I run this manually/locally on the target, I get two issues:

  1. The -H flag does not accept http://
  2. The address provided in -H doesn't actually work for me
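The IndexError in the traceback comes from indexing into a line that contains no `||` separators at all, because the "output" is really docker's error message. A minimal sketch of a stricter parser, assuming the seven-field `--format` string shown above, could surface the raw text instead. The names `FIELDS`, `CliOutputError`, `parse_container_line` and `parse_containers` are hypothetical and are not docker-map's actual functions.

```python
# Field order follows the --format string in the command above.
FIELDS = ("Id", "Image", "CreatedAt", "Status", "Names", "Command", "Ports")
SEPARATOR = "||"


class CliOutputError(ValueError):
    """Raised when a line of CLI output does not look like a container row."""


def parse_container_line(line):
    # Split one output line and refuse anything without exactly seven fields.
    items = line.split(SEPARATOR)
    if len(items) != len(FIELDS):
        raise CliOutputError("Unexpected docker output: {0!r}".format(line))
    return dict(zip(FIELDS, items))


def parse_containers(out):
    # Parse every non-empty line; an error message fails on its first line.
    return [parse_container_line(line)
            for line in out.splitlines() if line.strip()]
```

With this check the failure would report the `invalid value ... for flag -H` text directly rather than a bare IndexError. One limitation of the sketch: a container command that itself contains `||` would also be rejected, so a real parser would need a safer separator or a bounded split.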

ambsw-technology commented 7 years ago

This call is clearly using the CLI (based on the stack trace). However, I discovered that an earlier call was statically linked to the API client. I'll update you once I resolve the first call to figure out if the issues are interrelated.

ambsw-technology commented 7 years ago

This problem persists after fixing the statically linked API client so it's not a config-poisoning issue.

ambsw-technology commented 7 years ago

I'm not sure when, but the error has changed slightly because the command is now:

docker -H http+docker://localunixsocket ps --no-trunc --format="{{.ID}}||{{.Image}}||{{.CreatedAt}}||{{.Status}}||{{.Names}}||{{.Command}}||{{.Ports}}" --all=true

Naturally, the -H flag is still wrong (and unnecessary) but this looks "better" since it's not answering like it's trying to write to a socat port.
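For illustration, here is a rough sketch of how a docker-py style `base_url` might be translated into something the `docker -H` flag accepts (`tcp://`, `unix://` or `fd://`). The helper name `cli_host_arg` is hypothetical, and treating `http+docker://...` as "omit the flag" is an assumption based on the observation above that the flag is unnecessary here. This is not how docker-fabric resolves it.

```python
def cli_host_arg(base_url):
    """Return a value usable with `docker -H`, or None to omit the flag."""
    if not base_url:
        # Nothing configured: let the CLI use its default socket.
        return None
    if base_url.startswith("http+docker://"):
        # Placeholder URL for a Unix socket connection; the socket path
        # cannot be recovered from it, so fall back to the CLI default.
        return None
    for prefix in ("http://", "https://"):
        if base_url.startswith(prefix):
            # The CLI expects tcp:// for network hosts. For https the TLS
            # options would have to be passed separately (not handled here).
            return "tcp://" + base_url[len(prefix):]
    # tcp://, unix:// and fd:// values are passed through unchanged.
    return base_url
```

Under these assumptions, `http://127.0.0.1:22026` becomes `tcp://127.0.0.1:22026`, and `http+docker://localunixsocket` results in no `-H` argument at all.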

merll commented 7 years ago

It seems that there need to be some modifications to the -H argument. However, it is not entirely clear to me how the URL ends up this way. Could you list which env variables starting with docker_ are set and to what values, and which keyword arguments you pass in to container_fabric() or container_cli(), if any?

ambsw-technology commented 7 years ago

I was in the middle of documenting the sequence of events when you replied so I went ahead and finished. I've "flattened" a bunch of function calls, but roughly:

env.docker_fabric_implementation = CLIENT_CLI

docker_client = api.docker_client()
# Debug logs show CLI Client is created correctly.  
# in get_connection():
#  env.get('host_string') = '<ip of target>'
#  kwargs.get('base_url') = None
#  env.get('docker_base_url') = None
for container in env.docker_maps.containers:
    docker_client.pull(env.docker_maps.containers[container].image)
    # These all run correctly and the images are pulled
container_fabric_inst = api.container_fabric(docker_client=docker_client, container_maps=[])
for container in env.docker_maps.containers:
    # I don't hit any CLI constructor log messages at this point
    container_fabric_inst.startup(container)
    # Somewhere in here the CLI Client is recreated with the wrong parameters
    # in get_connection():
    #  env.get('host_string') = '<ip of target>'
    #  kwargs.get('base_url') = 'http+docker://localunixsocket'
    #  env.get('docker_base_url') = None

ambsw-technology commented 7 years ago

corrected inline above, this happens in "startup"

edit: it happens in both but the error in upgrade() happens when it calls docker -H http+docker://localunixsocket pull --insecure-registry=false tianon/true:latest

ambsw-technology commented 7 years ago

... and my shell has no docker_ env variables

ambsw-technology commented 7 years ago

I don't know if it helps narrow down, but these are (slightly sanitized and annotated) messages around the bad CLI construction:

DEBUG:dockermap.map.client:Passing kwargs to client actions: {}
DEBUG:dockermap.map.state.base:Following dependency path for graylog_map.nginx.
DEBUG:dockermap.map.state.base:Dependency path at graylog_map.graylog, instances [None].
<< in get_connection
'172.24.2.11' << env.get('host_string')
'http+docker://localunixsocket'  << kwargs.get('base_url')
None << env.get('docker_base_url')
<< CLI init here
DEBUG:docker.auth.auth:Trying paths: ['/home/<domain>/<user>/.docker/config.json', '/home/<domain>/<user>/.dockercfg']
DEBUG:docker.auth.auth:Found file at path: /home/<domain>/<user>/.docker/config.json
DEBUG:docker.auth.auth:Found 'auths' section

ambsw-technology commented 7 years ago

OK. Looks like the offender is in dockermap.map.state.base line 116. This client_config is generated as part of the clients on line 105 or 107. These are generated by the _policy.

I see a policy created in dockermap.map.client on line 102, but am still trying to chase it through the logic to confirm. Perhaps you know definitively either way (and can shortcut my search).

ambsw-technology commented 7 years ago

Removing the docker.Client check in #10 (and not adding it to the CLI class) has resolved the -H issue. The next issue I ran into was the CREATED_AT_PATTERN regex. Specifically, my time component was:

2016-12-06 18:09:58 -0500 EST

The regex pattern doesn't like the - timezone and can be fixed as:

CREATED_AT_PATTERN = re.compile(r'(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2}) .\d{4} \w+')

I tried (\+|-) but it creates an additional, invalid group. The . will match anything but this isn't too risky since the rest of the pattern has to work.
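As a possible alternative to the `.` wildcard, a character class or a non-capturing group would match only `+` or `-` without adding a seventh group. This is a sketch against the timestamp quoted above, not the fix that was eventually applied; `CREATED_AT_PATTERN_ALT` is a name introduced here for comparison.

```python
import re

# [+-] matches exactly one sign character and creates no capture group.
CREATED_AT_PATTERN = re.compile(
    r'(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2}) [+-]\d{4} \w+'
)

# Equivalent form using a non-capturing group, (?:...), instead of (\+|-).
CREATED_AT_PATTERN_ALT = re.compile(
    r'(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2}) (?:\+|-)\d{4} \w+'
)

match = CREATED_AT_PATTERN.match('2016-12-06 18:09:58 -0500 EST')
# Still six groups: year, month, day, hour, minute, second.
date_parts = match.groups()
```

Either form keeps the group count at six, so code that unpacks the match groups positionally would not need to change.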

merll commented 7 years ago

I have created a separate issue for the Regex, while I am still tracking down the client creation.

ambsw-technology commented 7 years ago

The CLI client creation issue was caused by my original fix for the docker-map check. I added an inheritance relationship with docker.Client to get past it. When I do that, it's actually docker.Client that injects http+docker://... as the base_url, which created the -H issue. This issue went away when I eliminated the inheritance from docker.Client (possible thanks to the different strategy for the docker-map check).