ansible / ansible


docker_swarm_service: bad default values? #53223

Closed jvalrog closed 5 years ago

jvalrog commented 5 years ago

Hi, I'm not a developer so sorry if this bug report is not good enough.

SUMMARY

I'm getting an error about bad values on a configuration option I've never used.

I think it may be related to this other bug?: https://github.com/ansible/ansible/pull/51216

ISSUE TYPE
COMPONENT NAME
ANSIBLE VERSION
ansible 2.7.8
  config file = /home/tigre/files/code/ansible/pi-cluster/ansible.cfg
  configured module search path = [u'/home/tigre/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/dist-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.15rc1 (default, Nov 12 2018, 14:31:15) [GCC 7.3.0]
CONFIGURATION
      - name: Destroy the service
        docker_swarm_service:
          name: elegant_franklin
          state: absent
OS / ENVIRONMENT
EXPECTED RESULTS
ACTUAL RESULTS
fatal: [pimaster.local]: FAILED! => {"changed": false, "msg": "argument update_monitor is of type <type 'long'> and we were unable to convert to int: <type 'long'> cannot be converted to an int"}

Also, if I work around that issue by setting the parameter to a valid value ("update_monitor: 100"), it still can't find the service by its name:

fatal: [pimaster.local]: FAILED! => {"changed": false, "msg": "Error looking for service named elegant_franklin: 'UpdateConfig'"}

While it actually exists:

pi@pimaster:~ $ docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE                  PORTS
qagxsqkkpb6j        elegant_franklin    replicated          1/1                 arm32v6/nginx:alpine   

The docker version running in my raspberry pi 3 is:

Docker version 18.06.3-ce, build d7080c1

ansibot commented 5 years ago

Files identified in the description:

If these files are inaccurate, please update the component name section of the description or use the !component bot command.


ansibot commented 5 years ago

cc @DBendit @WojciechowskiPiotr @akshay196 @danihodovic @dariko @felixfontein @hannseman @jwitko @kassiansun @tbouvet

felixfontein commented 5 years ago

Thanks for reporting this! It should not be related to that PR you mentioned (#51216), since it has not been backported to stable-2.7, so you won't see it in effect until Ansible 2.8 (or when you work with the devel branch somehow). It's good that you mention your raspberry pi: is it the case that you execute ansible-playbook on the Ubuntu 18.04 machine, but the module is executed on the raspberry pi? If yes, please tell us which Python you are using on that machine, because the error is happening in the code running on the destination host.

The distinction between int and long exists only in Python 2.x, and usually only affects sufficiently large numbers. On my machine with Python 2.7, 10000000000000000000 is a long while 1000000000000000000 is an int. That is not surprising: the former is larger than 2^63 and so cannot be stored in a signed 64-bit integer, while the latter can. The default for update_monitor is 5000000000 in Ansible 2.7.x, which is definitely an int on my machine, so this problem doesn't occur for me. It could be that for Python on the Raspberry Pi the cut-off for int is lower, so that 5000000000 happens to be a long (the Pi has a 32-bit CPU if I'm remembering correctly, and that number is slightly above 2^32). In that case, I'm not surprised that this error happens (though it is annoying). It is a general Ansible code error, though, and not specific to this module; this module merely happens to have a default value large enough to trigger the bug.
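As a rough illustration of the cut-off described above (this is not Ansible's code): Python 2 stores an int in a C long, which is assumed here to be 32 bits wide on the Pi's 32-bit OS and 64 bits wide on a typical 64-bit Linux machine. The helper name fits_in_signed is made up for this sketch. It runs on Python 3, which has no separate long type, so it only models the range check.

```python
def fits_in_signed(value, bits):
    """Return True if value fits in a signed two's-complement integer of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


# Default for update_monitor in Ansible 2.7.x, as quoted in the comment above.
DEFAULT_UPDATE_MONITOR = 5000000000

for bits in (32, 64):
    # Assumption: a value that does not fit in the C long becomes a Python 2 `long`.
    kind = "int" if fits_in_signed(DEFAULT_UPDATE_MONITOR, bits) else "long"
    print("%d-bit C long: %d would be a Python 2 %s" % (bits, DEFAULT_UPDATE_MONITOR, kind))
```

Under that assumption, the default fits in 64 bits but not in 32, which matches the error appearing only on the Pi.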

The second error is probably related to this line (https://github.com/ansible/ansible/blob/stable-2.7/lib/ansible/modules/cloud/docker/docker_swarm_service.py#L900): the Spec output probably doesn't contain UpdateConfig, so this line raises a KeyError("UpdateConfig"), which is converted by https://github.com/ansible/ansible/blob/stable-2.7/lib/ansible/modules/cloud/docker/docker_swarm_service.py#L1069-L1072 into the error you're seeing. The docker version on your Raspberry Pi doesn't look that old, so I'm somewhat surprised its output doesn't contain this. @hannseman any idea? Anyway, this won't happen in Ansible 2.8 anymore. I'll create a PR directly for stable-2.7 to use get(..., dict()) instead of [...], which should fix this.
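A minimal sketch of the difference between [...] and get(..., dict()), using a made-up, trimmed-down service spec that lacks UpdateConfig. The function names and the spec contents are hypothetical, and the message format is copied from the error reported above, not from the module source.

```python
# Hypothetical service spec without an "UpdateConfig" entry.
spec = {"Name": "elegant_franklin"}


def read_update_config_strict(spec):
    # Direct indexing raises KeyError when the key is absent.
    return spec["UpdateConfig"]


def read_update_config_tolerant(spec):
    # get() with a default returns an empty dict instead of raising.
    return spec.get("UpdateConfig", dict())


try:
    read_update_config_strict(spec)
except KeyError as exc:
    # str(KeyError("UpdateConfig")) is the quoted key name, which is why the
    # reported message ends in 'UpdateConfig' with no further explanation.
    message = "Error looking for service named %s: %s" % (spec["Name"], exc)
    print(message)

print(read_update_config_tolerant(spec))
```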

felixfontein commented 5 years ago

The PR for the second problem is #53224, and I've created a new issue for the first problem: #53225

jvalrog commented 5 years ago

The Python version on the Pi is: 2.7.13

The python docker module (installed with pip) in the Pi is: docker (3.7.0)

On my PC, the python version is: 2.7.15rc1

felixfontein commented 5 years ago

Thanks for checking this out! I think the problem with the default value is #53225 combined with Python 2.x on a 32bit CPU. Can you check whether your Raspberry Pi is indeed a 32bit version, or at least the OS running on it is 32bit?

hannseman commented 5 years ago

Yes, that KeyError is weird indeed, as you’re running a very recent docker version.

@jvalrog What version of the docker python library are you running? Can be checked with pip freeze | grep docker.

hannseman commented 5 years ago

Sorry I missed your previous comment about the docker python version.

felixfontein commented 5 years ago

It's really strange that it is missing. Maybe it is caused by docker for Raspberry Pi being some down-scaled version which removed some stuff? After all, it's 32bit and ARM, so maybe some stuff is simply disabled. I have no idea though, I'm just guessing :)

jvalrog commented 5 years ago

Yes @felixfontein, it's 32 bit (armv7l).

jvalrog commented 5 years ago

I've tried starting a service and I get more errors.

    - name: Start the service
      docker_swarm_service:
        name: "mysql-server"
        image: "hypriot/rpi-mysql"
        state: present
        env:
          - MYSQL_ROOT_PASSWORD: "password"
        publish:
          - 3306:3306
        update_monitor: 100

And the full error:

TASK [Start the service] **************************************************************************************************************
fatal: [pimaster.local]: FAILED! => {
    "changed": false, 
    "rc": 1
}

MSG:

MODULE FAILURE
See stdout/stderr for the exact error

MODULE_STDOUT:

Traceback (most recent call last):
  File "/home/pi/.ansible/tmp/ansible-tmp-1551643371.5-231290595783156/AnsiballZ_docker_swarm_service.py", line 113, in <module>
    _ansiballz_main()
  File "/home/pi/.ansible/tmp/ansible-tmp-1551643371.5-231290595783156/AnsiballZ_docker_swarm_service.py", line 105, in _ansiballz_main
    invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)
  File "/home/pi/.ansible/tmp/ansible-tmp-1551643371.5-231290595783156/AnsiballZ_docker_swarm_service.py", line 48, in invoke_module
    imp.load_module('__main__', mod, module, MOD_DESC)
  File "/tmp/ansible_docker_swarm_service_payload_k5EaMI/__main__.py", line 1195, in <module>
  File "/tmp/ansible_docker_swarm_service_payload_k5EaMI/__main__.py", line 1189, in main
  File "/tmp/ansible_docker_swarm_service_payload_k5EaMI/__main__.py", line 1064, in run
  File "/tmp/ansible_docker_swarm_service_payload_k5EaMI/__main__.py", line 1057, in test_parameter_versions
AttributeError: 'str' object has no attribute 'keys'

MODULE_STDERR:

Shared connection to pimaster.local closed.

I hope this helps.

felixfontein commented 5 years ago

You can "fix" this (temporarily) by adding publish: [] to the module options.

jvalrog commented 5 years ago

I still get errors:

    - name: Start the service
      docker_swarm_service:
        name: "mysql-server"
        image: "hypriot/rpi-mysql"
        state: present
        env:
          - MYSQL_ROOT_PASSWORD: "mypass"
        publish: []
        update_monitor: 100

Output:

TASK [Start the service] ********************************************************************************************************************************************
fatal: [pimaster.local]: FAILED! => {
    "changed": false, 
    "rc": 1
}

MSG:

MODULE FAILURE
See stdout/stderr for the exact error

MODULE_STDOUT:

Traceback (most recent call last):
  File "/home/pi/.ansible/tmp/ansible-tmp-1551648624.89-12106510167475/AnsiballZ_docker_swarm_service.py", line 113, in <module>
    _ansiballz_main()
  File "/home/pi/.ansible/tmp/ansible-tmp-1551648624.89-12106510167475/AnsiballZ_docker_swarm_service.py", line 105, in _ansiballz_main
    invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)
  File "/home/pi/.ansible/tmp/ansible-tmp-1551648624.89-12106510167475/AnsiballZ_docker_swarm_service.py", line 48, in invoke_module
    imp.load_module('__main__', mod, module, MOD_DESC)
  File "/tmp/ansible_docker_swarm_service_payload_kosDeu/__main__.py", line 1195, in <module>
  File "/tmp/ansible_docker_swarm_service_payload_kosDeu/__main__.py", line 1189, in main
  File "/tmp/ansible_docker_swarm_service_payload_kosDeu/__main__.py", line 1129, in run
  File "/tmp/ansible_docker_swarm_service_payload_kosDeu/__main__.py", line 1029, in create_service
  File "/usr/local/lib/python2.7/dist-packages/docker/utils/decorators.py", line 34, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/docker/api/service.py", line 185, in create_service
    self._post_json(url, data=data, headers=headers), True
  File "/usr/local/lib/python2.7/dist-packages/docker/api/client.py", line 262, in _result
    self._raise_for_status(response)
  File "/usr/local/lib/python2.7/dist-packages/docker/api/client.py", line 258, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/usr/local/lib/python2.7/dist-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 500 Server Error: Internal Server Error ("json: cannot unmarshal object into Go struct field ContainerSpec.Env of type string")

MODULE_STDERR:

Shared connection to pimaster.local closed.

hannseman commented 5 years ago

The problem here is that you define env as a list of dictionaries. In 2.7 it can only be a list of strings. This is resolved in 2.8, but until then you'll have to define env like this:

env:
  - "MYSQL_ROOT_PASSWORD=mypass"
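If an existing variable list is already written as dictionaries, the conversion to the string form above is mechanical. A small sketch follows; env_to_strings is a hypothetical helper for illustration and is not part of Ansible or the docker library.

```python
def env_to_strings(env):
    """Flatten a mix of "KEY=value" strings and {KEY: value} dicts into "KEY=value" strings."""
    result = []
    for item in env:
        if isinstance(item, dict):
            # One dictionary may carry several variables.
            for key, value in item.items():
                result.append("%s=%s" % (key, value))
        else:
            # Already in "KEY=value" form; keep as-is.
            result.append(str(item))
    return result


print(env_to_strings([{"MYSQL_ROOT_PASSWORD": "mypass"}]))
```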

felixfontein commented 5 years ago

I've created a PR for the publish problem: #53262

hannseman commented 5 years ago

Regarding the publish problem: you see that error because you defined it as a list of strings. It expects a list of dictionaries.

publish:
  - published_port: 3306
    target_port: 3306
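To show why the string form failed with "'str' object has no attribute 'keys'", and how a "PUBLISHED:TARGET" string maps onto the dictionary form above, here is a small sketch. parse_publish is a hypothetical helper, not something the module provides.

```python
def parse_publish(entry):
    """Turn a "PUBLISHED:TARGET" string into the dictionary form; pass dicts through."""
    if isinstance(entry, dict):
        return entry
    published, target = str(entry).split(":")
    return {"published_port": int(published), "target_port": int(target)}


entry = "3306:3306"
try:
    # Code that expects a dictionary and calls .keys() fails on a plain string.
    entry.keys()
except AttributeError as exc:
    print(exc)

print(parse_publish(entry))
```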

jvalrog commented 5 years ago

That was my fault, I couldn't find any examples about the publish option.

Now it works; I just have to keep the update_monitor option at a reasonable value.

Thanks!

hannseman commented 5 years ago

@jvalrog good! Yes the documentation for docker_swarm_service in Ansible 2.7 is not really rock-solid but it'll be improved in Ansible 2.8.

felixfontein commented 5 years ago

I'll start working on that int/long bug later today. Maybe we can get a fix for it into the next 2.7.x release as well (it should be released by the end of next week). Let's see...

felixfontein commented 5 years ago

BTW, you can also pass 5000000000 as a string to update_monitor, that should work:

docker_swarm_service:
  update_monitor: "5000000000"

Then your service doesn't get hammered to death by monitoring :)
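For a sense of scale, assuming update_monitor is interpreted in nanoseconds (that unit is not stated anywhere in this thread, so treat it as an assumption), the two values used here differ enormously. monitor_seconds is a hypothetical helper for this sketch.

```python
NANOSECONDS_PER_SECOND = 10 ** 9


def monitor_seconds(update_monitor):
    """Convert an update_monitor value (int or numeric string) to seconds.

    Assumption: the value is expressed in nanoseconds.
    """
    return int(update_monitor) / float(NANOSECONDS_PER_SECOND)


print(monitor_seconds("5000000000"))  # the string form suggested above
print(monitor_seconds(100))           # the workaround value used earlier
```

Under that assumption, the default corresponds to 5 seconds, while 100 corresponds to a tenth of a microsecond.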

felixfontein commented 5 years ago

I think all problems in this issue are fixed in Ansible 2.7.9, which will probably be released in a week. I'm closing this issue; @jvalrog, in case you think we forgot an aspect, please ping me. If you find something new, please create a new issue.

close_me