saltstack-formulas / docker-formula

Install and set up Docker
http://docs.saltstack.com/en/latest/topics/development/conventions/formulas.html

[BUG] segmentation fault from dockerd on fresh Pi 4 #286

Closed auphofBSF closed 3 years ago

auphofBSF commented 3 years ago

Your setup

Raspberry Pi 4 with the latest Raspbian (2021-03-04): https://downloads.raspberrypi.org/raspios_lite_armhf/release_notes.txt

uname -a -> Linux raspberrypi 5.10.17-v7l+ #1403 SMP Mon Feb 22 11:33:35 GMT 2021 armv7l GNU/Linux

Formula commit hash / release tag

commit e44383834a42a9f7fed0910b68efe48b6b45f509 (HEAD -> master, tag: v2.0.3, origin/master, origin/HEAD)

Versions reports (master & minion)

I am using salt-ssh with a roster file, so only the master report is relevant.

Salt Version:
          Salt: 3002.6

Dependency Versions:
          cffi: 1.14.3
      cherrypy: unknown
      dateutil: 2.8.1
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.10.1
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.0
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: 2.17
      pycrypto: Not Installed
  pycryptodome: 3.9.8
        pygit2: Not Installed
        Python: 3.7.10 (default, Apr  2 2021, 22:51:43)
  python-gnupg: 0.4.4
        PyYAML: 5.3.1
         PyZMQ: 18.0.1
         smmap: Not Installed
       timelib: 0.2.4
       Tornado: 4.5.3
           ZMQ: 4.3.1

System Versions:
          dist: alpine 3.13.4
        locale: UTF-8
       machine: x86_64
       release: 4.19.128-microsoft-standard
        system: Linux
       version: Alpine Linux 3.13.4

Pillar / config used

None


Bug details

Describe the bug

On the first run of salt-ssh 'test_pi4' state.apply TEST_pi4_docker_install test=False with logging enabled, I observe:

[TRACE   ] STDERR 10.1.1.115
command-line line 0: Unsupported option "gssapiauthentication"
SALT_ARGV: ['/usr/bin/python3', '/var/tmp/.pi_544905_salt/salt-call', '--retcode-passthrough', '--local', '--metadata', '--out', 'json', '-l', 'quiet', '-c', '/var/tmp/.pi_544905_salt', '--', 'state.pkg', '/var/tmp/.pi_544905_salt/salt_state.tgz', 'test=False', 'pkg_sum=606c3b229b1d30fba526aab88a77c609a02f8b5ad5d57a553ecdd5d2df1e54e6', 'hash_type=sha256']
_edbc7885e4f9aac9b83b35999b68d015148caf467b78fa39c05f669c0ff89878
Connection to 10.1.1.115 closed by remote host.

[DEBUG   ] RETCODE 10.1.1.115: 255
[ERROR   ] JSON Render failed for:
Connection to 10.1.1.115 closed by remote host.
[ERROR   ] Expecting value: line 1 column 1 (char 0)
[DEBUG   ] LazyLoaded nested.output
[TRACE   ] data = {'test_pi4': ''}
test_pi4:
[TRACE   ] IPCClient: Connecting to socket: /var/run/salt/master/master_event_pull.ipc
[DEBUG   ] Sending event: tag = salt/job/20210425191015867100/ret/test_pi4; data = {'return': '', 'id': 'test_pi4', 'fun': 'state.apply', 'jid': '20210425191015867100', '_stamp': '2021-04-25T19:15:55.363490'}
[DEBUG   ] Closing IPCMessageClient instance

Repeating the same salt-ssh command shows two failed states:

----------
          ID: docker-software-service-running-docker
    Function: service.running
        Name: docker
      Result: False
     Comment: Service docker is already enabled, and is dead
     Started: 20:19:33.052122
    Duration: 252.685 ms
     Changes:
----------
          ID: docker-software-service-running-docker-fail-notify
    Function: test.fail_without_changes
      Result: False
     Comment: Formula is trying to start 'docker' service
              but failed, is it a correct name for Docker service in your OS?

              In certain circumstances the docker service will not start.
              Your kernel is missing some modules, or not in ideal state.
              See https://github.com/moby/moby/blob/master/contrib/check-config.sh
              * Rebooting your host is recommended!
     Started: 20:19:33.307348
    Duration: 2.35 ms
     Changes:

docker.service fails to start. Inspecting the failure:

systemctl status docker.service
● docker.service - docker service
   Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
   Active: failed (Result: signal) since Sun 2021-04-25 20:19:33 BST; 1min 42s ago
     Docs: https://docs.docker.com
  Process: 3670 ExecStart=/usr/local/docker-19.03.9/bin//dockerd (code=killed, signal=SEGV)
 Main PID: 3670 (code=killed, signal=SEGV)

Apr 25 20:19:33 raspberrypi systemd[1]: Started docker service.
Apr 25 20:19:33 raspberrypi systemd[1]: docker.service: Main process exited, code=killed, status=11/SEGV
Apr 25 20:19:33 raspberrypi systemd[1]: docker.service: Failed with result 'signal'.

Attempts to run dockerd manually from a bash shell on the Pi 4 also end in a segmentation fault.
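
When dockerd is started by hand, the crash is visible in the shell only through the exit status. The following is a minimal sketch, not part of the formula, of how that status could be decoded; the helper name describe_exit_status is hypothetical, and it relies on the usual shell convention that a command killed by signal N is reported as status 128 + N (so the SEGV, signal 11, seen in the systemd log would appear as 139).

```shell
# Hypothetical helper: interpret a shell exit status.
# Shells report 128 + N when a command was killed by signal N.
describe_exit_status() {
    rc=$1
    if [ "$rc" -gt 128 ]; then
        echo "killed by signal $((rc - 128))"
    else
        echo "exited with status $rc"
    fi
}

# Possible usage on the Pi, as root (binary path taken from the unit file above):
#   /usr/local/docker-19.03.9/bin/dockerd; describe_exit_status $?
```

A result of "killed by signal 11" would match the status=11/SEGV line that systemd logged.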

Steps to reproduce the bug

Roster (sensitive details redacted):

test_pi4:
  host: -.-.-.115
  passwd: -----------
  port: 22
  sudo: true
  ssh_options: ['StrictHostKeyChecking=no']
  user: pi

TEST_pi4_docker_install.sls

# -*- coding: utf-8 -*-
# vim: ft=yaml
---
include:
    - docker
    # - docker.clean

Attempt to install Docker on the rostered device: salt-ssh 'test_pi4' state.apply TEST_pi4_docker_install test=False

Expected behaviour

dockerd should be running

Attempts to fix the bug

A trace of the install attempt shows the archive being installed as follows (excerpt of the log file with -l trace):

docker-software-docker-archive-install:
  pkg.installed:
    - names: ["python3-apt", "python3-pip", "python3-docker", "apt-transport-https", "ca-certificates", "curl", "gnupg-agent", "software-properties-common", "iptables", "git", "procps"]
    - reload_modules: True
    - require_in:
      - file: docker-software-docker-archive-install
  file.directory:
    - name: /usr/local/docker-19.03.9/bin/
    - makedirs: True
    - clean: False
    - require_in:
      - archive: docker-software-docker-archive-install
    - mode: 755
    - user: root
    - group: root
    - recurse:
        - user
        - group
        - mode
  archive.extracted:
    - unless: test -x /usr/local/docker-19.03.9/bin//docker
    - name: /usr/local/docker-19.03.9/bin/
    - options: --strip-components=1
    - source: https://download.docker.com/linux/static/stable/armhf/docker-19.03.9.tgz
    - source_hash: 5e757cf65d99b0326f49cabbfc3b9a65151cb569f04fcb64a7a0c7424772c7cf
    - retry: {"attempts": 3, "interval": 60, "splay": 10, "until": true}
    - enforce_toplevel: false
    - trim_output: true
    - user: root
    - group: root
    - recurse:
        - user
        - group
    - require:
      - file: docker-software-docker-archive-install
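
For reference, the source URL in the state above follows the layout of Docker's static download area: an architecture directory followed by a versioned tarball. The sketch below shows how such a URL could be composed from the output of uname -m. It is an illustration only and does not claim to be how the formula derives the URL; the function names and the machine-to-directory mapping are assumptions.

```shell
# Hypothetical helper: map `uname -m` output to a directory name under
# https://download.docker.com/linux/static/stable/ (mapping is an assumption).
docker_static_arch() {
    case "$1" in
        x86_64)        echo x86_64 ;;
        aarch64|arm64) echo aarch64 ;;
        armv7l)        echo armhf ;;
        *)             echo "unsupported machine: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: build the static tarball URL for a machine and version.
docker_static_url() {
    arch=$(docker_static_arch "$1") || return 1
    echo "https://download.docker.com/linux/static/stable/${arch}/docker-$2.tgz"
}

# Possible usage on the target:
#   docker_static_url "$(uname -m)" 19.03.9
```

For the Pi 4 in this report (uname -m gives armv7l), this produces the same armhf URL that appears in the trace.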

The Docker install instructions for Raspbian specifically say not to use the general install methods:

Raspbian users cannot use this method! For Raspbian, installing using the repository is not yet supported. You must instead use the convenience script. From https://docs.docker.com/engine/install/debian/#install-using-the-convenience-script

A dry run of the script (command and output below) shows the package being installed from the repository deb [arch=armhf] https://download.docker.com/linux/raspbian buster stable:

sudo sh get-docker.sh --dry-run

# Executing docker install script, commit: 7cae5f8b0decc17d6571f9f52eb840fbc13b2737
apt-get update -qq >/dev/null
DEBIAN_FRONTEND=noninteractive apt-get install -y -qq apt-transport-https ca-certificates curl >/dev/null
curl -fsSL "https://download.docker.com/linux/raspbian/gpg" | apt-key add -qq - >/dev/null
echo "deb [arch=armhf] https://download.docker.com/linux/raspbian buster stable" > /etc/apt/sources.list.d/docker.list
apt-get update -qq >/dev/null
apt-get install -y -qq --no-install-recommends docker-ce >/dev/null
DEBIAN_FRONTEND=noninteractive apt-get install -y -qq docker-ce-rootless-extras >/dev/null
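
The repository line written by the dry run has three variable parts: the distribution ID, the dpkg architecture and the release codename. A minimal sketch of how that line could be assembled follows; the helper name docker_apt_source_line is hypothetical, and reading the values from /etc/os-release and dpkg is an assumption about the target system, not something the convenience script is confirmed to do in this way.

```shell
# Hypothetical helper: print the apt source line for a given
# distribution ID ($1), dpkg architecture ($2) and release codename ($3).
docker_apt_source_line() {
    echo "deb [arch=$2] https://download.docker.com/linux/$1 $3 stable"
}

# Possible usage on the target (assumes /etc/os-release defines ID and
# VERSION_CODENAME):
#   . /etc/os-release
#   docker_apt_source_line "$ID" "$(dpkg --print-architecture)" "$VERSION_CODENAME"
```

With raspbian, armhf and buster as inputs, the helper prints the same line that the dry run writes to /etc/apt/sources.list.d/docker.list.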

Additional context

I note that #264 raised the exact same issue but was closed without mentioning that the latest commits leading to v2.0.3 were successful. I have made some changes to docker-formula and have raised a WIP pull request, #287. I am very unfamiliar with Salt and its architecture but am learning fast, and I am sure the key developers will be able to guide this towards a general solution for Raspbian.

noelmcloughlin commented 3 years ago

#287