Azure / iotedge

The IoT Edge OSS project
MIT License

How to run aziot-edge in a docker container? #7258

Closed egil closed 2 months ago

egil commented 3 months ago

I am trying to create a Docker image that I can use to run an IoT Edge device in a Docker container locally on my computer. But my Linux/Docker skills are failing me, so I hope somebody here can help.

My dockerfile looks like this:

# Start from Ubuntu for `apt-get`
FROM ubuntu:22.04

RUN apt-get update -qq && apt-get install -qqy \
    apt-transport-https \
    ca-certificates \
    curl \
    wget \
    gnupg \
    lsb-release \
    jq \
    net-tools \
    iptables \
    iproute2 \
    systemd && \
    rm -rf /var/lib/apt/lists/*

# Step 1: Install Microsoft package repository
RUN wget https://packages.microsoft.com/config/ubuntu/22.04/packages-microsoft-prod.deb -O packages-microsoft-prod.deb && \
    dpkg -i packages-microsoft-prod.deb && \
    rm packages-microsoft-prod.deb

# Step 2: Install Moby engine and Moby CLI
RUN apt-get update && apt-get install -y moby-cli moby-engine

# Step 3: Configure Docker daemon
RUN echo '{ "log-driver": "local" }' > /etc/docker/daemon.json

# Step 4: Install Azure IoT Edge
RUN apt-get install -y aziot-edge

# Clean up to reduce image size
RUN apt-get clean && \
    rm -rf /var/lib/apt/lists/*

COPY edge-start.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/edge-start.sh

VOLUME /var/lib/docker

EXPOSE 2375
EXPOSE 15580
EXPOSE 15581

ENTRYPOINT ["bash", "edge-start.sh"]
CMD []

The edge-start.sh file looks like this:

#!/bin/bash

startEdgeRuntime(){
echo "***Configuring and Starting IoT Edge Runtime***"

cat <<EOF > /etc/aziot/config.toml
[provisioning]
source = "manual"
connection_string = "$connectionString"

[agent]
name = "edgeAgent"
type = "docker"

[agent.config]
image = "mcr.microsoft.com/azureiotedge-agent:1.4"

[connect]
workload_uri = "unix:///var/run/iotedge/workload.sock"
management_uri = "unix:///var/run/iotedge/mgmt.sock"

[listen]
workload_uri = "fd://aziot-edged.workload.socket"
management_uri = "fd://aziot-edged.mgmt.socket"

[moby_runtime]
uri = "unix:///var/run/docker.sock"
network = "azure-iot-edge"
EOF

cat /etc/aziot/config.toml

iotedge config apply -c /etc/aziot/config.toml

}

echo "***Starting Docker in Docker***"

#remove docker.pid if it exists to allow Docker to restart if the container was previously stopped
if [ -f /var/run/docker.pid ]; then
    echo "Stale docker.pid found in /var/run/docker.pid, removing..."
    rm /var/run/docker.pid
fi

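# Start dockerd once (not on every loop iteration), then poll until it responds
dockerd_started=false
while ! docker stats --no-stream; do
  if [ "$dockerd_started" = false ]; then
    dockerd --host=unix:///var/run/docker.sock --host=tcp://0.0.0.0:2375 &
    dockerd_started=true
  fi
  # Docker takes a few seconds to initialize
  echo "Waiting for Docker to launch..."
  sleep 1
done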

if [ -z "$connectionString" ]; then
    echo "No connectionString provided."
else
    startEdgeRuntime
fi
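
As an aside, a wait loop like the one in the script above polls forever if Docker never comes up. A bounded variant might look like the sketch below. The `wait_for` helper and the 30-try budget are assumptions for illustration, not part of Docker or IoT Edge.

```shell
#!/bin/sh
# Sketch: poll a command until it succeeds or a retry budget runs out.
# wait_for is a hypothetical helper, not part of Docker or IoT Edge.
wait_for() {
    attempts="$1"   # maximum number of tries
    shift           # remaining arguments are the command to run
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Possible usage in the script (assumed 30-try budget):
#   wait_for 30 docker stats --no-stream || { echo "Docker did not start" >&2; exit 1; }
```

With a budget in place, the container exits with a clear message instead of hanging when the daemon cannot start.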

To build the image and run it, I am using the following commands:

docker build -t iot-edge-device-docker .
docker run --privileged -d -v /var/run/docker.sock:/var/run/docker.sock -v /sys/fs/cgroup:/sys/fs/cgroup:ro -e connectionString='<REDACTED>' --name iot-edge-device iot-edge-device-docker

The output from running the container in Docker Desktop 4.28.0 on Windows 11 is as follows:

2024-04-05 17:15:26 ***Starting Docker in Docker***
2024-04-05 17:15:28 CONTAINER ID   NAME              CPU %     MEM USAGE / LIMIT    MEM %     NET I/O     BLOCK I/O   PIDS
2024-04-05 17:15:28 efc211c1a52a   iot-edge-device   0.00%     15.39MiB / 31.3GiB   0.05%     486B / 0B   0B / 0B     11
2024-04-05 17:15:28 ***Configuring and Starting IoT Edge Runtime***
2024-04-05 17:15:28 [provisioning]
2024-04-05 17:15:28 source = "manual"
2024-04-05 17:15:28 connection_string = "<REDACTED>"
2024-04-05 17:15:28 
2024-04-05 17:15:28 [agent]
2024-04-05 17:15:28 name = "edgeAgent"
2024-04-05 17:15:28 type = "docker"
2024-04-05 17:15:28 
2024-04-05 17:15:28 [agent.config]
2024-04-05 17:15:28 image = "mcr.microsoft.com/azureiotedge-agent:1.4"
2024-04-05 17:15:28 
2024-04-05 17:15:28 [connect]
2024-04-05 17:15:28 workload_uri = "unix:///var/run/iotedge/workload.sock"
2024-04-05 17:15:28 management_uri = "unix:///var/run/iotedge/mgmt.sock"
2024-04-05 17:15:28 
2024-04-05 17:15:28 [listen]
2024-04-05 17:15:28 workload_uri = "fd://aziot-edged.workload.socket"
2024-04-05 17:15:28 management_uri = "fd://aziot-edged.mgmt.socket"
2024-04-05 17:15:28 
2024-04-05 17:15:28 [moby_runtime]
2024-04-05 17:15:28 uri = "unix:///var/run/docker.sock"
2024-04-05 17:15:28 network = "azure-iot-edge"
2024-04-05 17:15:28 Warning: the previous identity config file is unreadable
2024-04-05 17:15:28 Note: Symmetric key will be written to /var/secrets/aziot/keyd/device-id
2024-04-05 17:15:28 Azure IoT Edge has been configured successfully!
2024-04-05 17:15:28 
2024-04-05 17:15:28 Restarting service for configuration to take effect...
2024-04-05 17:15:28 Stopping aziot-edged.service...System has not been booted with systemd as init system (PID 1). Can't operate.
2024-04-05 17:15:28 Failed to connect to bus: Host is down
2024-04-05 17:15:28 Stopping aziot-identityd.service...System has not been booted with systemd as init system (PID 1). Can't operate.
2024-04-05 17:15:28 Failed to connect to bus: Host is down
2024-04-05 17:15:28 Stopping aziot-keyd.service...System has not been booted with systemd as init system (PID 1). Can't operate.
2024-04-05 17:15:28 Failed to connect to bus: Host is down
2024-04-05 17:15:28 systemctl exited with non-zero status code.
2024-04-05 17:15:28 stdout:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 stderr:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 
2024-04-05 17:15:28 systemctl exited with non-zero status code.
2024-04-05 17:15:28 stdout:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 stderr:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 
2024-04-05 17:15:28 systemctl exited with non-zero status code.
2024-04-05 17:15:28 stdout:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 stderr:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 
2024-04-05 17:15:28 systemctl exited with non-zero status code.
2024-04-05 17:15:28 stdout:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 stderr:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 
2024-04-05 17:15:28 systemctl exited with non-zero status code.
2024-04-05 17:15:28 stdout:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 stderr:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 
2024-04-05 17:15:28 systemctl exited with non-zero status code.
2024-04-05 17:15:28 stdout:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 stderr:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 
2024-04-05 17:15:28 systemctl exited with non-zero status code.
2024-04-05 17:15:28 stdout:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 stderr:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 
2024-04-05 17:15:28 systemctl exited with non-zero status code.
2024-04-05 17:15:28 stdout:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 stderr:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 
2024-04-05 17:15:28 systemctl exited with non-zero status code.
2024-04-05 17:15:28 stdout:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 stderr:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 
2024-04-05 17:15:28 systemctl exited with non-zero status code.
2024-04-05 17:15:28 stdout:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 stderr:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 
2024-04-05 17:15:28 systemctl exited with non-zero status code.
2024-04-05 17:15:28 stdout:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 stderr:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 
2024-04-05 17:15:28 systemctl exited with non-zero status code.
2024-04-05 17:15:28 stdout:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 stderr:
2024-04-05 17:15:28 =======
2024-04-05 17:15:28 
2024-04-05 17:15:28 Stopping aziot-certd.service...System has not been booted with systemd as init system (PID 1). Can't operate.
2024-04-05 17:15:28 Failed to connect to bus: Host is down
2024-04-05 17:15:28 Stopping aziot-tpmd.service...System has not been booted with systemd as init system (PID 1). Can't operate.
2024-04-05 17:15:28 Failed to connect to bus: Host is down
2024-04-05 17:15:28 Starting aziot-edged.mgmt.socket...System has not been booted with systemd as init system (PID 1). Can't operate.
2024-04-05 17:15:28 Failed to connect to bus: Host is down
2024-04-05 17:15:28 Starting aziot-edged.workload.socket...System has not been booted with systemd as init system (PID 1). Can't operate.
2024-04-05 17:15:28 Failed to connect to bus: Host is down
2024-04-05 17:15:28 Starting aziot-identityd.socket...System has not been booted with systemd as init system (PID 1). Can't operate.
2024-04-05 17:15:28 Failed to connect to bus: Host is down
2024-04-05 17:15:28 Starting aziot-keyd.socket...System has not been booted with systemd as init system (PID 1). Can't operate.
2024-04-05 17:15:28 Failed to connect to bus: Host is down
2024-04-05 17:15:28 Starting aziot-certd.socket...System has not been booted with systemd as init system (PID 1). Can't operate.
2024-04-05 17:15:28 Failed to connect to bus: Host is down
2024-04-05 17:15:28 Starting aziot-tpmd.socket...System has not been booted with systemd as init system (PID 1). Can't operate.
2024-04-05 17:15:28 Failed to connect to bus: Host is down
2024-04-05 17:15:28 Starting aziot-edged.service...System has not been booted with systemd as init system (PID 1). Can't operate.
2024-04-05 17:15:28 Failed to connect to bus: Host is down
2024-04-05 17:15:28 Done.

What changes do I need to make to my dockerfile, edge-start.sh, or config.toml to get this working?

vadim-kovalyov commented 3 months ago

Hey @egil, this is not a use case that we can support. IoT Edge runs as a set of host services and needs systemd to run. Most likely you won't be able to achieve this, especially if your host does not have systemd. You can try running the container on Ubuntu or installing systemd within the container itself via the Dockerfile, but I don't know whether that will work. The simplest way is to create a VM image instead.

I quickly searched online and found something similar - https://askubuntu.com/questions/1471990/add-systemd-to-boot-of-a-docker-container
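
For anyone following along, the systemd requirement described above can be checked from inside a container with a sketch like the following. It assumes a Linux container where `/proc/1/comm` is readable. The `init_is_systemd` helper name and its optional file argument are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch: report whether the init process (PID 1) is systemd.
# init_is_systemd is a hypothetical helper name; the optional argument
# exists only so the check can be pointed at another file for testing.
init_is_systemd() {
    comm_file="${1:-/proc/1/comm}"
    # /proc/1/comm holds the command name of PID 1 on Linux
    [ -r "$comm_file" ] && [ "$(cat "$comm_file")" = "systemd" ]
}

if init_is_systemd; then
    echo "PID 1 is systemd: systemctl should be usable"
else
    echo "PID 1 is not systemd: systemctl cannot manage the aziot services"
fi
```

When the container's entrypoint is a shell script, PID 1 is that shell, which matches the "System has not been booted with systemd as init system (PID 1)" errors in the log above.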

vadim-kovalyov commented 3 months ago

Apparently, there was an older issue about this as well, so take a look: https://github.com/Azure/iotedge/issues/7161

egil commented 3 months ago

Hey @vadim-kovalyov, yes, I know it's not directly supported. I've seen other examples that did this for old versions of the edge device software, and it's certainly preferable to having to run a VM in the cloud or locally.

I did manage to get it running, and the edge device is able to connect to my IoT Hub, but only the agent is getting deployed. There are also some warnings when I run iotedge check. Would you mind taking a look and letting me know whether these are deal breakers?

root@cb7a0509e394:/# iotedge check --verbose

Configuration checks (aziot-identity-service)
---------------------------------------------
√ keyd configuration is well-formed - OK
√ certd configuration is well-formed - OK
√ tpmd configuration is well-formed - OK
√ identityd configuration is well-formed - OK
√ daemon configurations up-to-date with config.toml - OK
√ identityd config toml file specifies a valid hostname - OK
√ aziot-identity-service package is up-to-date - OK
√ host time is close to reference time - OK
√ preloaded certificates are valid - OK
√ keyd is running - OK
√ certd is running - OK
√ identityd is running - OK
√ read all preloaded certificates from the Certificates Service - OK
√ read all preloaded key pairs from the Keys Service - OK
√ check all EST server URLs utilize HTTPS - OK
√ ensure all preloaded certificates match preloaded private keys with the same ID - OK

Connectivity checks (aziot-identity-service)
--------------------------------------------
√ host can connect to and perform TLS handshake with iothub AMQP port - OK
√ host can connect to and perform TLS handshake with iothub HTTPS / WebSockets port - OK
√ host can connect to and perform TLS handshake with iothub MQTT port - OK

Configuration checks
--------------------
√ aziot-edged configuration is well-formed - OK
√ configuration up-to-date with config.toml - OK
√ container engine is installed and functional - OK
√ configuration has correct URIs for daemon mgmt endpoint - OK
√ aziot-edge package is up-to-date - OK
√ container time is close to host time - OK
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
    Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
    You can ignore this warning if you are setting DNS server per module in the Edge deployment.
        caused by: Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
                   Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
                   You can ignore this warning if you are setting DNS server per module in the Edge deployment.
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space.
    Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
    You can ignore this warning if you are setting log policy per module in the Edge deployment.
        caused by: Container engine is not configured to rotate module logs which may cause it run out of disk space.
                   Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
                   You can ignore this warning if you are setting log policy per module in the Edge deployment.
‼ production readiness: Edge Agent's storage directory is persisted on the host filesystem - Warning
    The edgeAgent module is not configured to persist its /tmp/edgeAgent directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.
        caused by: The edgeAgent module is not configured to persist its /tmp/edgeAgent directory on the host filesystem.
                   Data might be lost if the module is deleted or updated.
                   Please see https://aka.ms/iotedge-storage-host for best practices.
× production readiness: Edge Hub's storage directory is persisted on the host filesystem - Error
    Could not check current state of edgeHub container
        caused by: Could not check current state of edgeHub container
        caused by: docker returned exit status: 1, stderr = Error: No such object: edgeHub
√ Agent image is valid and can be pulled from upstream - OK
√ proxy settings are consistent in aziot-edged, aziot-identityd, moby daemon and config.toml - OK

Connectivity checks
-------------------
√ container on the default network can connect to upstream AMQP port - OK
√ container on the default network can connect to upstream HTTPS / WebSockets port - OK
√ container on the default network can connect to upstream MQTT port - OK
    skipping because of not required in this configuration
√ container on the IoT Edge module network can connect to upstream AMQP port - OK
√ container on the IoT Edge module network can connect to upstream HTTPS / WebSockets port - OK
√ container on the IoT Edge module network can connect to upstream MQTT port - OK
    skipping because of not required in this configuration
31 check(s) succeeded.
3 check(s) raised warnings.
1 check(s) raised errors.
2 check(s) were skipped due to errors from other checks.

Here is the command used to build and run the docker container:

docker build -t iot-edge-device-docker .
docker run --privileged -it --rm --hostname=egh-edge-device --dns 8.8.8.8 --dns 8.8.4.4 -v /var/run/docker.sock:/var/run/docker.sock -v /sys/fs/cgroup:/sys/fs/cgroup:rw -v C:\temp\iotedge:/iotedge/storage -e connectionString='<INSERT EDGE DEVICE CONNECTION STRING>' --name iot-edge-device iot-edge-device-docker

Here are the files used to set up the docker container:

Dockerfile

# Start from Ubuntu for `apt-get`
FROM ubuntu:22.04

RUN apt-get update -qq && apt-get install -qqy \
    apt-transport-https \
    ca-certificates \
    curl \
    wget \
    gnupg \
    lsb-release \
    jq \
    net-tools \
    iptables \
    iproute2 \
    systemd && \
    rm -rf /var/lib/apt/lists/*

# Cleanup to enable systemd and systemctl commands
RUN (cd /lib/systemd/system/sysinit.target.wants/; for i in *; do [ "$i" = systemd-tmpfiles-setup.service ] || rm -f "$i"; done); \
    rm -f /lib/systemd/system/multi-user.target.wants/*;\
    rm -f /etc/systemd/system/*.wants/*;\
    rm -f /lib/systemd/system/local-fs.target.wants/*; \
    rm -f /lib/systemd/system/sockets.target.wants/*udev*; \
    rm -f /lib/systemd/system/sockets.target.wants/*initctl*; \
    rm -f /lib/systemd/system/basic.target.wants/*;\
    rm -f /lib/systemd/system/anaconda.target.wants/*;

# Step 1: Install Microsoft package repository
RUN wget https://packages.microsoft.com/config/ubuntu/22.04/packages-microsoft-prod.deb -O packages-microsoft-prod.deb && \
    dpkg -i packages-microsoft-prod.deb && \
    rm packages-microsoft-prod.deb

# Step 2: Install Moby engine and Moby CLI
RUN apt-get update && apt-get install -y moby-cli moby-engine

# Step 3: Configure Docker daemon
RUN echo '{ "log-driver": "local" }' > /etc/docker/daemon.json

# Step 4: Install Azure IoT Edge
RUN apt-get install -y aziot-edge

# Clean up to reduce image size
RUN apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Allow services to start
VOLUME [ "/sys/fs/cgroup", "/var/lib/docker" ]

EXPOSE 2375
EXPOSE 15580
EXPOSE 15581

COPY docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
RUN chmod +x /usr/local/bin/docker-entrypoint.sh
COPY iotedge-init.service /etc/systemd/system/iotedge-init.service
COPY edge-init.sh /usr/local/bin/edge-init.sh
RUN chmod +x /usr/local/bin/edge-init.sh
RUN systemctl enable iotedge-init.service

ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]

docker-entrypoint.sh

#!/bin/bash
echo "***Configuring IoT Edge Runtime***"
cat <<EOF > /etc/aziot/config.toml
[provisioning]
source = "manual"
connection_string = "$connectionString"

[agent]
name = "edgeAgent"
type = "docker"

[agent.config]
image = "mcr.microsoft.com/azureiotedge-agent:1.4"
createOptions = { HostConfig = { Binds = ["/iotedge/storage:/iotedge/storage"] } }

[connect]
workload_uri = "unix:///var/run/iotedge/workload.sock"
management_uri = "unix:///var/run/iotedge/mgmt.sock"

[listen]
workload_uri = "fd://aziot-edged.workload.socket"
management_uri = "fd://aziot-edged.mgmt.socket"

[moby_runtime]
uri = "unix:///var/run/docker.sock"
network = "azure-iot-edge"
EOF
mkdir -p /iotedge/storage

echo "***Starting systemd***"
exec /lib/systemd/systemd --log-level=info

iotedge-init.service

[Unit]
Description=Initialize IoT Edge Runtime
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/edge-init.sh
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

edge-init.sh

#!/bin/bash

echo "***Starting Docker in Docker***"

# Start the Docker daemon only if it is not already running
if ! pgrep dockerd; then
    dockerd &
else
    echo "Docker daemon already running."
fi

# Wait for Docker to initialize
while ! docker stats --no-stream; do
  echo "Waiting for Docker to launch..."
  sleep 1
done

iotedge config apply -c /etc/aziot/config.toml

vadim-kovalyov commented 3 months ago

Hey @egil, glad to see you have the service starting in the container. To answer the question about edgeAgent, I would need to take a look at the edgeAgent logs. Double-check that you have a deployment that targets the device: edgeHub will not start if no deployment has been applied to the device.

Another possibility is that edgeAgent can't connect to the IoT Hub. This is unlikely, because the connectivity checks passed when you ran the check command. But if it does turn out to be a connectivity problem, you'd need to look into that.
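
For reference, the state and logs mentioned above could be gathered from inside the container with something like the sketch below. It assumes the `iotedge` CLI that ships with aziot-edge 1.2 or later. The `collect_edge_diagnostics` helper and its optional CLI-name argument are hypothetical and only there for illustration and testing.

```shell
#!/bin/sh
# Sketch: gather the runtime state and logs asked for above.
# collect_edge_diagnostics is a hypothetical helper; the optional argument
# (CLI name, default "iotedge") is only there so it can be swapped in tests.
collect_edge_diagnostics() {
    cli="${1:-iotedge}"
    if ! command -v "$cli" >/dev/null 2>&1; then
        echo "$cli not found on PATH" >&2
        return 1
    fi
    "$cli" list                          # modules the runtime knows about and their status
    "$cli" logs edgeAgent --tail 100     # last 100 lines of the edgeAgent container log
    "$cli" system logs                   # logs of the aziot-* host services
}

# Inside the container (the output path is an assumed example):
#   collect_edge_diagnostics > /tmp/edge-diagnostics.txt 2>&1
```

If `iotedge list` shows only edgeAgent, its log usually says whether a deployment was received and whether image pulls failed.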

muelleth commented 3 months ago

Guys, I have a proof of concept working as of today (maybe kind of a more edgy approach, though). However, it works for testing on my Mac M3 Pro.

Relevant additions in Dockerfile

FROM arm64v8/ubuntu:22.04

RUN apt-get -y update && \
    apt-get -y remove docker docker.io containerd runc && \
    apt-get -y install --no-install-recommends iputils-ping curl net-tools software-properties-common iproute2

RUN add-apt-repository "deb [arch=arm64] https://download.docker.com/linux/ubuntu jammy stable"; exit 0
RUN apt-get -y update --allow-insecure-repositories
RUN apt-get -y install --allow-unauthenticated --no-install-recommends \
    docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

(yes, I know this is a hacky way to install)

Configuration of Azure IoT Hub


Thank you very much for this issue. Your snippets above were super useful!! ❤️
Any feedback welcome!

Cheers, Thomas

egil commented 3 months ago

Awesome @muelleth. I am using Docker for Windows; will this also work for me? I'd prefer not to have privileged mode enabled (I am a Docker newbie).

@vadim-kovalyov thanks. I am also an IoT Hub newbie, so I am probably just missing the "edge deployment" @muelleth described above, which is probably why the edgeClient is not getting pushed to my Docker container.

egil commented 2 months ago

@vadim-kovalyov I tried creating a deployment using @muelleth's deployment file as a template and assigned it to my edge device. It failed to deploy the three images, though. Can you point me to where I can find the logs you need to help me debug?

Appreciate all the help with this!

muelleth commented 2 months ago

@egil remember to put your container registry settings into the JSON :-)

HTH, Thomas
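
A quick way to confirm the registry settings mentioned above made it into the deployment file might be the sketch below. It assumes the file follows the standard IoT Edge deployment manifest layout, where `registryCredentials` sits under the `$edgeAgent` desired properties, and that `jq` is installed. The `list_registries` helper and the `deployment.json` file name are hypothetical.

```shell
#!/bin/sh
# Sketch: list the registries a deployment manifest carries credentials for.
# list_registries is a hypothetical helper; it requires jq.
list_registries() {
    manifest="$1"
    # registryCredentials lives under the $edgeAgent desired properties;
    # "// {}" yields an empty object when the section is missing.
    jq -r '.modulesContent["$edgeAgent"]["properties.desired"].runtime.settings.registryCredentials // {} | keys[]' "$manifest"
}

# Usage (deployment.json is an assumed file name):
#   list_registries deployment.json
```

Empty output suggests the manifest has no registry credentials, in which case pulls from a private registry would be expected to fail.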

egil commented 2 months ago

It did indeed help. Got it running with my dockerfile above. Thanks. Now to develop a few modules 😊

egil commented 2 months ago

I'll close this issue and leave folks with a link to my docker image: