thin-edge / thin-edge.io

The open edge framework for lightweight IoT devices
https://thin-edge.io
Apache License 2.0

device profile is not usable after a thin-edge.io update unless tedge-agent is restarted by the user #3088

Open reubenmiller opened 2 weeks ago

reubenmiller commented 2 weeks ago

Describe the bug

The device_profile operation is not registered when upgrading from a previous thin-edge.io version that does not support the device_profile operation.

The main reason is that the tedge-agent only reads the list of workflows on startup, and during an upgrade, the tedge-agent service is not restarted (as restarting the service in the post installation script would cause other problems whilst processing operations).

To Reproduce

  1. Install the last official release (thin-edge.io 1.2.0), i.e. a release from before the device profile feature was added

    wget -O - thin-edge.io/install.sh | sh -s
  2. Install the latest version from the main channel (which now includes the device profile feature)

    wget -O - thin-edge.io/install.sh | sh -s -- --channel main
  3. Check if the device_profile operation has been registered on the local MQTT broker

    tedge mqtt sub te/device/main///cmd/device_profile

    The expected outcome is to see the following retained MQTT message on the local MQTT broker, which indicates that the tedge-agent will react to any device_profile operation sent to it.

    [te/device/main///cmd/device_profile] {}
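
Because `tedge mqtt sub` keeps running until it is interrupted, the check in step 3 can be awkward to script. The following is a minimal sketch of one way to automate it, assuming the coreutils `timeout` command is available; the function names and the 5-second default are illustrative and not part of thin-edge.io.

```shell
#!/bin/sh
# Topic of the retained capability message, as quoted in the steps above.
TOPIC='te/device/main///cmd/device_profile'

# Reads `tedge mqtt sub` output on stdin and succeeds only if the retained
# capability message for device_profile is present.
has_device_profile_capability() {
    grep -qF "[$TOPIC] {}"
}

# Hypothetical wrapper: subscribe for a few seconds (default 5), then
# inspect whatever was received. `timeout` ends the subscription.
check_device_profile() {
    timeout "${1:-5}" tedge mqtt sub "$TOPIC" 2>/dev/null |
        has_device_profile_capability
}
```

For example, `check_device_profile && echo registered || echo "not registered"` prints `not registered` when no retained message arrives within the timeout.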

Expected behavior

When thin-edge.io is upgraded from a version which does not support the device_profile operation to a version that does support it, then the user should not have to manually restart the tedge-agent service before they can use the feature.

Screenshots

Environment (please complete the following information):

| Property | Value |
|---|---|
| OS [incl. version] | Debian GNU/Linux 12 (bookworm) |
| Hardware [incl. revision] | Raspberry Pi 5 Model B Rev 1.0 |
| System-Architecture | Linux rpi5-d83add9f145a 6.1.0-rpi7-rpi-2712 #1 SMP PREEMPT Debian 1:6.1.63-1+rpt1 (2023-11-24) aarch64 GNU/Linux |
| thin-edge.io version | tedge 1.2.1~74+gd1de960 |

Additional context

reubenmiller commented 2 weeks ago

This problem is generally limited to users who update thin-edge.io using the install.sh script, or who update the packages manually (e.g. apt-get install -y tedge-full).
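
Until this is fixed, an interim workaround for those update paths is to restart the agent right after a successful upgrade. The sketch below is only that, a workaround and not the proposed fix; `upgrade_and_restart` and the `SYSTEMCTL` override are hypothetical names, and restarting while an operation is being processed can cause the problems mentioned in the description.

```shell
#!/bin/sh
# Run the given upgrade command and restart tedge-agent only if it succeeded.
# SYSTEMCTL is a hypothetical override (handy for dry runs); it defaults to
# the real systemctl.
upgrade_and_restart() {
    "$@" && ${SYSTEMCTL:-systemctl} restart tedge-agent
}
```

Usage could look like `upgrade_and_restart apt-get install -y tedge-full`, or `upgrade_and_restart sh -c 'wget -O - thin-edge.io/install.sh | sh -s -- --channel main'` for the install script.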

However, it might be a good time to explore reloading workflows at runtime. This still needs careful consideration, as reloading a workflow while it is being processed might be harmful or unpredictable.

reubenmiller commented 1 week ago

After building a custom rugpi image using tedge-rugpi-image and using the thin-edge.io main channel, there is definitely an initialization problem when it comes to the creation of the device_profile.toml workflow on first boot.

If the tedge-agent is started and device_profile.toml does not exist, then tedge-agent creates it but does not load the workflow into memory. As a result, the tedge-agent does not know about the "new" device_profile workflow until the process is restarted.

The error can be easily reproduced using the following procedure:

  1. Stop the tedge-agent service

    systemctl stop tedge-agent
  2. Delete the device_profile workflow

    rm -f /etc/tedge/operations/device_profile.toml
  3. Start the tedge-agent service

    systemctl start tedge-agent
  4. Check if the device_profile workflow and supported operation have been created

    Verify the creation of the device_profile.toml:

    $ ls -l /etc/tedge/operations/device_profile.toml
    -rw-r--r-- 1 tedge tedge 812 Sep  4 03:55 /etc/tedge/operations/device_profile.toml

    In addition, the device_profile supported operation is expected to be created as well, which can be checked with:

    tedge mqtt sub te/device/main///cmd/device_profile

    Restarting the tedge-agent again results in the correct device_profile supported operation MQTT message being published (see below for the expected MQTT message).

    $ tedge mqtt sub te/device/main///cmd/device_profile
    INFO: Connected
    [te/device/main///cmd/device_profile] {}
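
The procedure above can be sketched as a script, which may help when checking a candidate fix. The workflow path and topic are taken from the steps; the function names, the `WORKFLOW` override, and the 5-second waits are assumptions made for illustration, and `timeout` is assumed to come from coreutils.

```shell
#!/bin/sh
# Path and topic as quoted in the steps above.
WORKFLOW="${WORKFLOW:-/etc/tedge/operations/device_profile.toml}"
TOPIC='te/device/main///cmd/device_profile'

# Classify the outcome from two facts: does the workflow file ($1) exist,
# and does the subscription output on stdin contain the capability message?
classify_state() {
    if [ ! -f "$1" ]; then
        echo "workflow-missing"
    elif grep -qF "[$TOPIC] {}"; then
        echo "ok"
    else
        # The file was created, but the operation was not registered.
        echo "bug-reproduced"
    fi
}

# Steps 1 to 4 in sequence (needs root and a running thin-edge.io install).
reproduce() {
    systemctl stop tedge-agent
    rm -f "$WORKFLOW"
    systemctl start tedge-agent
    sleep 5   # assumed grace period for the agent to start up
    timeout 5 tedge mqtt sub "$TOPIC" 2>/dev/null | classify_state "$WORKFLOW"
}
```

Calling `reproduce` on an affected version should print `bug-reproduced`; running `systemctl restart tedge-agent` and repeating only the final subscription step should then print `ok`.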