Closed bjoernsauer closed 1 year ago
Thanks @bjoernsauer for the details. We will see if we can reproduce it, otherwise we will reach out for some more details.
@bjoernsauer After the reboot, could you check if the mosquitto.service
started properly?
If you are using systemctl, you can check this via:
sudo systemctl status mosquitto.service
And it should return something like this
$ sudo systemctl status mosquitto.service
● mosquitto.service - Mosquitto MQTT Broker
Loaded: loaded (/lib/systemd/system/mosquitto.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2022-11-13 23:17:28 CET; 2 days ago
Docs: man:mosquitto.conf(5)
man:mosquitto(8)
Process: 2003 ExecReload=/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
Main PID: 546 (mosquitto)
Tasks: 1 (limit: 4915)
CPU: 1min 37.976s
CGroup: /system.slice/mosquitto.service
└─546 /usr/sbin/mosquitto -c /etc/mosquitto/mosquitto.conf
Hi @reubenmiller, I'm using a sysvinit system via a self defined system.toml file. I've checked mosquitto and yes it's running.
@bjoernsauer Could you please post your system.toml
file so that we can try to reproduce it?
The restart command should invoke init 6
. Could you also try to send this manually (using sudo or root)?
Hi @reubenmiller The restart is not the problem manually calling init 6
performs well. The device does the reboot. After the reboot I see the same issue in the tedge_mapper c8y
output log:
2022-11-21T08:41:21.30833385Z INFO flockfile::unix: Lockfile created "/run/lock/tedge-mapper-c8y.lock"
2022-11-21T08:41:21.41281743Z INFO init: c8y_api::http_proxy: Initialisation
2022-11-21T08:41:31.41429271Z ERROR init: c8y_api::http_proxy: An error occurred while retrieving internal Id, operation will retry in 60 seconds and mapper will reinitialise.
Error: RequestTimeout
system.toml
[init]
name = "sysv"
is_available = ["/etc/tedge/system_helper.sh", "available"]
restart = ["/etc/tedge/system_helper.sh", "initd", "{}", "restart"]
stop = ["/etc/tedge/system_helper.sh", "initd", "{}", "stop"]
enable = ["/etc/tedge/system_helper.sh", "updaterc", "{}", "enable"]
disable = ["/etc/tedge/system_helper.sh", "updaterc", "{}", "disable"]
is_active = ["/etc/tedge/system_helper.sh", "initd", "{}", "status"]
script file system_helper.sh
#!/bin/sh
mode=$1
service=$2
arg=$3
systemctl_available () {
command -v systemctl >/dev/null 2>&1;
}
do_check_sysv_is_available () {
if systemctl_available; then
echo "systemd init detected."
exit 1
fi
if [ -L /sbin/init ]; then
if [ $(readlink /sbin/init) = "/sbin/init.sysvinit" ]; then
echo "SysV init detected."
exit 0;
elif [ $(readlink /sbin/init) = "/lib/systemd/systemd" ]; then
echo "systemd init detected."
exit 1
else
echo "No SysV init detected."
exit 2;
fi
# if init is not a symlink we have sysv
elif [ -f /sbin/init ]; then
echo "SysV init detected."
exit 0;
else
echo "No SysV init detected."
exit 3;
fi
}
do_initd_script_call () {
/etc/init.d/${service} $arg
exit $?
}
do_updaterc_call () {
if [ $arg = "enable" ]; then
update-rc.d $service defaults 98 20
update-rc.d $service enable
else
update-rc.d $service disable
fi
exit $?
}
if [ $mode = "available" ]; then
do_check_sysv_is_available
elif [ $mode = "initd" ]; then
do_initd_script_call
elif [ $mode = "updaterc" ]; then
do_updaterc_call
else
exit 1
fi
The logs are very useful there, could you modify how the tedge_mapper
is started to include the --debug
argument?
For example in systemd this would be editing the service definition (though your usage might differ slightly if you are using sysvinit)
File: /usr/lib/systemd/system/tedge-mapper-c8y.service
[Unit]
Description=tedge-mapper-c8y converts Thin Edge JSON measurements to Cumulocity JSON format.
After=syslog.target network.target mosquitto.service
[Service]
User=tedge
ExecStart=/usr/bin/tedge_mapper c8y --debug
Restart=on-failure
RestartPreventExitStatus=255
[Install]
WantedBy=multi-user.target
Hi @reubenmiller, here is the log after the device reboot. The error message repeats until I disconnect the device and connect ist again.
2022-11-24T10:58:41.1667011Z INFO flockfile::unix: Lockfile created "/run/lock/tedge-mapper-c8y.lock"
2022-11-24T10:58:41.25894187Z TRACE mio::poll: registering event source with poller: token=Token(1), interests=READABLE | WRITABLE
2022-11-24T10:58:41.26022658Z DEBUG rumqttc::state: Subscribe. Topics = [Filter = c8y/s/dat, Qos = AtLeastOnce], Pkid = 1
2022-11-24T10:58:41.26167645Z INFO init: c8y_api::http_proxy: Initialisation
2022-11-24T10:58:41.26224065Z DEBUG rumqttc::state: Publish. Topic = c8y/s/uat, Pkid = 2, Payload Size = 0
2022-11-24T10:58:51.26317512Z ERROR init: c8y_api::http_proxy: An error occurred while retrieving internal Id, operation will retry in 60 seconds and mapper will reinitialise.
Error: RequestTimeout
2022-11-24T10:59:41.26061356Z DEBUG rumqttc::state: Pingreq,
last incoming packet before 59996 millisecs,
last outgoing request before 59998 millisecs
2022-11-24T10:59:51.26517084Z DEBUG rumqttc::state: Publish. Topic = c8y/s/uat, Pkid = 3, Payload Size = 0
2022-11-24T11:00:01.26571797Z ERROR init: c8y_api::http_proxy: An error occurred while retrieving internal Id, operation will retry in 60 seconds and mapper will reinitialise.
Error: RequestTimeout
^C
admin@axcf2152:~$ cat tedge-mapper-c8y.log
2022-11-24T10:58:41.1667011Z INFO flockfile::unix: Lockfile created "/run/lock/tedge-mapper-c8y.lock"
2022-11-24T10:58:41.25894187Z TRACE mio::poll: registering event source with poller: token=Token(1), interests=READABLE | WRITABLE
2022-11-24T10:58:41.26022658Z DEBUG rumqttc::state: Subscribe. Topics = [Filter = c8y/s/dat, Qos = AtLeastOnce], Pkid = 1
2022-11-24T10:58:41.26167645Z INFO init: c8y_api::http_proxy: Initialisation
2022-11-24T10:58:41.26224065Z DEBUG rumqttc::state: Publish. Topic = c8y/s/uat, Pkid = 2, Payload Size = 0
2022-11-24T10:58:51.26317512Z ERROR init: c8y_api::http_proxy: An error occurred while retrieving internal Id, operation will retry in 60 seconds and mapper will reinitialise.
Error: RequestTimeout
2022-11-24T10:59:41.26061356Z DEBUG rumqttc::state: Pingreq,
last incoming packet before 59996 millisecs,
last outgoing request before 59998 millisecs
2022-11-24T10:59:51.26517084Z DEBUG rumqttc::state: Publish. Topic = c8y/s/uat, Pkid = 3, Payload Size = 0
2022-11-24T11:00:01.26571797Z ERROR init: c8y_api::http_proxy: An error occurred while retrieving internal Id, operation will retry in 60 seconds and mapper will reinitialise.
Error: RequestTimeout
admin@axcf2152:~$ cat tedge-mapper-c8y.log
2022-11-24T10:58:41.1667011Z INFO flockfile::unix: Lockfile created "/run/lock/tedge-mapper-c8y.lock"
2022-11-24T10:58:41.25894187Z TRACE mio::poll: registering event source with poller: token=Token(1), interests=READABLE | WRITABLE
2022-11-24T10:58:41.26022658Z DEBUG rumqttc::state: Subscribe. Topics = [Filter = c8y/s/dat, Qos = AtLeastOnce], Pkid = 1
2022-11-24T10:58:41.26167645Z INFO init: c8y_api::http_proxy: Initialisation
2022-11-24T10:58:41.26224065Z DEBUG rumqttc::state: Publish. Topic = c8y/s/uat, Pkid = 2, Payload Size = 0
2022-11-24T10:58:51.26317512Z ERROR init: c8y_api::http_proxy: An error occurred while retrieving internal Id, operation will retry in 60 seconds and mapper will reinitialise.
Error: RequestTimeout
2022-11-24T10:59:41.26061356Z DEBUG rumqttc::state: Pingreq,
last incoming packet before 59996 millisecs,
last outgoing request before 59998 millisecs
2022-11-24T10:59:51.26517084Z DEBUG rumqttc::state: Publish. Topic = c8y/s/uat, Pkid = 3, Payload Size = 0
2022-11-24T11:00:01.26571797Z ERROR init: c8y_api::http_proxy: An error occurred while retrieving internal Id, operation will retry in 60 seconds and mapper will reinitialise.
Error: RequestTimeout
2022-11-24T11:00:41.26194745Z DEBUG rumqttc::state: Pingreq,
last incoming packet before 49995 millisecs,
last outgoing request before 49996 millisecs
2022-11-24T11:01:01.26767379Z DEBUG rumqttc::state: Publish. Topic = c8y/s/uat, Pkid = 4, Payload Size = 0
2022-11-24T11:01:11.26934005Z ERROR init: c8y_api::http_proxy: An error occurred while retrieving internal Id, operation will retry in 60 seconds and mapper will reinitialise.
Error: RequestTimeout
2022-11-24T11:01:41.263582Z DEBUG rumqttc::state: Pingreq,
last incoming packet before 39994 millisecs,
last outgoing request before 39995 millisecs
2022-11-24T11:02:11.27117636Z DEBUG rumqttc::state: Publish. Topic = c8y/s/uat, Pkid = 5, Payload Size = 0
2022-11-24T11:02:21.27477699Z ERROR init: c8y_api::http_proxy: An error occurred while retrieving internal Id, operation will retry in 60 seconds and mapper will reinitialise.
Error: RequestTimeout
2022-11-24T11:02:41.26495802Z DEBUG rumqttc::state: Pingreq,
last incoming packet before 29992 millisecs,
last outgoing request before 29993 millisecs
2022-11-24T11:03:21.27674494Z DEBUG rumqttc::state: Publish. Topic = c8y/s/uat, Pkid = 6, Payload Size = 0
2022-11-24T11:03:31.27837968Z ERROR init: c8y_api::http_proxy: An error occurred while retrieving internal Id, operation will retry in 60 seconds and mapper will reinitialise.
Error: RequestTimeout
2022-11-24T11:03:41.26622072Z DEBUG rumqttc::state: Pingreq,
last incoming packet before 19987 millisecs,
last outgoing request before 19989 millisecs
Looks like we have found the root cause! Seems to be a problem with the mosquitto regarding a bridge connection in version 2.0.15
, which results in the bridge not connecting every 2nd time after a reboot.
Looking at the mosquitto logs /var/log/mosquitto/mosquitto.log
, it shows an Expiring client
error on startup.
1669879960: mosquitto version 2.0.15 running
1669879960: Expiring client Cumulocity due to timeout.
Then when rebooting, it works again. Below shows the happy case. But a subsequent reboot would result in the Expiring client
error again.
1669881815: Opening ipv4 listen socket on port 1883.
1669881815: Connecting bridge (step 1) edge_to_c8y (mytenant.eu-latest.cumulocity.com:8883)
1669881815: mosquitto version 2.0.15 running
1669881815: Connecting bridge (step 2) edge_to_c8y (certification-phoenixcontact.eu-latest.cumulocity.com:8883)
Here is the ticket from the mosquitto repo https://github.com/eclipse/mosquitto/issues/2634
2.0.11
@bjoernsauer Did downgrading to mosquitto 2.0.11
solve your issue? If so can you please close this ticket?
Downgraded mosquitto from 2.0.15 to 2.0.11. So far the issue did not occure again. Thank you for your excellent support @reubenmiller
We have the same problem at our installations also. We downgraded to 2.0.14 at that solved it.
Describe the bug
Performing a device restart from cumulocity restarts the device. However the device is not able to connect to the tenant after the restart. Log output of the tedge-mapper-c8y:
The error repeats until a manual reboot. After the manual reboot the device can connect to cumulocity again without an error.
Log after manual reboot:
To Reproduce
Issue a restart command.
Expected behavior
The device should connect after a restart command.
Environment: