Stefal / rtkbase

Your own GNSS base station for RTK localization with a Web GUI
GNU Affero General Public License v3.0
467 stars 113 forks source link

str2str_tcp service sometimes stop after boot #417

Open Stefal opened 1 month ago

Stefal commented 1 month ago

Describe the bug str2str_tcp service sometimes stop after boot

To Reproduce Steps to reproduce the behavior:

  1. Boot or reboot the SBC
  2. After the boot, the main service is stopped

log:

-- Boot ecb0805ef4be43ec8990eb8466f687c8 --
Jul 23 14:35:28 basegnss systemd[1]: Starting str2str_tcp.service - RTKBase Tcp...
Jul 23 14:35:29 basegnss run_cast.sh[726]: stream server start
Jul 23 14:35:29 basegnss systemd[1]: str2str_tcp.service: Deactivated successfully.
Jul 23 14:35:29 basegnss systemd[1]: Started str2str_tcp.service - RTKBase Tcp.

Expected behavior The service should not stop

Software/Hardware (please complete the following information):

Additional context I've tried to delayed the web service startup, without any behaviour change.

It's difficult to reproduce. Yesterday, I've done countless reboot or power off/on without seeing this behaviour. Today I reproduced it on 2 differents base stations at the same time. 🤔

Stefal commented 1 month ago

I've reproduced this bug with a service in which i've disabled the log limit settings and I saw that:

-- Boot 5535106d99714a3eaebb432143e6da2c --
Jul 26 07:49:13 basegnss systemd[1]: Starting str2str_tcp.service - RTKBase Tcp...
Jul 26 07:49:14 basegnss run_cast.sh[793]: stream server start
Jul 26 07:49:14 basegnss run_cast.sh[793]: stream server start error
Jul 26 07:49:14 basegnss systemd[1]: str2str_tcp.service: Deactivated successfully.
Jul 26 07:49:14 basegnss systemd[1]: Started str2str_tcp.service - RTKBase Tcp.

It looks like the ttyGNSS serial port was not ready when str2str_tcp started. But, the service should have failed, and tried to restart, as with this example when I set a wrong port in settings.conf:

Jul 26 08:01:10 basegnss systemd[1]: Starting str2str_tcp.service - RTKBase Tcp...
Jul 26 08:01:10 basegnss systemd[1]: Started str2str_tcp.service - RTKBase Tcp.
Jul 26 08:01:10 basegnss run_cast.sh[1845]: stream server start
Jul 26 08:01:10 basegnss run_cast.sh[1845]: stream server start error
Jul 26 08:01:10 basegnss systemd[1]: str2str_tcp.service: Main process exited, code=exited, status=1/FAILURE
Jul 26 08:01:10 basegnss systemd[1]: str2str_tcp.service: Failed with result 'exit-code'.

What is strange in the first failure, is that the service output "Started" after it was deactivated. Same failure log with -o short-precise

Jul 26 07:49:13.475660 basegnss systemd[1]: Starting str2str_tcp.service - RTKBase Tcp...
Jul 26 07:49:14.252704 basegnss run_cast.sh[793]: stream server start
Jul 26 07:49:14.272914 basegnss run_cast.sh[793]: stream server start error
Jul 26 07:49:14.284684 basegnss systemd[1]: str2str_tcp.service: Deactivated successfully.
Jul 26 07:49:14.285420 basegnss systemd[1]: Started str2str_tcp.service - RTKBase Tcp.