siemens / meta-iot2050

SIMATIC IOT2050 Isar/Debian Board Support Package
MIT License

Network not responding #549

Open sandumarius opened 5 months ago

sandumarius commented 5 months ago

Hello,

We have an architecture where an IOT2050 reads data from a PLC using Modbus TCP (1 s cycle) and sends it via MQTT to an AWS private server. We use Node-RED for reading and processing the data. We have configured two networks:

1 - connected to the PLC (AB), static IP
2 - connected to a MUM853-1, static IP and gateway

The problem is that after a period of time (sometimes after the router loses GSM signal) the network stops responding. We can see the device in the router's ARP list, but we cannot ping it or SSH into it. After a restart everything works again.

System logs are here: https://s.go.ro/w3np1a4d | password: 539737

What could be the cause of this problem? Thank you!

chombourger commented 5 months ago

pretty large logs!

any idea when the problem occurred (estimated date/time)? /var/log/messages is ~350k lines

I would suggest running tcpdump on the iot2050 and router (or another machine being on the same network) to check how much they are talking when you start having network hiccups

would you be able to connect a monitor and keyboard to the iot2050 and open a shell? could you alternatively open a shell via the debug console?
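
Because the failure only shows up after days of uptime, a capture left running needs to be bounded in size. The following is a minimal sketch, not a tested recipe for this device: it only builds a tcpdump command line that writes into a rotating set of files using the standard `-C` (file size) and `-W` (file count) options. The interface name `eno2`, the output path, and the size limits are assumptions chosen for illustration.

```python
import shlex


def tcpdump_ring_command(interface, pcap_path, file_mb=50, file_count=10, bpf_filter=""):
    """Build a tcpdump argv that captures into a bounded, rotating set of files.

    -n  : do not resolve addresses to names
    -C  : start a new file after roughly `file_mb` million bytes
    -W  : with -C, keep at most `file_count` files and overwrite the oldest
    -w  : write raw packets to `pcap_path` (tcpdump appends a file index)
    """
    argv = [
        "tcpdump", "-i", interface, "-n",
        "-C", str(file_mb), "-W", str(file_count),
        "-w", pcap_path,
    ]
    if bpf_filter:
        # Optional capture filter, e.g. "host 192.168.0.1"
        argv += shlex.split(bpf_filter)
    return argv


# Hypothetical values: interface name and output path are assumptions.
print(shlex.join(tcpdump_ring_command("eno2", "/var/tmp/eno2.pcap")))
```

The printed command would be run as root on the IOT2050, and the output directory must be writable by the user tcpdump runs as. Capturing on the router or another host on the same network at the same time makes it possible to compare what each side actually sent and received.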

chombourger commented 5 months ago

NetworkManager logs (journalctl -D log/journal -u NetworkManager) do not seem to exhibit anything abnormal. In particular, I was checking for log entries just before a -- Boot line (since you had indicated that you were forced to reboot the IOT2050 to restore the network connection). Logs that I see in your archive:

Jun 10 21:28:38 IOT NetworkManager[676]: <info>  [1718047718.4917] policy: set 'eno2-connection' (eno2) as default for IPv4 routing and DNS
Jun 10 21:28:38 IOT NetworkManager[676]: <info>  [1718047718.5026] device (eno2): Activation: successful, device activated.
Jun 10 21:28:38 IOT NetworkManager[676]: <info>  [1718047718.5066] manager: NetworkManager state is now CONNECTED_GLOBAL
-- Boot 7b0617bd9c7848e59c6cc47fa6d68b4e --
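
With hundreds of thousands of log lines, the check described above (looking at the entries just before each -- Boot line) is easier to do with a small filter. The sketch below is one possible way to do it, assuming the journal has been exported to text in the same format as the excerpt; the function name and the `context` parameter are made up for this example.

```python
import re
from collections import deque

# Matches the separator journalctl prints between boots, as in the excerpt.
BOOT_MARKER = re.compile(r"^-- Boot [0-9a-f]+ --$")


def lines_before_boot(lines, context=3):
    """For each boot marker, return (marker, last `context` lines before it)."""
    recent = deque(maxlen=context)  # sliding window of the most recent lines
    result = []
    for raw in lines:
        line = raw.rstrip("\n")
        if BOOT_MARKER.match(line):
            result.append((line, list(recent)))
            recent.clear()  # start a fresh window for the next boot
        else:
            recent.append(line)
    return result
```

To use it, save the journalctl output to a text file and pass the open file object as `lines`; each returned tuple holds the last entries logged before a reboot.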

chombourger commented 5 months ago

Is the IP address you are trying to ping / ssh-to 192.168.0.50?

sandumarius commented 5 months ago

We have set the following:

IP1: 192.168.1.50/24 - static
IP2: 192.168.0.50/24 - static, with gateway 192.168.0.1 (MUM)

IP2 is routed via the MUM to give access from the private APN. The last problem occurred on Wed, 5 Jun 2024 at 19:14. This is when the IOT lost its connection to AWS. I had this problem in the past (I do not know the exact date), and the behavior was the same: the network did not respond to ping. I tried connecting directly with a cable to bypass the router and still got no answer. I tried connecting to IP1 and it worked. After a restart everything was in order. I have also uploaded the MUM logs.

startup_SCALANCE_M800.zip

sandumarius commented 4 months ago

Hello, we have another incident: https://s.go.ro/cbqskpz1 | password: 477128

Date: Fri, 28 Jun 2024 at 01:48

What can we do? We already have a cron script to restart, but it seems that this does not solve the problem.

Edit: I found the first occurrence:

Jun 28 01:03:12 IOT Node-RED[712]: 28 Jun 01:03:12 - [error] [modbus-client:AB] Error: connect ETIMEDOUT 192.168.1.10:502

After the restart at 11 the communication was re-established. It seems that only ETH1 was affected. Thank you!
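
Since the two incidents appear to affect different interfaces, it may help to probe each path separately and record which one fails first. The following is a rough sketch under stated assumptions, not a fix: it attempts a plain TCP connection with a timeout, optionally bound to a specific local address so the probe leaves through a chosen interface. The function name is hypothetical; the addresses in the usage note are taken from this thread.

```python
import socket
import time


def probe_tcp(host, port, timeout=3.0, source_ip=None):
    """Try a TCP connect and return (ok, detail).

    If `source_ip` is given, the socket is bound to that local address first,
    so the attempt goes out via the interface that owns it.
    """
    source = (source_ip, 0) if source_ip else None
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout,
                                      source_address=source):
            return True, f"connected in {time.monotonic() - start:.3f}s"
    except OSError as exc:
        # Covers timeouts, refused connections, and unreachable networks.
        return False, f"{type(exc).__name__}: {exc}"
```

For example, `probe_tcp("192.168.1.10", 502, source_ip="192.168.1.50")` would test the PLC side using the addresses mentioned above. Logging the result with a timestamp every minute or so would show when each link stops answering, independently of Node-RED.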