home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
71.88k stars 30.11k forks source link

SMA Inverter running modbus fails after time #120511

Closed florie1706 closed 2 weeks ago

florie1706 commented 3 months ago

The problem

When using a docker image later then 2024.3.3 the modbus connection will fail after about two days and just comes back up when restarting the inverter and HA.

What version of Home Assistant Core has the issue?

core-2024.6.3

What was the last working version of Home Assistant Core?

core-2024.3.3

What type of installation are you running?

Home Assistant Container

Integration causing the issue

modbus

Link to integration documentation on our website

http://home-assistant.io/integrations/modbus/

Diagnostics information

No response

Example YAML snippet

modbus:
- name: SMA_Wechselrichter
  type: tcp
  host: 10.10.10.11
  port: 502
  sensors:
   - name: SMA_STP10000TL_10_Gesamtertrag
     unique_id: sma_stp10000tl_10_gesamtertrag
     state_class: total_increasing
     device_class: energy
     unit_of_measurement: kWh
     scan_interval: 10
     slave: 3
     address: 30529
     data_type: int32
     scale: 0.001
     precision: 3
   - name: SMA_STP10000TL_10_Leistung
     unique_id: sma_stp10000tl_10_leistung
     device_class: power
     unit_of_measurement: W
     scan_interval: 10
     slave: 3
     address: 30775
     data_type: int32
   - name: Wechselrichter Tagesertrag
     unique_id: sma_stp10000tl_10_tagesertrag
     device_class: energy
     unit_of_measurement: kWh
     scan_interval: 30
     slave: 3
     address: 30517
     data_type: int64
     scale: 0.001
     precision: 3

Anything in the logs that might be useful for us?

2024-06-26 10:01:34.760 DEBUG (MainThread) [pymodbus.logging] Connecting to 10.10.10.11:502.
2024-06-26 10:01:34.760 DEBUG (MainThread) [pymodbus.logging] Connecting comm
2024-06-26 10:01:37.760 WARNING (MainThread) [pymodbus.logging] Failed to connect 
2024-06-26 10:01:37.760 INFO (MainThread) [homeassistant.components.modbus.modbus] modbus SMA_Wechselrichter communication open
2024-06-26 10:01:38.726 ERROR (MainThread) [homeassistant.components.modbus.modbus] Pymodbus: SMA_Wechselrichter: Error: device: 3 address: 30529 -> Modbus Error: [Connection] Not connected[AsyncModbusTcpClient 10.10.10.11:502]
2024-06-26 10:01:38.726 DEBUG (MainThread) [homeassistant.components.modbus.modbus] Pymodbus: SMA_Wechselrichter: Error: device: 3 address: 30775 -> Modbus Error: [Connection] Not connected[AsyncModbusTcpClient 10.10.10.11:502]
2024-06-26 10:01:38.726 DEBUG (MainThread) [homeassistant.components.modbus.modbus] Pymodbus: SMA_Wechselrichter: Error: device: 3 address: 30517 -> Modbus Error: [Connection] Not connected[AsyncModbusTcpClient 10.10.10.11:502]

Additional information

The thing is @janiversen I know you will answer like there is nothing to see in the debug log. But actually the problem persists since 2024.4.0

That said my new test was to have two docker containers of HA running at the same time. First container is running 2024.3.3 which still runs great even with the inverter data. Second container runs the latest HA version and there the inverter data stopped yesterday (after two days where data was received) Log file is from the 2024.6.43 container which does not receive any data anymore. A restart of the second container did not bring back the data even the first container still gets updates thru modbus. Only way for container two to get data again was restarting the inverter and the container afterwards.

I am sorry to not provide better debug logs but that is what I get out of the debug function.

I really think it is still software side of the modbus integration (update after 2024.3.3) as the hardware connection is still there in the 2024.3.3 container without any interruption.

Anyting I could further contribute?

janiversen commented 3 months ago

You are right what do you want me to respond differently.

I have never disputed it might be a software problem, but there are a lot more software in play than just the modbus integration. Between the 2 versions even python itself was changed.

I am sorry, but I have no idea where I should start looking for a problem within the modbus integration since there are no indications that it does something it should not do, all I can say is that at least some users including myself do not have a problem.

florie1706 commented 3 months ago

That is after reloading the inverter and the integration of the 2024.6.3 container, gets connected again immediately.

Will investigate further.

2024-06-26 10:48:17.983 INFO (MainThread) [homeassistant.components.modbus] Modbus reloading
2024-06-26 10:48:17.983 WARNING (MainThread) [homeassistant.components.modbus.modbus] modbus SMA_Wechselrichter communication closed
2024-06-26 10:48:17.983 DEBUG (MainThread) [pymodbus.logging] Connecting to 10.10.10.11:502.
2024-06-26 10:48:17.984 DEBUG (MainThread) [pymodbus.logging] Connecting comm
2024-06-26 10:48:17.992 DEBUG (MainThread) [pymodbus.logging] Connected to comm
2024-06-26 10:48:17.992 DEBUG (MainThread) [pymodbus.logging] Resetting frame - Current Frame in buffer - 
2024-06-26 10:48:17.992 INFO (MainThread) [homeassistant.components.modbus.modbus] modbus SMA_Wechselrichter communication open
2024-06-26 10:48:19.597 DEBUG (MainThread) [pymodbus.logging] Adding transaction 1
2024-06-26 10:48:19.597 DEBUG (MainThread) [pymodbus.logging] Resetting frame - Current Frame in buffer - 
2024-06-26 10:48:19.597 DEBUG (MainThread) [pymodbus.logging] send: 0x0 0x1 0x0 0x0 0x0 0x6 0x3 0x3 0x77 0x41 0x0 0x2
2024-06-26 10:48:19.685 DEBUG (MainThread) [pymodbus.logging] recv: 0x0 0x1 0x0 0x0 0x0 0x7 0x3 0x3 0x4 0x8 0x29 0xe2 old_data:  addr=None
2024-06-26 10:48:19.685 DEBUG (MainThread) [pymodbus.logging] Processing: 0x0 0x1 0x0 0x0 0x0 0x7 0x3 0x3 0x4 0x8 0x29 0xe2
2024-06-26 10:48:19.685 DEBUG (MainThread) [pymodbus.logging] Frame check failed, missing part of message len 12, MBAP len 7 !!
2024-06-26 10:48:19.686 DEBUG (MainThread) [pymodbus.logging] recv: 0x3e old_data:  addr=None
2024-06-26 10:48:19.686 DEBUG (MainThread) [pymodbus.logging] Processing: 0x0 0x1 0x0 0x0 0x0 0x7 0x3 0x3 0x4 0x8 0x29 0xe2 0x3e
2024-06-26 10:48:19.686 DEBUG (MainThread) [pymodbus.logging] Factory Response[ReadHoldingRegistersResponse': 3]
2024-06-26 10:48:19.686 DEBUG (MainThread) [pymodbus.logging] Getting transaction 1
2024-06-26 10:48:19.686 DEBUG (MainThread) [pymodbus.logging] Frame check, no more data!
2024-06-26 10:48:19.687 DEBUG (MainThread) [pymodbus.logging] Adding transaction 2
2024-06-26 10:48:19.687 DEBUG (MainThread) [pymodbus.logging] Resetting frame - Current Frame in buffer - 
2024-06-26 10:48:19.687 DEBUG (MainThread) [pymodbus.logging] send: 0x0 0x2 0x0 0x0 0x0 0x6 0x3 0x3 0x78 0x37 0x0 0x2
2024-06-26 10:48:19.835 DEBUG (MainThread) [pymodbus.logging] recv: 0x0 0x2 0x0 0x0 0x0 0x7 0x3 0x3 0x4 0x0 0x0 0x19 old_data:  addr=None
2024-06-26 10:48:19.835 DEBUG (MainThread) [pymodbus.logging] Processing: 0x0 0x2 0x0 0x0 0x0 0x7 0x3 0x3 0x4 0x0 0x0 0x19
2024-06-26 10:48:19.835 DEBUG (MainThread) [pymodbus.logging] Frame check failed, missing part of message len 12, MBAP len 7 !!
2024-06-26 10:48:19.836 DEBUG (MainThread) [pymodbus.logging] recv: 0xe7 old_data:  addr=None
2024-06-26 10:48:19.836 DEBUG (MainThread) [pymodbus.logging] Processing: 0x0 0x2 0x0 0x0 0x0 0x7 0x3 0x3 0x4 0x0 0x0 0x19 0xe7
2024-06-26 10:48:19.836 DEBUG (MainThread) [pymodbus.logging] Factory Response[ReadHoldingRegistersResponse': 3]
2024-06-26 10:48:19.836 DEBUG (MainThread) [pymodbus.logging] Getting transaction 2
2024-06-26 10:48:19.836 DEBUG (MainThread) [pymodbus.logging] Frame check, no more data!
2024-06-26 10:48:29.597 DEBUG (MainThread) [pymodbus.logging] Adding transaction 3
2024-06-26 10:48:29.597 DEBUG (MainThread) [pymodbus.logging] Resetting frame - Current Frame in buffer - 
2024-06-26 10:48:29.598 DEBUG (MainThread) [pymodbus.logging] send: 0x0 0x3 0x0 0x0 0x0 0x6 0x3 0x3 0x77 0x41 0x0 0x2
2024-06-26 10:48:29.755 DEBUG (MainThread) [pymodbus.logging] recv: 0x0 0x3 0x0 0x0 0x0 0x7 0x3 0x3 0x4 0x8 0x29 0xe2 old_data:  addr=None
2024-06-26 10:48:29.755 DEBUG (MainThread) [pymodbus.logging] Processing: 0x0 0x3 0x0 0x0 0x0 0x7 0x3 0x3 0x4 0x8 0x29 0xe2
2024-06-26 10:48:29.755 DEBUG (MainThread) [pymodbus.logging] Frame check failed, missing part of message len 12, MBAP len 7 !!
2024-06-26 10:48:29.756 DEBUG (MainThread) [pymodbus.logging] recv: 0x51 old_data:  addr=None
2024-06-26 10:48:29.756 DEBUG (MainThread) [pymodbus.logging] Processing: 0x0 0x3 0x0 0x0 0x0 0x7 0x3 0x3 0x4 0x8 0x29 0xe2 0x51
2024-06-26 10:48:29.756 DEBUG (MainThread) [pymodbus.logging] Factory Response[ReadHoldingRegistersResponse': 3]
2024-06-26 10:48:29.756 DEBUG (MainThread) [pymodbus.logging] Getting transaction 3
2024-06-26 10:48:29.756 DEBUG (MainThread) [pymodbus.logging] Frame check, no more data!
2024-06-26 10:48:29.756 DEBUG (MainThread) [pymodbus.logging] Adding transaction 4
2024-06-26 10:48:29.757 DEBUG (MainThread) [pymodbus.logging] Resetting frame - Current Frame in buffer - 
2024-06-26 10:48:29.757 DEBUG (MainThread) [pymodbus.logging] send: 0x0 0x4 0x0 0x0 0x0 0x6 0x3 0x3 0x78 0x37 0x0 0x2
2024-06-26 10:48:29.905 DEBUG (MainThread) [pymodbus.logging] recv: 0x0 0x4 0x0 0x0 0x0 0x7 0x3 0x3 0x4 0x0 0x0 0x1a old_data:  addr=None
2024-06-26 10:48:29.906 DEBUG (MainThread) [pymodbus.logging] Processing: 0x0 0x4 0x0 0x0 0x0 0x7 0x3 0x3 0x4 0x0 0x0 0x1a
2024-06-26 10:48:29.906 DEBUG (MainThread) [pymodbus.logging] Frame check failed, missing part of message len 12, MBAP len 7 !!
2024-06-26 10:48:29.907 DEBUG (MainThread) [pymodbus.logging] recv: 0x1 old_data:  addr=None
2024-06-26 10:48:29.907 DEBUG (MainThread) [pymodbus.logging] Processing: 0x0 0x4 0x0 0x0 0x0 0x7 0x3 0x3 0x4 0x0 0x0 0x1a 0x1
2024-06-26 10:48:29.908 DEBUG (MainThread) [pymodbus.logging] Factory Response[ReadHoldingRegistersResponse': 3]
2024-06-26 10:48:29.908 DEBUG (MainThread) [pymodbus.logging] Getting transaction 4
2024-06-26 10:48:29.908 DEBUG (MainThread) [pymodbus.logging] Frame check, no more data!
janiversen commented 3 months ago

Reloading both does not help identify a problem, test if reloading just one of parts make it work again.

Or did you mean reloading any of the parts make it work again?

florie1706 commented 3 months ago

Container with 2024.3.3 runs smoothly the whole time. No interaction needed.

Container with 2024.6.3 (or better said newer than 2024.3.3) stopped working (usually after 2 - 6 days), restarted the inverter as no connection is possible any longer (even with a restart/reload of the 2024.6.3 container). After the inverter restarted I reloaded only the 2024.6.3 container which is seen above.

When I am right, in the image I use (linuxserver/homeassistant) python was upgraded to 3.12 in 2024.2 and since that image, it stayed with python 3.12

janiversen commented 3 months ago

So it seems clearly your device is blocking reconnect..

the reason it disconnects can be too long between reads the reason it does not accept a new a new connect is because it happens too fast.

anyhow this is way beyond the modbus integration.

florie1706 commented 3 months ago

Maybe true, but why is, at the same time, the container with 2024.3.3 still able to connect (even a restart of the container does not change that) without error? In the network both containers have the same ip from the server host. That said I can fire up more containers with 2024.3.3 and 2024.6.3, all 2024.3.3 containers get a connection while the others do not.

Any option you could provide to have the time between reads changed? That would be great.

janiversen commented 3 months ago

There are a lot of technical reasons why the old version works and the new not, basically it's because more or less everything changed.

This channel is for bugs not for support, I cannot help you more, I have given you several pointers..if you find something (fact) that points at the modbus integration I am happy to help.

rayxxxxx commented 3 months ago

I’ve got same problem with my integration:

homeassistant.components.modbus.modbus Fonte: components/modbus/modbus.py:331 Integrazione: Modbus Prima occorrenza: 06:39:27 (1 occorrenze) Ultima registrazione: 06:39:27

Pymodbus: sma: Error: device: 3 address: 30053 -> Modbus Error: [Connection] Not connected[AsyncModbusTcpClient 192.168.XXX.XXX:502]

florie1706 commented 3 months ago

We won’t get any help here. I think you have to live with that. Tried it a couple of times …

janiversen commented 3 months ago

You will get bugs solved here, but not support....however I am still waiting for some indication that it is a problem in the modbus integration.

janiversen commented 2 months ago

Now here is an example of how a user with a problem that helps the maintainer with a good analysis, giving me something to work with !!!

119649

Gnarfoz commented 2 months ago

Are you running those two versions of Home Assistant in parallel, connecting to the same inverter? The SMA Sunny Tripower series is co-developed (the hardware is nearly identical) with Sungrow's SHxRT series of hybrid inverters. They're infamous for being allergic to being accessed by more than one client because that can result in overlapping requests. If the SMA version of the inverter has the same underlying issue, the same workaround might help: try putting a Modbus proxy that serializes requests from multiple clients between the inverter and the client(s), for example: https://github.com/tiagocoutinho/modbus-proxy https://github.com/tiagocoutinho/modbus-proxy-rs

Configure the proxy to connect to the inverter, configure the clients (like HA) to connect to the proxy, not directly to the inverter.

florie1706 commented 2 months ago

@Gnarfoz No I don't. That was just for testing. Usually I only connect with one client (one instance of HA) since ages, but it turns out that any version above HA 2024.3.3 starts disconnecting after time.

Will further investigate and also check janiversen's post when I am home.

edit: Will capture network traffic for the device, hopefully that helps more :)

florie1706 commented 2 months ago

Ok, here you go @janiversen hope this helps better.

The connection dropped in HA at 11:15:30 today.

DEBUG (MainThread) [homeassistant.components.modbus.modbus] Pymodbus: SMA_Wechselrichter: Error: device: 3 address: 30775 -> Modbus Error: [Connection] Not connected[AsyncModbusTcpClient 10.10.20.151:502]

I already changed the scan interval couple of days ago:

- name: SMA_Wechselrichter
  type: tcp
  host: 10.10.20.151
  port: 502
  sensors:
   - name: Wechselrichter Gesamtertrag
     unique_id: sma_stp10000tl_10_gesamtertrag
     state_class: total_increasing
     device_class: energy
     unit_of_measurement: kWh
     scan_interval: 31
     slave: 3
     address: 30529
     data_type: int32
     scale: 0.001
     precision: 3
   - name: Wechselrichter Leistung
     unique_id: sma_stp10000tl_10_leistung
     device_class: power
     unit_of_measurement: W
     min_value: 0
     scan_interval: 6
     slave: 3
     address: 30775
     data_type: int32
   - name: Wechselrichter Tagesertrag
     unique_id: sma_stp10000tl_10_tagesertrag
     device_class: energy
     unit_of_measurement: kWh
     scan_interval: 55
     slave: 3
     address: 30517
     data_type: int64
     scale: 0.001
     precision: 3

I just have theses three entities.

janiversen commented 2 months ago

You did not supply the modbus debug log, so I cannot relate the network traffic to what happens in modbus. With the debug file I would see the transaction number in the last valid message, which then would be easy to find in the network file.

You are asking me to find a needle in a haystack, without knowing what a needle looks like.

The only thing I can see is that your device closes the connection and do not accept new connection requests.

florie1706 commented 2 months ago

My bad sorry. Forgot to switch on the debug log. Already switched that back on and will also log traffic again. Let’s wait till the error appears again.

janiversen commented 2 months ago

Good luck, I am no longer an active maintainer of the modbus integration, instead I will maintain a custom component.

littleyoda commented 2 months ago

Maybe my custom integration for SMA-Devices is a solution for you: https://community.home-assistant.io/t/support-for-more-sma-devices-e-g-tripower-x/713737 https://www.photovoltaikforum.com/thread/225270-home-assistant-und-sma-ger%C3%A4te/

gagman10 commented 1 month ago

@florie1706: Has the error occurred again? I have the same problem with my SMA inverter and Modbus. I tried SMA WebConnect as an alternative, but my inverter seems to be too old (SMA SunnyBoy 5000 TL). At the moment I am regularly restarting my inverter.

florie1706 commented 1 month ago

@janiversen If you are willing to check again, here are the logs:

Bild 01 08 24 um 07 47

After hard restart of the inverter and reload of the modbus integration it came back online in HA.

@littleyoda Saw that already and tested it successfully. I just had the modbus integration running for 2 years without a problem and as I need it for my walllboxes I tried to have as less integrations running as possible. Why does it say "not for production use"? This irritates me :)

@gagman10 See above. I changed the scan intervals to different times, so the error does not appear as often, but still exists and the only way to get rid of it is to restart the inverter or use HA 2024.3.3

littleyoda commented 1 month ago

Why does it say "not for production use"? This irritates me :)

the hint was a relic from the first versions. I will remove it in the next version. The solution is now stable enough.

One thing more to note (probably not relevant to the problems): There are reports in a German PV forum that Modbus stopped working after the last firmware update and the inverter (in this case tripower X) had to be restarted.

florie1706 commented 1 month ago

My inverter is from 2011. So it is on the latest firmware since very long time (last and only one was before I started using HA and has to be done manually via SD card)

BrettLynch123 commented 1 month ago

@florie1706 Hi I just managed to fix this problem with my sma inverter. I had to add delay and timeout settings like this. hope this helps you. modbus:

florie1706 commented 1 month ago

Will give this a try. Thanks :)

florie1706 commented 2 weeks ago

Thanks to @BrettLynch123 seems like it is working as expected again. Thanks to your config update :)