jmccrohan / pysolarmanv5

A python module to interact with Solarman Data Logging Sticks
MIT License
131 stars 28 forks source link

Handle timeout in data receiver and amend connection closing #52

Closed sofkaski closed 7 months ago

sofkaski commented 7 months ago

The origin of these changes is an attempt to find solution/workaround to a problem in an environment with Deye SUN-8K-SG04P3-EU inverter and a logger with firmware version LSW3_15_FFFF_1.0.9E when using Solarman Home Assistant integration. See my comments in the discussion in home_assistant_solarman issue #523.

This PR contains three group of changes (commits):

  1. Add more debug logging (that was used to investigate the problem in the environment)
  2. Add better handling of timeout and queue empty exceptions
  3. Fix socket closing

Some of these changes might make sense also on the async side, but I have no feasible environment to test those and therefore I did not touch the async side at all.

sofkaski commented 7 months ago

There are linter findings (too long lines, lazy style logging text formatting, ...) @jmccrohan If you wish, I could try to address them in a separate PR.

sofkaski commented 7 months ago

This was my first PR for this component. How do you deal with version numberin etc. Should it be part of the PR, or done separately?

githubDante commented 7 months ago

Hi,

Good job. Thank you. Looking at the comments in the said discussion, the descriptors leak seem to be caused by a combination of datalogger firmware bug / bad wi-fi connection and issue in pysolarmanv5. Adding a generic handler except Exception as e:... below the TimeoutError handling should improve the situation even further. Another improvement can be added here

https://github.com/jmccrohan/pysolarmanv5/blob/fb5169eba4aff03b7cf6c9c0389ca15440f5ad8d/pysolarmanv5/pysolarmanv5.py#L344

by removing the self._reader_thr.is_alive() check and calling socket.close() in try/except block all the time on reconnect.

As for the async part, I use the aysnc version extensively and I don't see FD leaks there.

sofkaski commented 7 months ago

I pushed fixes. I have them running in my system atm.

sofkaski commented 7 months ago

I pushed fixes. I have them running in my system atm.

Six hours so far successful. One time out failure which was recovered.

sofkaski commented 7 months ago

There has been one failure in 24 hours. Related logs:

2024-04-18 08:05:59.690 WARNING (MainThread) [homeassistant.helpers.entity] Update of sensor.solarman_pv1_power is taking over 10 seconds
2024-04-18 08:06:04.730 DEBUG (SyncWorker_11) [custom_components.solarman.solarman] Got exception when receiving frame: 
2024-04-18 08:06:04.730 DEBUG (SyncWorker_11) [custom_components.solarman.solarman] Disconnecting from server after queue empty exception
2024-04-18 08:06:04.731 INFO (SyncWorker_11) [custom_components.solarman.solarman] Disconnecting from solarman data logger 192.168.68.114:8899 connected via ('192.168.68.121', 33762)
2024-04-18 08:06:24.733 WARNING (SyncWorker_11) [custom_components.solarman.solarman] Querying [3-89 (0x0003-0x0059)] failed, [1] retry attempts left
2024-04-18 08:06:24.734 INFO (SyncWorker_11) [custom_components.solarman.solarman] Connecting to solarman data logger 192.168.68.114:8899
2024-04-18 08:06:39.751 DEBUG (SyncWorker_11) [custom_components.solarman.solarman] Socket creation failed: timed out
2024-04-18 08:06:39.756 DEBUG (SyncWorker_11) [custom_components.solarman.solarman] Got NoSocketAvailable exception (No socket available) at ['  File "/config/custom_components/solarman/solarman.py", line 97, in get_statistics\n    self.connect_to_server()\n', '  File "/config/custom_components/solarman/solarman.py", line 43, in connect_to_server\n    self._modbus = PySolarmanV5(self._host, self._serial, port=self._port, mb_slave_id=self._mb_slaveid, logger=log, auto_reconnect=True, socket_timeout=15)\n                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n', '  File "/usr/local/lib/python3.12/site-packages/pysolarmanv5/pysolarmanv5.py", line 120, in __init__\n    self._socket_setup(kwargs.get("socket"), kwargs.get("auto_reconnect", False))\n', '  File "/usr/local/lib/python3.12/site-packages/pysolarmanv5/pysolarmanv5.py", line 446, in _socket_setup\n    raise NoSocketAvailableError("No socket available")\n']
2024-04-18 08:06:39.756 WARNING (SyncWorker_11) [custom_components.solarman.solarman] Querying [3-89 (0x0003-0x0059)] failed, [0] retry attempts left
2024-04-18 08:06:39.756 WARNING (SyncWorker_11) [custom_components.solarman.solarman] Querying registers [3-89 (0x0003-0x0059)] failed, aborting.

In the next update round in 10 min connection succeeds again normally.

jmccrohan commented 7 months ago

Thanks @sofkaski for identifying this and for creating the PR. Happy to accept any linter fixes as a separate PR.

@githubDante thanks for reviewing.

Merged.