home-assistant / core

Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Command PR with value A raised exception #18954

Closed bartandeweg closed 5 years ago

bartandeweg commented 5 years ago

Home Assistant release with the issue:

0.83.2

Last working Home Assistant release (if known): 0.81.0

Operating environment (Hass.io/Docker/Windows/etc.): Hassio

Component/platform: Opentherm Gateway @mvn23

Description of problem: The OpenTherm Gateway connects as it is supposed to, but after a while (sometimes just 1 minute, sometimes after a few hours) the connection is lost.

Getting this error in the log: 2018-12-03 08:49:26 ERROR (MainThread) [pyotgw.pyotgw] Command PR with value A raised exception: Syntax Error: The command contained an unexpected character or was incomplete.

Problem-relevant configuration.yaml entries and (fill out even if it seems unimportant):

device: /dev/ttyUSB0
climate:
  name: CV Thermostaat
  precision: 0.5
  floor_temperature: True
monitored_variables:
  - room_setpoint
  - room_temp
  - dhw_setpoint
  - dhw_temp

Traceback (if applicable):

Additional information:

mvn23 commented 5 years ago

Thanks for the report. I haven't seen this behavior myself, so I'd like to get some more info.

The usual reason for this error message is that a malformed command reached the gateway. Normal behavior in such a case is to log an error and ignore the command that was given. The platform should continue to behave normally after such an event (except that the issued command has no effect). I am planning to implement retry functionality for failed commands and reconnect for lost connections, but haven't had a lot of time for it recently.

bartandeweg commented 5 years ago

Hi there,

  1. The gateway is connected through USB using a serial-over-Ethernet connector.
  2. The error message (PR with value A) shows up after restarting Hassio. After a few hours the connection is just lost without any message.
  3. After this error (PR with value A) the platform is still usable. I will check the log when the platform becomes unresponsive again.

bartandeweg commented 5 years ago

I checked the logs, but no message appears when the connection is lost.

mvn23 commented 5 years ago

There have been more reports of lost/bad connectivity when using this platform over a network connection, mostly related to esp8266 devices. I took some time to implement command retry and inactivity reconnect logic in the library that is being used. I have tested it for functionality with syntax errors and lost connections and I have been test driving it in my home setup for more than a day now (without any retries/reconnects) so it should be usable. I would like to test it some more before pushing any updates, but if you want you can try out the new code at mvn23/pyotgw:retry_reconnect.
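The retry idea described above can be sketched roughly as follows. This is only an illustration, not the code in the retry_reconnect branch: `send_with_retry`, `CommandError` and the `send` coroutine are hypothetical names, and the retry count and 10-second timeout are assumptions.

```python
import asyncio


class CommandError(Exception):
    """Hypothetical error for a gateway reply such as a Syntax Error."""


async def send_with_retry(send, command, value, retries=3, timeout=10.0):
    """Await the hypothetical coroutine send(command, value), retrying on failure.

    Retries when the gateway reports an error or does not answer in time,
    and re-raises the last error once all attempts are used up.
    """
    last_error = None
    for _attempt in range(retries):
        try:
            return await asyncio.wait_for(send(command, value), timeout)
        except (CommandError, asyncio.TimeoutError) as err:
            last_error = err  # remember the failure and try again
    raise last_error
```

A reconnect after inactivity would sit one level above this: if nothing has been received for some time, close the connection and open it again before sending further commands.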

bartandeweg commented 5 years ago

Where should I put the new code?

mvn23 commented 5 years ago

Just overwrite the existing .py files. I'm not sure where they are placed on Hassio, but in my virtualenv install on debian it's under <venv>/lib/python3.5/site-packages/pyotgw/. To revert to the distributed version, just remove that pyotgw folder and the pyotgw-0.3b1.dist-info folder next to it and restart Home Assistant.
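If the install location is unknown, Python itself can report it. The snippet below is a small sketch and has to be run with the same Python interpreter that runs Home Assistant; `package_dir` is a helper name made up for this example.

```python
import importlib.util
from pathlib import Path


def package_dir(name):
    """Return the directory an importable package lives in, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None  # not installed, or no file location available
    return Path(spec.origin).parent


# Prints the pyotgw folder, or None if pyotgw is not installed here.
print(package_dir("pyotgw"))
```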

bartandeweg commented 5 years ago

I will wait until it is included in an update. I can't find the files directly in Hassio.

Jorei commented 5 years ago

> There have been more reports of lost/bad connectivity when using this platform over a network connection, mostly related to esp8266 devices. I took some time to implement command retry and inactivity reconnect logic in the library that is being used. I have tested it for functionality with syntax errors and lost connections and I have been test driving it in my home setup for more than a day now (without any retries/reconnects) so it should be usable. I would like to test it some more before pushing any updates, but if you want you can try out the new code at mvn23/pyotgw:retry_reconnect.

Hi mvn23, I had the exact same error bartandeweg described. Trying your 'retry_reconnect' branch solved the problem for me!

EdmarS commented 5 years ago

I have the same issue. I have the OTGW connected with a NodeMCU and hooked up with HASSIO using otmonitor. otmonitor is working fine and I receive the information through my MQTT broker (events/central_heating/otmonitor/#). However, I am not able to change the setpoint temperature using MQTT (actions/central_heating/otmonitor/setpoint, but maybe this is the wrong action command?).

Now I'm trying to use opentherm_gw, but without success. The log shows the following errors:

2018-12-28 17:59:45 ERROR (MainThread) [pyotgw.pyotgw] Command PR with value A raised exception: Syntax Error: The command contained an unexpected character or was incomplete.
2018-12-28 17:59:46 ERROR (MainThread) [pyotgw.pyotgw] Command PR with value B raised exception: Syntax Error: The command contained an unexpected character or was incomplete.
2018-12-28 17:59:56 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: C. Are you connecting to the OpenTherm Gateway?
2018-12-28 18:00:06 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: G. Are you connecting to the OpenTherm Gateway?
2018-12-28 18:00:16 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: I. Are you connecting to the OpenTherm Gateway?
2018-12-28 18:00:26 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: L. Are you connecting to the OpenTherm Gateway?
2018-12-28 18:00:36 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: M. Are you connecting to the OpenTherm Gateway?
2018-12-28 18:00:46 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: O. Are you connecting to the OpenTherm Gateway?
2018-12-28 18:00:56 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: P. Are you connecting to the OpenTherm Gateway?
2018-12-28 18:01:06 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: R. Are you connecting to the OpenTherm Gateway?
2018-12-28 18:01:16 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: S. Are you connecting to the OpenTherm Gateway?
2018-12-28 18:01:26 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: T. Are you connecting to the OpenTherm Gateway?
2018-12-28 18:01:36 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: V. Are you connecting to the OpenTherm Gateway?
2018-12-28 18:01:46 ERROR (MainThread) [pyotgw.pyotgw] Timed out waiting for command: PR, value: W. Are you connecting to the OpenTherm Gateway?
2018-12-28 18:01:46 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/homeassistant/components/opentherm_gw/__init__.py", line 136, in connect_and_subscribe
    await gateway.connect(hass.loop, device_path)
  File "/usr/local/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 60, in connect
    await self.get_reports()
  File "/usr/local/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 196, in get_reports
    ovrd_mode = str.upper(reports[OTGW_REPORT_SETPOINT_OVRD][0])
TypeError: 'NoneType' object is not subscriptable

This is my configuration part:

opentherm_gw:
  device: socket://192.168.130.50:6638
  climate:
    name: Thermostat2
    precision: 0.5
    floor_temperature: true
  monitored_variables:
    - ch_water_temp
    - control_setpoint
    - dhw_setpoint
    - otgw_about
    - otgw_mode
    - otgw_setback_temp
    - otgw_setpoint_ovrd_mode
    - room_setpoint
    - room_setpoint_ovrd
    - room_temp

This is otmonitor.conf which seems to work fine using the same NodeMCU address/port:

web {
  enable true
  port 84
  nopass true
  sslport 0
  graphlegend false
  theme default
  sslprotocols tls1,tls1.1,tls1.2
  certonly false
}
connection {
  type tcp
  enable true
  port 6638
  host 192.168.130.50
  device {}
}
server {
  enable false
  port 7686
  relay true
}
mqtt {
  enable true
  broker 192.168.130.40
  port 1883
  devicetype central_heating
  deviceid otmonitor
  format raw
  retransmit 10
  qos 1
  keepalive 120
  messages true
  client otgw
  username {}
  password {}
}

Any clue what I am doing wrong or should change? I am using HASSIO version 0.84.6

mvn23 commented 5 years ago

> 2018-12-28 17:59:45 ERROR (MainThread) [pyotgw.pyotgw] Command PR with value A raised exception: Syntax Error: The command contained an unexpected character or was incomplete.
> 2018-12-28 17:59:46 ERROR (MainThread) [pyotgw.pyotgw] Command PR with value B raised exception: Syntax Error: The command contained an unexpected character or was incomplete.

These lines indicate that the commands are getting mangled before they reach the gateway. This could indicate a more serious (possibly hardware) issue and could also be the reason why MQTT commands don't work. Please verify that the gateway behaves as expected when accessing the NodeMCU via telnet and/or follow the troubleshooting steps on the OpenTherm Gateway website.

Apart from that, I assume you completely stopped all other processes that are accessing the NodeMCU before starting Home Assistant with opentherm_gw enabled. If not, please make sure opentherm_gw has the connection all to itself.

The firmware on the NodeMCU may also be causing issues. Unfortunately you did not provide a firmware and version in your post, but ESPEasy claims support from R124 onwards.

After doing the above, the component should initialize without any problems.

EdmarS commented 5 years ago

Thanks for taking the time to answer my comment.

I'm pretty sure my NodeMCU / OTGW combo works fine: with otmonitor everything works, including changing the setpoint through the web interface. The only thing I could not do was change the setpoint using MQTT, probably because I was using the wrong MQTT topic. The NodeMCU runs the latest ESP Easy Mega: mega-20181220.

Telnetting to the NodeMCU gives the following output, which looks fine to me:

telnet 192.168.130.50 6638
Trying 192.168.130.50...
Connected to 192.168.130.50.
Escape character is '^]'.
SE
T90011780
B50011780
T00110000
BC0110000
T80190000
B40193CD7
T00050000
BC00500FF
T80000200
BC0000240
T90011780
B50011780
T00110000
BC0110000
T80190000
B40193CCC
...

This looks similar to the 'messages' output of otmonitor.
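For anyone who wants to read such output, here is a rough sketch of how the lines can be split into fields. It assumes the standard OpenTherm 32-bit frame layout (parity bit, 3-bit message type, data ID byte, 16-bit value) and that the gateway prefixes each frame with T, B, R or A; `decode_line` and `f88` are names made up for this example and are not part of pyotgw or otmonitor.

```python
import re

# One prefix letter followed by 8 hex digits (prefix set is an assumption).
FRAME_RE = re.compile(r"^([TBRA])([0-9A-F]{8})$")

# OpenTherm message types, indexed by the 3-bit type field.
MSG_TYPES = [
    "READ-DATA", "WRITE-DATA", "INVALID-DATA", "RESERVED",
    "READ-ACK", "WRITE-ACK", "DATA-INVALID", "UNKNOWN-DATAID",
]


def decode_line(line):
    """Split one gateway line into its fields, or return None (e.g. for 'SE')."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    source, hexword = match.groups()
    word = int(hexword, 16)
    return {
        "source": source,
        "msg_type": MSG_TYPES[(word >> 28) & 0x7],  # bits 30..28
        "data_id": (word >> 16) & 0xFF,             # bits 23..16
        "value": word & 0xFFFF,                     # bits 15..0
    }


def f88(value):
    """Interpret a 16-bit value as signed fixed point with 8 fractional bits."""
    if value >= 0x8000:
        value -= 0x10000
    return value / 256


frame = decode_line("T90011780")
print(frame["msg_type"], frame["data_id"], f88(frame["value"]))
# prints: WRITE-DATA 1 23.5
```

Whether a value should be read as fixed point or as flags depends on the data ID, so `f88` only makes sense for the temperature-style IDs.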

I have rebooted all machines including the NodeMCU, made sure no other process is accessing the NodeMCU and restarted HASSIO. The timeout errors have disappeared, but I still get the following (slightly different) errors:

2018-12-28 21:26:24 ERROR (MainThread) [pyotgw.pyotgw] Command PR with value A raised exception: Syntax Error: The command contained an unexpected character or was incomplete.
2018-12-28 21:26:24 ERROR (MainThread) [pyotgw.pyotgw] Command PR with value B raised exception: Syntax Error: The command contained an unexpected character or was incomplete.
2018-12-28 21:26:28 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/homeassistant/components/opentherm_gw/__init__.py", line 136, in connect_and_subscribe
    await gateway.connect(hass.loop, device_path)
  File "/config/deps/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 60, in connect
    await self.get_reports()
  File "/config/deps/lib/python3.6/site-packages/pyotgw/pyotgw.py", line 202, in get_reports
    OTGW_GPIO_B: int(reports[OTGW_REPORT_GPIO_FUNCS][1]),
ValueError: invalid literal for int() with base 10: ' '

mvn23 commented 5 years ago

Have you disabled serial logging in ESPEasy?

EdmarS commented 5 years ago

By default logging is disabled in the ESPEasy Mega firmware. Enabling the logging made the NodeMCU unresponsive. I decided to downgrade the firmware to R147. I still get the same errors in the home-assistant.log.

The ESPEasy logging shows a serial buffer full error:


1025740 : Ser2N: serial buffer full!
1025886 : Ser2N: S>: SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE 
1025915 : Ser2N: serial buffer full!
1026061 : Ser2N: S>: E SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE 
1026090 : Ser2N: serial buffer full!
1026236 : Ser2N: S>: E SE SE SE SE SE SE SE SE T90011C99 SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE S
1026288 : WD : Uptime 17 ConnectFailures 0 FreeMem 25576
1026317 : Ser2N: serial buffer full!
1026463 : Ser2N: S>: E SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE 
1026570 : Ser2N: S>: SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE S

mvn23 commented 5 years ago

All these SEs mean the gateway is receiving garbage on its serial connection. I have tried to reproduce the issue with a NodeMCU running ESPEasy mega-20181220 and while I was initially having some issues getting it to work (no data being transferred at all between serial and network, only being able to receive data but unable to send commands), it has now been running stable for the last 2 days. Unfortunately I don't know what exactly caused it to work after the initial issues.
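The two tracebacks earlier in this thread (a TypeError on None and a ValueError on ' ') show what happens when a missing or garbled report value is parsed without checks. A defensive parse could look roughly like the sketch below. The helper names and the dictionary keys are hypothetical and this is not the actual pyotgw code.

```python
def report_char(reports, key, index=0):
    """Return the upper-cased character at `index` of a report, or None.

    Handles a missing key, a None value (command failed or timed out)
    and a reply that is shorter than expected.
    """
    value = reports.get(key)
    if not value or len(value) <= index:
        return None
    return value[index].upper()


def report_int(reports, key, index):
    """Return the digit at `index` of a report as an int, or None."""
    char = report_char(reports, key, index)
    if char is None or char not in "0123456789":
        return None  # e.g. a blank from a garbled reply
    return int(char)


# Hypothetical report values: one garbled, one missing entirely.
reports = {"gpio_funcs": "1 ", "setpoint_ovrd": None}
print(report_int(reports, "gpio_funcs", 1))   # prints: None
print(report_char(reports, "setpoint_ovrd"))  # prints: None
```

Returning None instead of raising lets the caller decide whether to retry the report or carry on with partial data.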

EdmarS commented 5 years ago

For now I went back to my original otmonitor/MQTT setup, which does communicate fine with the NodeMCU serial interface, although I still see some serial buffer full! errors in the logging. I will give it another try next week. Anyway, thank you for your work and help!

Martinvdm commented 5 years ago

> There have been more reports of lost/bad connectivity when using this platform over a network connection, mostly related to esp8266 devices. I took some time to implement command retry and inactivity reconnect logic in the library that is being used. I have tested it for functionality with syntax errors and lost connections and I have been test driving it in my home setup for more than a day now (without any retries/reconnects) so it should be usable. I would like to test it some more before pushing any updates, but if you want you can try out the new code at mvn23/pyotgw:retry_reconnect.

Do you still have this release? Same issue here, but I see HA 0.85 will ship pyotgw 0.4b0 and not 0.4b1. Is this correct?

mvn23 commented 5 years ago

From the 0.85 rc branch: https://github.com/home-assistant/home-assistant/blob/6fb8378b459b6e7cb25395440795870f62c1b9fe/homeassistant/components/opentherm_gw/__init__.py#L107
0.4b1 includes the retry_reconnect improvements.