zxdavb / ramses_cc

HA integration for CH/DHW and HVAC systems that use the RAMSES II RF protocol
GNU General Public License v3.0

0.31.16 Problem with the serial port: Transport did not initialise #182

Closed jrb80 closed 2 months ago

jrb80 commented 5 months ago

Integration will not start after reboot, HA must be bounced several times until it initialises. Often a warm boot is required. Issue was not observed on earlier releases.

configuration.yaml

ramses_cc:
  serial_port: /dev/serial/by-id/usb-SparkFun_evofw3_atmega32u4-if00

Extract from log file

2024-04-03 21:55:55.042 ERROR (MainThread) [custom_components.ramses_cc] There is a problem with the serial port: Transport did not initialise successfully
2024-04-03 21:55:55.044 ERROR (MainThread) [homeassistant.setup] Setup failed for custom integration 'ramses_cc': Integration failed to initialize.

It could be related to the code change in 0.30.5 (3d2f52e): "ramses_cc will fail early if serial port is mis-configured".

zxdavb commented 5 months ago

Can you describe your system for me, for example:

zxdavb commented 5 months ago

See also #182

jrb80 commented 5 months ago

@zxdavb my suspicion is that it's some type of timing issue or a conflict with another integration during the reboot/initialisation process. I have two identical systems (in different locations) which show the same behaviour. ramses_cc is stable once it successfully initialises.

  • are you using a VM

Raspberry Pi 4 (x64) on Home Assistant Operating System

  • which dongle are you using - from IndaloTech / from other?

Indalo-Tech SSM-D2 (ssm-32u4d2)

  • which type/version of HA are you using?

HA Core 2024.4.2 / HA Supervisor 2024.04.0 / HA Operating System 12.1

Please clarify exactly what you mean by a warm boot?

Settings => Three dots menu => Restart Home Assistant => Advanced options => Reboot system
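
The timing hypothesis above can be checked independently of the integration. The following is a minimal sketch, not part of ramses_cc or ramses_rf: the helper name `wait_for_device` and its parameters are hypothetical, and it only tests whether the by-id device path from configuration.yaml has appeared yet, not whether the port can actually be opened.

```python
import os
import time


def wait_for_device(path: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until `path` exists, or give up after `timeout` seconds.

    Hypothetical helper for diagnosis only: it checks for the presence
    of the device node and does not open or configure the serial port.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_device("/dev/serial/by-id/usb-SparkFun_evofw3_atmega32u4-if00")` shortly after a reboot and noting how long it takes to return would show whether the device node is simply late to appear on the affected systems.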

zxdavb commented 5 months ago

OK, your use-case should definitely work out-of-the box. Lemme see...

zxdavb commented 4 months ago

@jrb80 Are you still using 0.31.16 (configuration.yaml) or have you moved to 0.41.16 (config flow)?

If the latter, please try release 0.41.17.

Otherwise, please try release 0.31.17. If that doesn't work, you may want to consider upgrading to 0.41.17 and trying that.

Let me know how you get on.

jrb80 commented 4 months ago

@jrb80 Are you still using 0.31.16 (configuration.yaml)

Thanks for looking into this! I just upgraded to 0.31.17 (configuration.yaml) and the integration now survives a reboot.

On restart, I now get the following log error, which could indicate the underlying problem. There are a few other warnings, but this is the only error.

Failed to send 2349|RQ|01:255872|03: <ProtocolContext state=WantRply echo=1100|RQ|13:240154, tx_count=4/4>: Send buffer overflow
Failed to send 3EF1|RQ|13:182591: <ProtocolContext state=WantRply echo=1100|RQ|13:240154, tx_count=4/4>: Send buffer overflow
Failed to send 3EF1|RQ|13:088711: <ProtocolContext state=WantRply echo=1100|RQ|13:240154, tx_count=4/4>: Send buffer overflow
Failed to send 3EF1|RQ|13:240154: <ProtocolContext state=WantRply echo=1100|RQ|13:240154, tx_count=4/4>: Send buffer overflow
Failed to send 3EF1|RQ|13:167117: <ProtocolContext state=WantRply echo=1100|RQ|13:240154, tx_count=4/4>: Send buffer overflow
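
All five lines above report the same `echo` value and `tx_count=4/4`, which is easier to see when the lines are grouped. The sketch below is one way to do that; the regular expression is derived only from the lines quoted in this thread, the names `SendFailure`, `parse_send_failure` and `count_by_echo` are hypothetical, and other ramses_rf versions may format these messages differently.

```python
import re
from collections import Counter
from typing import Iterable, NamedTuple, Optional

# Pattern inferred from the log lines in this thread (an assumption,
# not a documented ramses_rf log format).
_PATTERN = re.compile(
    r"Failed to send (?P<header>\S+): "
    r"<ProtocolContext state=(?P<state>\w+) echo=(?P<echo>[^,]+), "
    r"tx_count=(?P<sent>\d+)/(?P<limit>\d+)>: (?P<reason>.+)"
)


class SendFailure(NamedTuple):
    header: str  # the command that could not be sent
    state: str   # protocol state reported in the message
    echo: str    # the command the protocol context refers to
    sent: int    # first number of tx_count
    limit: int   # second number of tx_count
    reason: str


def parse_send_failure(line: str) -> Optional[SendFailure]:
    """Return the parsed fields, or None if the line does not match."""
    m = _PATTERN.search(line)
    if m is None:
        return None
    return SendFailure(
        header=m["header"],
        state=m["state"],
        echo=m["echo"],
        sent=int(m["sent"]),
        limit=int(m["limit"]),
        reason=m["reason"].strip(),
    )


def count_by_echo(lines: Iterable[str]) -> Counter:
    """Count failures per echo value."""
    parsed = (parse_send_failure(line) for line in lines)
    return Counter(f.echo for f in parsed if f is not None)
```

Run over the extract above, `count_by_echo` would put all five failures under `1100|RQ|13:240154`.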

zxdavb commented 4 months ago

On restart, I now get the following log error, which could indicate the underlying problem. There are a few other warnings, but this is the only error.

tx_count=4/4>: Send buffer overflow

Yes/No. The above is merely a symptom of a deeper problem. What are the first errors you're getting?

I am beginning to suspect there is a high error rate between your dongle and the rest of the RF network.

If so, you may be better off disabling QoS.

ramses_cc:
  ramses_rf:
    disable_qos: true

jrb80 commented 4 months ago

ramses_cc:
  ramses_rf:
    disable_qos: true

QoS now disabled. HA log extract from boot-up (I can send full logs on DM if preferred).

RQ --- 18:000730 01:154305 --:------ 30C9 001 00 < QoS is currently disabled by this Protocol
W --- 22:017762 01:154305 --:------ 22C9 006 0007D009F601 < PacketInvalid( W --- 22:017762 01:154305 --:------ 22C9 006 0007D009F601 < Unexpected code for dst to Rx)
Overwrote dtm in index for 30C9| I|01:154305: I --- 01:154305 --:------ 01:154305 2309 018 0001F40101F40201F40301F40401F40501F4
Overwrote dtm in index for 0016|RP|01:154305|02: RQ --- 22:033088 01:154305 --:------ 0016 001 02
Error doing job: Exception in callback PortProtocol.connection_made(PortTransport...0x7f72080b90>), ramses=True)()
Setup failed for custom integration 'ramses_cc': Integration failed to initialize.
Unrecoverable problem with the serial port: Transport did not bind to Protocol within 5 secs
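
The packet lines in the extract above share a fixed layout, so they can be pulled out of the surrounding log text and sanity-checked. This is a minimal sketch based only on the lines shown here: the field names (`verb`, `seqn`, `addr0` to `addr2`, `code`, `length`, `payload`) are my reading of that layout and are assumptions, as is the helper name `parse_packet`. It is not the ramses_rf parser.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Layout inferred from the log extract: verb, sequence field, three
# 9-character addresses, 4-hex-digit code, 3-digit length, hex payload.
_PKT = re.compile(
    r"(?P<verb>RQ|RP|I|W) (?P<seqn>---|\d{3}) "
    r"(?P<addr0>\S{9}) (?P<addr1>\S{9}) (?P<addr2>\S{9}) "
    r"(?P<code>[0-9A-F]{4}) (?P<length>\d{3}) (?P<payload>(?:[0-9A-F]{2})+)"
)


@dataclass(frozen=True)
class Packet:
    verb: str
    seqn: str
    addr0: str
    addr1: str
    addr2: str
    code: str
    length: int
    payload: str

    @property
    def length_ok(self) -> bool:
        """True if the declared length equals the payload size in bytes."""
        return self.length == len(self.payload) // 2


def parse_packet(line: str) -> Optional[Packet]:
    """Find the first packet-shaped substring in a log line, if any."""
    m = _PKT.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["length"] = int(fields["length"])
    return Packet(**fields)
```

For the packets quoted above, the declared length agrees with the payload size in each case (for example, `006` for the twelve hex digits `0007D009F601`), so the serial link appears to be delivering complete frames in this extract.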

jrb80 commented 4 months ago

@jrb80 Upgraded to 0.31.17 (configuration.yaml) and the integration now survives a reboot.

Unfortunately 0.31.17 continued to fail on reboot. I have since rolled back to 0.31.16 since the x.17 series was pulled. However, I did note this new error before rolling back.

Unrecoverable problem with the serial port: Transport did not bind to Protocol within 5 secs
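
For readers unfamiliar with how a "did not bind within N secs" error can arise in asyncio code, here is a generic sketch of the pattern. It is not the ramses_rf implementation: `BindingProtocol` and `wait_until_bound` are hypothetical names, and the only thing taken from this thread is the wording of the error message. The earlier log extract also shows an exception raised inside `PortProtocol.connection_made`, which would be one way for a wait of this kind to never complete.

```python
import asyncio


class BindingProtocol(asyncio.Protocol):
    """Hypothetical protocol that records when a transport is attached."""

    def __init__(self) -> None:
        # Must be created while an event loop is running.
        self.bound: asyncio.Future = asyncio.get_running_loop().create_future()

    def connection_made(self, transport) -> None:
        # If this callback raised before reaching set_result(), the
        # future would never complete and the waiter would time out.
        if not self.bound.done():
            self.bound.set_result(transport)


async def wait_until_bound(protocol: BindingProtocol, timeout: float = 5.0):
    """Return the transport once bound, or raise after `timeout` seconds."""
    try:
        return await asyncio.wait_for(protocol.bound, timeout)
    except asyncio.TimeoutError as err:
        raise RuntimeError(
            f"Transport did not bind to Protocol within {timeout:g} secs"
        ) from err
```

In a design like this, a fixed timeout trades a fast, clear failure for sensitivity to slow startups, which fits the observation that the problem shows up mainly right after a reboot.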

jrb80 commented 4 months ago

@zxdavb fyi, I've rolled back one of my systems to v0.21.40 and it restarts without a hitch. I suspect the refactoring after this point has something to do with the serial issue? I will continue to test the system on v0.21.40 to confirm this is true.

zxdavb commented 4 months ago

I am not sure if 0.x.20 will solve your problem, but some work has been done in that area.

The issue is that I cannot come up with a reliable way. Try the new version & let me know.

jrb80 commented 4 months ago

I am not sure if 0.x.20 will solve your problem, but some work has been done in that area.

Thank you! I've upgraded both systems to x.20 and so far no startup failures, I will continue to monitor over the coming weeks.