norberts1 / hometop_HT3

Pimp your heater.
GNU General Public License v3.0

Sometimes there are drop outs in the readings #18

Closed Heiko-L closed 1 year ago

Heiko-L commented 2 years ago

Describe the bug: There are some dropouts, especially in the outside temperature readings.

Screenshots: Unterbrechungen (interruptions)

Additional context: The readings from the solar system do not suffer from dropouts.

norberts1 commented 2 years ago

Let me ask you some questions:

  1. Which hardware are you using (ht_pitiny/ht_piduino or one of the passive adapters)?
  2. Which interface are you using (sqlite-db or mqtt messages)? Your screenshot is not an rrdtool graphic, so perhaps you are using Home Assistant or similar.

I need this information to locate the part of the software responsible for this behaviour. Hopefully you are running the mqtt and sqlite interfaces in parallel; then you can check the relevant outside-temperature value in the sqlite database.

Heiko-L commented 2 years ago

You are right, the screenshots are from Home Assistant.

I'm running a headless RPi 4 with dtoverlay=uart4 and enable_uart=0 in config.txt. As the hardware interface I'm using the HT3_USB_MicroAdapter. I disabled both databases and I'm using the direct async port connection, with only MQTT and Home Assistant as the "interface".

Heiko-L commented 2 years ago

Update: I changed from ASYNC to SOCKET and I enabled the sqlite DB. I can also tunnel the socket through ssh so that I can use HT3_Systemstatus.py and HT3_Analyser.py on a remote machine.

Heiko-L commented 2 years ago

Update 2: the dropouts are also in HT3_db.sqlite. These are the hexdumps where "T_aussen" drops to 0:

25_0 :HG :90 00 19 00 30 00 b0 00 21 00 10 00 90 00 29 00 30 00 b0 00 88 00 07 00

25_0 :HG :90 00 19 00 30 00 b0 00 21 00 29 00 10 00 90 00 30 00 b0 00 31 00 39 00 10 00 90 00 30 00 b0 00 ff 00

7_0 :HG :88 00 07 00 03 01 00 00 00 01 00 00 00 00 00 00 00 00 00 df 00

36_0 :HG :90 00 24 00 30 00 b0 00 2c 00 10 00 90 00 34 00 30 00 b0 00 3c 00 44 00 10 00 90 00

22_0 :HG :88 10 16 00 ff 32 ec 00

25_0 :HG :90 00 19 00 30 00 b0 00 21 00 29 00 10 00 90 08 35 01 00

36_0 :HG :90 00 24 00 30 00 b0 00 2c 00 10 00 90 00 34 00 30 00 b0 00 3c 00 44 00 10 00 90 00

22_0 :HG :88 10 16 00 ff 32 ec 00

7_0 :HG :88 00 07 00 03 01 00 00 00 01 00 00 00 00 00 00 00 00 00 df 00

25_0 :HG :90 00 19 00 30 00 b0 00 ff 00 00 03 00 00 00 00 00 8f 01 b1 00 00

7_0 :HG :88 00 07 00 03 01 00 00 00 01 00 00 00 00 00 00 00 00 00 df 00

norberts1 commented 2 years ago

The message ID for "T_aussen" is 25_0, and in the faulty cases the value at bytes 4 and 5, (3000)hex, is outside the valid range of 0 ... 100 degrees. The reason is wrong detection of the message: the break at the end of the message stream (the BREAK signal) isn't detected correctly. This occurs mainly with the passive ht_adapters, which have no serial BREAK-signal detection. The serial driver then always translates the BREAK signal to zero (0).
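To make the range argument concrete, here is a minimal sketch (not code from the project) that reads bytes 4 and 5 of a 25_0 hexdump and checks the result against the 0 ... 100 degree range mentioned above. The big-endian signed 16-bit layout and the 0.1-degree scaling are assumptions made for illustration, and the helper names are hypothetical.

```python
import struct


def parse_hexdump(text: str) -> bytes:
    """Convert a space-separated hexdump such as '90 00 19' into bytes."""
    return bytes.fromhex(text)


def outside_temp_raw(message: bytes) -> int:
    """Read bytes 4 and 5 (zero-based) as one number.

    Assumption: big-endian, signed 16-bit.
    """
    (raw,) = struct.unpack_from(">h", message, 4)
    return raw


def is_plausible(raw: int, scale: float = 0.1,
                 low: float = 0.0, high: float = 100.0) -> bool:
    """Check the scaled value against the valid range.

    Assumption: the raw value is in tenths of a degree (scale=0.1).
    """
    return low <= raw * scale <= high


faulty = parse_hexdump(
    "90 00 19 00 30 00 b0 00 21 00 10 00 "
    "90 00 29 00 30 00 b0 00 88 00 07 00"
)
raw = outside_temp_raw(faulty)
print(hex(raw), is_plausible(raw))
```

With the faulty dump, bytes 4 and 5 give 0x3000, which is far outside the range under either scaling.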

Taking one of the sequences above:

25_0 :HG :90 00 19 00 30 00 b0 00 21 00 10 00 90 00 29 00 30 00 b0 00 88 00 07 00

With the BREAKs marked, it reads:

90 <BREAK> <<-- Response from Fxyz/Cxyz module, no new data available
19 <BREAK> <<-- No module available, because there is no answer to the polling
30 <BREAK> <<-- Polling byte to solar module (ISM/MS)
b0 <BREAK> <<-- Response from solar module (ISM/MS) with no new data
....

Currently I always check the starting byte, the message ID, the CRC byte and the terminating zero byte (BREAK) of every message for validity. But this is not always enough to detect a valid message stream with the passive ht_adapters. I'm not really happy with this behaviour and I'll try to make the message-stream detection more robust. The active adapters, ht_pitiny and ht_piduino, don't have this problem because they include BREAK-signal detection.

For test purposes you can modify the default/maxvalue in the config file HT3_db_cfg.xml to other values, like:

<logitem name="T_aussen">
  <datatype>REAL</datatype>
  <datause>GAUGE</datause>
  <maxvalue>100.0</maxvalue>
  <default>20.0</default>
  <unit>Grad</unit>
  <displayname>T-Aussen</displayname>
  <accessname>ch_Toutside</accessname> <!-- name for data-exchange with interfaces -->
</logitem>
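The annotated sequence above can be reproduced with a small sketch that pairs each byte with the zero that follows it. This only illustrates why the buffer looks like a run of single-byte polls and responses; it is not the detection logic used in ht_discode.py, and both function names are hypothetical.

```python
def byte_break_pairs(buffer: bytes):
    """Return (byte, followed_by_zero) for each two-byte step of the buffer."""
    pairs = []
    for i in range(0, len(buffer) - 1, 2):
        pairs.append((buffer[i], buffer[i + 1] == 0))
    return pairs


def looks_like_poll_run(buffer: bytes) -> bool:
    """True if every second byte is zero, i.e. the buffer could be nothing
    but single bytes, each followed by a BREAK that was translated to 0."""
    pairs = byte_break_pairs(buffer)
    return bool(pairs) and all(flag for _, flag in pairs)


faulty = bytes.fromhex(
    "90 00 19 00 30 00 b0 00 21 00 10 00 "
    "90 00 29 00 30 00 b0 00 88 00 07 00"
)
for value, has_break in byte_break_pairs(faulty)[:4]:
    print(f"{value:02x}", "<BREAK>" if has_break else "")
print(looks_like_poll_run(faulty))
```

A real telegram can also contain zero bytes at odd positions, so a check like this could only ever be one hint among several.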

As a result, the outside temperature will be set to 20 degrees for the faulty messages (after modifying the config, restart the ht_collgate.py process). I will check the software for this case, e.g. modifications to 'blacklist' and 'blacksequence', using the preceding values, etc.
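The effect of that config change can be sketched as follows. This is an illustration under assumptions, not the project's code: it assumes a value above maxvalue is simply replaced by default, and load_limits and apply_limits are hypothetical helpers.

```python
import xml.etree.ElementTree as ET

# The logitem from HT3_db_cfg.xml as quoted in the comment above.
LOGITEM_XML = """
<logitem name="T_aussen">
  <datatype>REAL</datatype>
  <datause>GAUGE</datause>
  <maxvalue>100.0</maxvalue>
  <default>20.0</default>
  <unit>Grad</unit>
  <displayname>T-Aussen</displayname>
  <accessname>ch_Toutside</accessname>
</logitem>
"""


def load_limits(xml_text: str):
    """Return (maxvalue, default) of a single logitem element."""
    item = ET.fromstring(xml_text)
    return float(item.findtext("maxvalue")), float(item.findtext("default"))


def apply_limits(value: float, maxvalue: float, default: float) -> float:
    """Assumed behaviour: fall back to the default when value exceeds maxvalue."""
    return default if value > maxvalue else value


maxvalue, default = load_limits(LOGITEM_XML)
print(apply_limits(1228.8, maxvalue, default))  # faulty reading
print(apply_limits(11.8, maxvalue, default))    # plausible reading
```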

Heiko-L commented 2 years ago

"As the result the outside-temperature will be set to 20 degree for that faulty messages (after config-modification restart the ht_collgate.py process). I will check the software for this case like modifications on 'blacklist', 'blacksequence', using the preceding values etc."

So "25_0 :HG :90 00 19 00 30 00 b0 00 21 00 10 00 90 00 29 00 30 00 b0 00 88 00 07 00" is a message with valid CRC? That is very disadvantageous...

Maybe discarding the whole message when detecting a value outside the valid range could be a solution?

norberts1 commented 2 years ago

@Heiko-L "... a message with valid CRC?" Yes.

PS: the hex data stored in the sqlite db are raw data, without checking and rejecting of min/max values.

But anyway, the module byte (90)hex := module ID (10)hex, i.e. the Fxyz/Cxyz controller, should not send outside-temperature values. The outside sensor is connected to the heater controller, and this value should be sent with (88)hex := module ID (08)hex, like this string:

25_0 :HG :88 00 19 00 00 76 80 00 80 00 ff ff 00 41 00 35 c5 0b 96 19 00 00 00 0a 85 0f 00 1c de 80 00 54 00

@Heiko-L "Maybe discarding the whole message when detecting a value outside the valid range could be a solution?"

No, not really. There are different heating/heater systems out there, configured with different temperature sensors or other sensors. Some of them have sensors built in and others don't, but they all use the common messages for communication. The values of unavailable sensors are then set to (8000)hex or (7fff)hex, and these are also outside the valid range.
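A short sketch of the two points above: deriving the module ID from the first byte, and treating (8000)hex and (7fff)hex as "sensor not available" rather than as a reason to discard the message. Masking off the top bit is inferred only from the two examples given (90 -> 10, 88 -> 08). The helper names are hypothetical, and which sensor sits at which offset is not stated in this thread.

```python
# Markers for "sensor not available", as named in the comment above.
NOT_AVAILABLE = {0x8000, 0x7FFF}


def module_id(first_byte: int) -> int:
    """Assumption: the module ID is the first byte without its top bit."""
    return first_byte & 0x7F


def sensor_raw(message: bytes, offset: int):
    """Read a big-endian 16-bit field; return None for a not-available marker."""
    raw = int.from_bytes(message[offset:offset + 2], "big")
    return None if raw in NOT_AVAILABLE else raw


good = bytes.fromhex("88 00 19 00 00 76 80 00 80 00 ff ff 00 41")
print(hex(module_id(good[0])))
print(sensor_raw(good, 4), sensor_raw(good, 6))
```

Here the field at offset 4 yields a usable number while the field at offset 6 is a not-available marker, so rejecting the whole message on any out-of-range value would also throw away the valid field.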

What I'm looking for is a way to suppress this wrong message in the software. This is already done for various invalid messages but must be extended to cover this one.

Hopefully you are using the latest software version from the project and are also running the ht_proxy.py daemon. Then it's easy to record some heater-bus data as a raw-data logfile. You can post one or more of these here, and I can analyse the data and check my software changes for validity. To create the logfile:

  1. Change to the folder ~/HT3/sw on your RPi.
  2. Start logging with: ./ht_binlogclient.py ./var/log/yourlogfilename1.log
     The logfile should cover less than 2 hours, for size reasons. You can start the logging two or more times with a time delay to catch the relevant faulty message in the logfile. Your RPi4 has enough power for doing this :-)

Heiko-L commented 2 years ago

I will record a binary log on a USB stick tonight.

Heiko-L commented 2 years ago

Here is the binary log from last night: binlogclient.log

The system consists of:

ISM1 has one NTC connected to the solar collectors and one NTC connected to the bottom of the SKE 400. The ZSB14 has one NTC connected to the top of the SKE 400. ZSB14's hydraulic is directly connected to the heating circuit and SKE 400 using its internal 3-way valve.

No other external mixers, pumps, valves or anything, just the bare minimum ;)

norberts1 commented 2 years ago

I have analysed your logfile. To fix the faulty handling for this issue and #20, you have to modify the software module ./lib/ht_discode.py. If you have the latest software release, modify black_sequence = {} at lines 3443 to 3452:

    black_sequence = {
        0: [9, 0, 0x89, 0, 0x30, 0, 0xb0, 0, 9, 0, 0x89, 0],
        1: [9, 0, 0x89, 0],
        2: [0x20, 0, 0xa0, 0, 0x21, 0, 0xa1, 0, 0x22, 0, 0xa2, 0],
        3: [0x20, 0, 0xa0, 0, 0x21, 0, 0xa1, 0],
        #4: https://github.com/norberts1/hometop_HT3/issues/20
        4: [0x90, 0, 0x1b, 0, 0x30, 0],
        #5: https://github.com/norberts1/hometop_HT3/issues/18
        5: [0x90, 0, 0x19, 0, 0x30, 0],
    }

Please check it on your system; with HT3_Analyser.py it works so far.
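For readers who want to see how such a table could reject the faulty dumps, here is a minimal sketch. It assumes a black sequence is matched as a prefix of the candidate message buffer; how ht_discode.py actually applies black_sequence is not shown in this thread, and matches_black_sequence is a hypothetical helper.

```python
# Same entries as the black_sequence table quoted above.
BLACK_SEQUENCE = {
    0: [9, 0, 0x89, 0, 0x30, 0, 0xb0, 0, 9, 0, 0x89, 0],
    1: [9, 0, 0x89, 0],
    2: [0x20, 0, 0xa0, 0, 0x21, 0, 0xa1, 0, 0x22, 0, 0xa2, 0],
    3: [0x20, 0, 0xa0, 0, 0x21, 0, 0xa1, 0],
    4: [0x90, 0, 0x1b, 0, 0x30, 0],  # issue #20
    5: [0x90, 0, 0x19, 0, 0x30, 0],  # issue #18
}


def matches_black_sequence(message: bytes, start: int = 0):
    """Return the key of the first black sequence found at `start`, else None.

    Assumption: sequences are compared as a prefix of the buffer.
    """
    for key, sequence in BLACK_SEQUENCE.items():
        if list(message[start:start + len(sequence)]) == sequence:
            return key
    return None


faulty = bytes.fromhex("90 00 19 00 30 00 b0 00 21 00 10 00 90 00")
valid = bytes.fromhex("88 00 19 00 00 76 80 00 80 00 ff ff")
print(matches_black_sequence(faulty), matches_black_sequence(valid))
```

Under this assumption, every faulty 25_0 dump in this issue that begins with 90 00 19 00 30 00 is caught by entry 5, while the valid message starting with 88 is not.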

Heiko-L commented 2 years ago

I'm on git master, but only had "0:" and "1:" at line 3273 in ht_discode.py. Nevertheless I added "2:" to "5:" and restarted collgate. I'll report.

Thank you!

Heiko-L commented 2 years ago

Looks very promising: no dropouts so far.

norberts1 commented 1 year ago

#18 fixed.