tspopp / AquaMQTT

Monitor and control your Groupe Atlantic (Explorer, Aquawin,...) heat pump using MQTT
Apache License 2.0

feat(protocol): support heatpumps with new protocol E/E #52

Open tspopp opened 3 weeks ago

tspopp commented 3 weeks ago

Implementing another heat pump protocol

Help

:warning: This PR needs help from the community, as I don't have a machine with the new protocol :warning:

Missing Features / Open Tasks:

Status Quo:

This branch contains a modified version of AquaMQTT which is meant to be installed in LISTENER mode. It is currently able to identify hmi, main and energy messages from the new protocol #45. These are published to MQTT on three topics:

aquamqtt/hmi/debug
aquamqtt/main/debug
aquamqtt/energy/debug

The new protocol findings are documented within PROTOCOL_NEXT.md

Tracing

Using the debug python script https://github.com/tspopp/AquaMQTT/blob/main/tools/debug.py you are able to record changing messages over time and identify which location holds what kind of attribute.

Create the python environment

python3 -m venv venv
source venv/bin/activate
pip install paho-mqtt

Run the python script

source venv/bin/activate
python3 debug.py

How to help here?

tspopp commented 3 weeks ago

@taloriko I've already added parsing of time and date from the hmi message. You may want to check if the values are correct.

taloriko commented 3 weeks ago

Okay, that was my first PR that I used, but it seems to be working quite well.

These new topics were generated:

[screenshot]

Check Time and Date:

HMI set to 17:00 --> OK [screenshot]

HMI set to 16:00 --> OK [screenshot]

HMI set to 04 November --> Day OK | Month --> NOK | Year --> OK [screenshot]

I initially tried to locate the bytes for HMI debugging. At first glance, the byte positions are similar to the previous ones. Here is the initial list of what I was able to reproduce on the HMI in a short amount of time.

HMI Debug

  1. Byte ????

  2. Byte --> target temperature --> HEX
     • 40 in hex: 28
     • 41 in hex: 29
     • 42 in hex: 2A
     • 43 in hex: 2B
     • 44 in hex: 2C

In this dump, I set the target temperature from 40°C to 44°C in individual increments:

2228110000001000062C01D0020000000B44312310000000004E450000060422013E
2228110000001000062C01D0020000000C44312310000000004E450000060422013E
2228110000001000062C01D0020000000D44312310000000004E450000060422013E
2228110000001000062C01D0020000000E44312310000000004E450000060422013E
2228110000001000062C01D0020000000F44312310000000004E450000060422013E
2228110000001000062C01D0020000001044312310000000004E450000060422013E
2229110000001000062C01D0020000001144312310000000004E450000060422013E
2229110000001000062C01D0020000001244312310000000004E450000060422013E
2229110000001000062C01D0020000001344312310000000004E450000060422013E
222A110000001000062C01D0020000001444312310000000004E450000060422013E
222A110000001000062C01D0020000001544312310000000004E450000060422013E
222A110000001000062C01D0020000001644312310000000004E450000060422013E
222B110000001000062C01D0020000001644312310000000004E450000060422013E
222B110000001000062C01D0020000001744312310000000004E450000060422013E
222B110000001000062C01D0020000001844312310000000004E450000060422013E
222B110000001000062C01D0020000001944312310000000004E450000060422013E
222C110000001000062C01D0020000001944312310000000004E450000060422013E
222C110000001000062C01D0020000001A44312310000000004E450000060422013E
222C110000001000062C01D0020000001B44312310000000004E450000060422013E
222C110000001000062C01D0020000001C44312310000000004E450000060422013E
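The mapping can be checked directly against the dump. This is only a minimal sketch, assuming the target temperature is the second byte of the hex string (index 1 when counting from 0) and is a plain unsigned value in °C; `target_temperature` is a hypothetical helper, not something that exists in AquaMQTT.

```python
def target_temperature(hex_frame: str) -> int:
    """Return byte 1 of a hex-encoded HMI frame (assumed to be the target temperature in °C)."""
    raw = bytes.fromhex(hex_frame)
    return raw[1]


# First and last frame of the dump above.
first = "2228110000001000062C01D0020000000B44312310000000004E450000060422013E"
last = "222C110000001000062C01D0020000001C44312310000000004E450000060422013E"

print(target_temperature(first), target_temperature(last))  # 40 44
```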

  1. Byte --> Operating mode

10 = timer operation | Auto
11 = timer operation | Eco/Manual, Eco=active
12 = timer operation | Eco/Manual, Eco=inactive
15 = timer operation | Absence
19 = timer operation | Boost
40 = continuous operation | Auto
41 = continuous operation | Eco/Manual, Eco=active
42 = continuous operation | Eco/Manual, Eco=inactive
45 = continuous operation | Absence
49 = continuous operation | Boost

In this dump, I switched the operating mode under "timer operation" from Auto --> Eco/Manual, Eco=inactive --> Eco/Manual, Eco=active:

2200100000001000062C01D0020000002F44313510000000004E450000060422013E
2200100000001000062C01D0020000003044313510000000004E450000060422013E
2200100000001000062C01D0020000003144313510000000004E450000060422013E
222B120000001000062C01D0020000003244313510000000004E450000060422013E
222B120000001000062C01D0020000003344313510000000004E450000060422013E
222B120000001000062C01D0020000003444313510000000004E450000060422013E
222B120000001000062C01D0020000003544313510000000004E450000060422013E
222B120000001000062C01D0020000003644313510000000004E450000060422013E
222B120000001000062C01D0020000003744313510000000004E450000060422013E
222B120000001000062C01D0020000003844313510000000004E450000060422013E
222B120000001000062C01D0020000003944313510000000004E450000060422013E
222B110000001000062C01D0020000003944313510000000004E450000060422013E
222B110000001000062C01D0020000003A44313510000000004E450000060422013E
222B110000001000062C01D0020000003B44313510000000004E450000060422013E
222B110000001000062C01D0020000000044313610000000004E450000060422013E

If the mode is set to auto, the target temperature is 00, and I cannot see any on the HMI either.
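The ten operating-mode values look as if the two hex digits carry independent information. The sketch below assumes exactly that: the high nibble selects timer or continuous operation, and the low nibble selects the mode. That split is an interpretation of the list, not something confirmed in the thread, and `decode_operating_mode` is a hypothetical name.

```python
# Assumed meaning of the high nibble (first hex digit).
SCHEDULE = {0x1: "timer operation", 0x4: "continuous operation"}

# Assumed meaning of the low nibble (second hex digit).
MODE = {
    0x0: "Auto",
    0x1: "Eco/Manual, Eco=active",
    0x2: "Eco/Manual, Eco=inactive",
    0x5: "Absence",
    0x9: "Boost",
}


def decode_operating_mode(value: int) -> tuple[str, str]:
    """Split the operating-mode byte into (schedule, mode) using the tables above."""
    schedule = SCHEDULE.get(value >> 4, "unknown")
    mode = MODE.get(value & 0x0F, "unknown")
    return schedule, mode


print(decode_operating_mode(0x12))  # ('timer operation', 'Eco/Manual, Eco=inactive')
```

Values that are not in the list decode to "unknown" instead of raising, which is convenient while the protocol is still being mapped.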

  1. Byte --> air connection / anti-legionella

0X --> recirculated air
1X --> one air connection
2X --> two air connections
X0 --> anti-legionella --> inactive
X1 --> anti-legionella --> 1 month cycle
X2 --> anti-legionella --> 2 month cycle
X3 --> anti-legionella --> 3 month cycle
X4 --> anti-legionella --> 4 month cycle

  1. Byte --> emergency heating

00 --> inactive
01 --> active

  1. Byte --> heating element

00 --> Automatic
04 --> locked
06 --> PV System <> Yes

I hope the initial steps are okay for you.

tspopp commented 3 weeks ago

Yes, looks great so far. I will incorporate the findings later. One more hint, if you use the python script, you will get dumps in hex and dec representation. It is WAY easier to spot the changed attributes. For example:

e.g. in dec

2024-11-04 18:39:09.587725,34 50 18 0 0 0 16 0 6 44 1 208 2 0 0 0 0 33 30 13 12 0 0 0 0 78 69 0 0 6 4 34 1 62
2024-11-04 18:39:10.135443,34 50 18 0 0 0 16 0 6 44 1 208 2 0 0 0 2 33 30 13 12 0 0 0 0 78 69 0 0 6 4 34 1 62

or in hex

2024-11-04 18:39:09.587725,2232120000001000062C01D00200000000211E0D0C000000004E450000060422013E
2024-11-04 18:39:10.135443,2232120000001000062C01D00200000002211E0D0C000000004E450000060422013E
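For anyone who wants to automate the comparison, here is a small sketch that reports which byte positions differ between two rows of the decimal dump. It assumes the timestamp and comma have already been stripped from each row; `changed_positions` is a hypothetical helper and not part of debug.py.

```python
def changed_positions(before: str, after: str) -> list[tuple[int, int, int]]:
    """Return (index, old, new) for every byte that differs between two decimal rows."""
    a = [int(v) for v in before.split()]
    b = [int(v) for v in after.split()]
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# The two decimal rows from the example above, without the timestamp.
row_a = "34 50 18 0 0 0 16 0 6 44 1 208 2 0 0 0 0 33 30 13 12 0 0 0 0 78 69 0 0 6 4 34 1 62"
row_b = "34 50 18 0 0 0 16 0 6 44 1 208 2 0 0 0 2 33 30 13 12 0 0 0 0 78 69 0 0 6 4 34 1 62"

print(changed_positions(row_a, row_b))  # [(16, 0, 2)]
```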

I think you've missed the operation modes BOOST and Absence/Vacation. Once we have these, the operation modes are complete :tada:

tspopp commented 3 weeks ago

I updated the implementation based on your findings. Please have a look at PROTOCOL_NEXT.md. You might also try to edit that file directly, or add comments to the document by reviewing this PR. That would actually be a nice way of doing this :grimacing:

It might also help to look at the pre-existing PROTOCOL.md, since I expect a lot of similarities :)

taloriko commented 3 weeks ago

I can't edit the PROTOCOL_NEXT.md file directly. I'm using GitHub for the first time and hope that the PR Patch 1 is suitable.

I'm currently stuck with the timer. The changes start at index 9, counting from byte 0.

I have a few examples:

Timer 07:00-12:00 | 16:00-22:00 | 11h

34 48 18 0 0 0 48 0 6 192 3 124 1 0 0 0 34 69 49 47 16 0 0 0 0 78 69 0 0 164 1 44 1 62

Timer 08:00-12:00 | 16:00-22:00 | 10h

34 43 17 0 0 0 16 0 6 192 3 104 1 0 0 0 7 68 49 3 18 0 0 0 0 78 69 0 0 224 1 240 0 62

Timer 09:00-13:00 | 17:00-23:00 | 10h

34 43 17 0 0 0 16 0 6 252 3 104 1 0 0 0 58 68 49 8 18 0 0 0 0 78 69 0 0 28 2 240 0 62

Timer 01:00-05:00 | 10:00-18:00 | 12h

34 48 18 0 0 0 48 0 6 88 2 224 1 0 0 0 54 69 49 27 16 0 0 0 0 78 69 0 0 60 0 240 0 62

Timer 01:00-06:00 | 10:00-18:00 | 13h

34 48 18 0 0 0 48 0 6 88 2 224 1 0 0 0 12 69 49 35 16 0 0 0 0 78 69 0 0 60 0 44 1 62

Timer 00:00-06:00 | 10:00-18:00 | 14h

34 48 18 0 0 0 48 0 6 88 2 224 1 0 0 0 38 69 49 37 16 0 0 0 0 78 69 0 0 60 0 44 1 62
taloriko commented 3 weeks ago

To narrow down the results more precisely, I only changed the beginning of the first time window and observed the respective outcome.

Byte 9 could represent the minutes since 00:00. Up to 4:00, the minutes fit into the first byte. At 5:00, Byte 10 comes into play: 44 and 1 → (256 × 1) + 44 = 300 minutes, and 300 / 60 = 5 hours. At 6:00, however, it no longer fits: (256 × 3) + 192 = 960 minutes, and 960 / 60 = 16 hours.

From 6:00, it looks like it represents minutes until midnight.

At 7:00, it doesn’t fit anymore – or am I calculating it incorrectly?

Starting from 6 a.m., it seems that Byte 9 and 10 switch with Byte 29 and 30, and thus the time windows change. Or my assumption may be incorrect.

00:00-12:00 | 16:00-22:00 | 18h,22,48,18,0,0,0,48,0,2,0,0,208,2,0,0,0,40,70,49,54,16,0,0,0,0,78,69,0,0,192,3,104,1
01:00-12:00 | 16:00-22:00 | 17h,22,48,18,0,0,0,48,0,2,60,0,148,2,0,0,0,7,70,49,55,16,0,0,0,0,78,69,0,0,192,3,104,1
02:00-12:00 | 16:00-22:00 | 16h,22,48,18,0,0,0,48,0,2,120,0,88,2,0,0,0,47,70,49,55,16,0,0,0,0,78,69,0,0,192,3,104,1
03:00-12:00 | 16:00-22:00 | 15h,22,48,18,0,0,0,48,0,2,180,0,28,2,0,0,0,34,70,49,56,16,0,0,0,0,78,69,0,0,192,3,104,1
04:00-12:00 | 16:00-22:00 | 14h,22,48,18,0,0,0,48,0,2,240,0,224,1,0,0,0,15,70,49,57,16,0,0,0,0,78,69,0,0,192,3,104,1
05:00-12:00 | 16:00-22:00 | 13h,22,48,18,0,0,0,48,0,2,44,1,164,1,0,0,0,57,70,49,57,16,0,0,0,0,78,69,0,0,192,3,104,1
06:00-12:00 | 16:00-22:00 | 12h,22,48,18,0,0,0,48,0,2,192,3,104,1,0,0,0,43,70,49,58,16,0,0,0,0,78,69,0,0,104,1,104,1
07:00-12:00 | 16:00-22:00 | 11h,22,48,18,0,0,0,48,0,2,192,3,104,1,0,0,0,30,70,49,59,16,0,0,0,0,78,69,0,0,164,1,44,1
08:00-12:00 | 16:00-22:00 | 10h,22,48,18,0,0,0,48,0,2,192,3,104,1,0,0,0,24,70,49,0,17,0,0,0,0,78,69,0,0,224,1,240,0
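The "minutes since 00:00" reading can be tried on these rows with a few lines of Python. This is a sketch under assumptions: bytes 9/10 and bytes 29/30 are each treated as a little-endian 16-bit minute count, which matches the arithmetic above but is not confirmed. `u16_le` and `as_clock` are hypothetical helper names.

```python
def u16_le(row: list[int], offset: int) -> int:
    """Combine row[offset] (low byte) and row[offset + 1] (high byte) into one value."""
    return row[offset] | (row[offset + 1] << 8)


def as_clock(minutes: int) -> str:
    """Format a minute count as HH:MM."""
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


def parse_row(csv_values: str) -> list[int]:
    return [int(v) for v in csv_values.split(",")]


# Data columns of the 05:00 and 06:00 rows above (label column removed).
row_0500 = parse_row("22,48,18,0,0,0,48,0,2,44,1,164,1,0,0,0,57,70,49,57,16,0,0,0,0,78,69,0,0,192,3,104,1")
row_0600 = parse_row("22,48,18,0,0,0,48,0,2,192,3,104,1,0,0,0,43,70,49,58,16,0,0,0,0,78,69,0,0,104,1,104,1")

for name, row in (("05:00 row", row_0500), ("06:00 row", row_0600)):
    print(name, as_clock(u16_le(row, 9)), as_clock(u16_le(row, 29)))
# 05:00 row 05:00 16:00
# 06:00 row 16:00 06:00
```

Read this way, the two rows are at least consistent with the idea that bytes 9/10 and bytes 29/30 swap roles once the first window starts at 06:00.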

Timer 00:00-04:00 | 05:00-09:00 | 8h

34 44 18 0 0 0 48 0 2 44 1 240 0 0 0 0 48 71 49 52 15 0 0 0 0 78 69 0 0 0 0 240 0 62

Timer 01:00-05:00 | 06:00-11:00 | 9h

34 44 18 0 0 0 48 0 2 104 1 44 1 0 0 0 42 71 49 8 16 0 0 0 0 78 69 0 0 0 0 44 1 62

Timer 01:00-05:00 | 06:00-14:00 | 12h

34 44 18 0 0 0 48 0 2 104 1 224 1 0 0 0 36 71 49 14 16 0 0 0 0 78 69 0 0 60 0 240 0 62

Timer 01:00-09:00 | XXXXXXX | 8h

34 44 18 0 0 0 48 0 2 60 0 224 1 0 0 0 8 71 49 22 16 0 0 0 0 78 69 0 0 104 1 224 1 62

Time window: Setting limitations
• One time window: Minimum duration of 8 hours, maximum duration of 12 hours.
• Two time windows: Each at least 4 hours, combined maximum of 20 hours.
• Disable the second time window: Set the end time to less than 4 hours.

tspopp commented 3 weeks ago

No worries, we can focus on other attributes first. I will try to look in these dumps as soon as I have some time for this. Alternatively, if you have a formula which seems promising, we can try to implement it and see if it matches...

taloriko commented 3 weeks ago

I’ve now tested everything I could find on the HMI (Menu + Installer Menu). I couldn’t identify any further details.

I’ve added the time windows, though I’m still not entirely certain about their full functionality.

Here are my observations:

Bytes 3, 14, 15, 23, 24, 27, and 28 did not change during testing and were always set to 0.
Bytes 25 and 26 consistently held the values 78 and 69 (I have no idea).
The function of Byte 33 is still unclear (possibly a checksum).

Do you have any further ideas on how we could test these bytes? Otherwise, I would wrap up the HMI testing and shift focus to energy management.

tspopp commented 3 weeks ago

I think you have now identified all major attributes of interest in the hmi message :beers: A few of these leftover bytes might be reserved for commands. The checksum is the last byte, Byte 34, which is not available in the dumps by AquaMQTT, so it is likely that something else is stored in Byte 33. You may try to enter the secret menu ("spin the wheel left and then to the right") to get some more advanced options. You may reverse-engineer these items as well, but AquaMQTT currently does not implement commands. I think I documented how commands work in my protocol document, but I'm not sure if they changed the pattern with the new protocol :man_shrugging:

Changing these advanced settings is still interesting, because you are changing the main message and are able to identify more information from it. For example, if you set the fan speed level to 55%, you will see some value within the main message changing to 55% :wink:

In the meantime I will implement the changes to speak both protocol versions at the same time, so we can merge this to main as soon as we feel it's ready.

At some point I need you to provide a large raw dump (maybe a few minutes) using AquaDebug. Only the logs from AquaDebug contain the checksum, and we need to find out how the CRC values are actually calculated. In MITM mode we need to recreate messages and therefore have to generate the checksum the same way. So understanding this pattern is crucial for getting MITM to work :grimacing:

taloriko commented 3 weeks ago

In the secret menu, I can only read.

Should the heat pump be running during the dump?

Screenshot_20241107_182233_Gallery.jpg

Screenshot_20241107_182212_Gallery.jpg

Screenshot_20241107_182312_Gallery.jpg

Screenshot_20241107_182322_Gallery.jpg

Screenshot_20241107_182244_Gallery.jpg

Screenshot_20241107_182253_Gallery.jpg

Screenshot_20241107_182303_Gallery.jpg

Screenshot_20241107_182222_Gallery.jpg

Screenshot_20241107_182151_Gallery.jpg

tspopp commented 3 weeks ago

No, it does not need to run during the dump. There will be a lot of hmi messages, and their checksum changes frequently since the time is always changing :)

But while dumping, you may also open the secret menu once. We should see what error messages look like. We need to identify them for MITM, too.

taloriko commented 2 weeks ago

Today, I wanted to analyze the next set of data, and I noticed that the ESP32 restarts every 10 seconds.

I have always received the data so far, but there were occasional dropouts. I initially thought that IP-Symcon couldn’t receive the topics quickly enough, which caused these gaps.

First, I searched for a timeout issue and found one in the WiFi settings. I adjusted the value to 600:

constexpr uint16_t WIFI_RECONNECT_CYCLE_S = 600;

However, this change did not fix the problem.

I’m not sure if it’s related, but I also had to adjust the code slightly to be able to flash without errors:


#ifdef CUSTOM_CONFIGURATION
#    include "ExampleConfiguration.h"
#else
#    include "ExampleConfiguration.h"
#endif

I replaced CustomConfiguration with a different configuration.

tspopp commented 2 weeks ago

Most probably I broke something with my latest commits. I will check. Sorry for that 😅

Edit: with my simulation it is not crashing, so it's probably related to the protocol :man_shrugging: , but I fixed something which could lead to the crash. It is most likely memory corruption, since I've refactored a lot of stuff to support both protocols in parallel. In any case, the large dump of AquaDebug (while opening the super secret menu) would be really helpful. Once I have that, I can simulate your heat pump, find out what the error message looks like and begin trying to figure out how the checksum works.

If it is still crashing with the latest commit, you can go back to the one before I refactored the serial protocols via git checkout b55b3b2. As soon as this branch is in a state where I can put it on my device, I will test it myself. We will get those things sorted... :+1:

Looking forward!

taloriko commented 2 weeks ago

I have finally created the log files with AquaDebug.

I created various files:

Without opening a menu
"Super Secret Menu" opened
Browsed errors in the "Super Secret Menu"
"Installer Menu" opened

aqua_debug_data_Normal.txt
aqua_debug_data_SuperSecret.txt
aqua_debug_data_SuperSecret_Errors.txt
aqua_debug_data_Installer.txt

tspopp commented 2 weeks ago

Nice, we found the error message :tada: Here is an example:

4A410D0F2D00270015001800130000000000000000020000000000671D00000200000045110000581100007A2200000800000002000000211E000C003200000000086B

I will add debug topics for those, but you don't need to figure out what the bytes in this message actually mean (of course you can try, if you like). AquaMQTT is currently mostly interested in forwarding these messages to the HMI controller in MITM mode, and therefore we have to know what they look like.

error_request_sequence.txt

This is how it works:

- messages without checksum
- hmi message requests error message with request id 0D
- controller answers with 4A410D (identifier + length + requestId)
- pattern is repeated with increased requestId

C2222E120000003000062C0126020000001C4B310A15000000004E4501 0D 6603C2013E
4A41 0D 0F2D00270015001800130000000000000000020000000000671D00000200000045110000581100007A2200000800000002000000211E000C00320000000008
C2222E120000003000062C0126020000001C4B310A15000000004E4502 0E 6603C2013E
4A41 0E 0F2C00270016001900130000000000000000020000000000661D0000020000004411000057110000792200000800000002000000211E000C00320000000007
C2222E120000003000062C0126020000001D4B310A15000000004E4503 0F 6603C2013E
4A41 0F 0F2A00260013001200110000000000000000020000000000611D0000020000004011000053110000712200000800000002000000211E000C00320000000006
C2222E120000003000062C0126020000001D4B310A15000000004E4504 10 6603C2013E
4A41 10 0F2D002800110010001100000000000000000200000000005B1D0000000000003B1100004E110000682200000700000002000000211E000C00320000000005
C2222E120000003000062C0126020000001E4B310A15000000004E4505 11 6603C2013E
4A41 11 0F2A003D00080008001000000000000000000200000000003D1D0000000000002211000028110000362200000600000000000000211E000C01320000000004
C2222E120000003000062C0126020000001F4B310A15000000004E4506 12 6603C2013E
4A41 12 0F2D00260017001700140000000000000000020000000000391D0000000000001F11000025110000312200000600000000000000211E000C01320000000003

- at the end, hmi controller knows somehow that all errors have been received and sets requestId to 00 which leads to no more error messages

C2222E120000003000062C0126020000001F4B310A15000000004E4500 00 6603C2013E

So we have also identified the error requestId in the hmi message, and the requestId in the error message. Enough for today.
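The response header can be pulled apart with a short sketch. It assumes only what the list above states, namely that an error response starts with identifier, length and requestId in that order; `split_error_header` is a hypothetical helper, and the sample strings are truncated prefixes of the first three responses in the log.

```python
def split_error_header(hex_line: str) -> tuple[int, int, int]:
    """Return (identifier, length, request_id) from the first three bytes of a response."""
    raw = bytes.fromhex(hex_line.replace(" ", ""))
    return raw[0], raw[1], raw[2]


# Truncated prefixes of the first three controller responses from the log above.
responses = ["4A41 0D 0F2D", "4A41 0E 0F2C", "4A41 0F 0F2A"]

request_ids = [split_error_header(line)[2] for line in responses]
print([f"{rid:02X}" for rid in request_ids])  # ['0D', '0E', '0F']

# The log suggests the requestId goes up by one per response.
increments_by_one = all(b - a == 1 for a, b in zip(request_ids, request_ids[1:]))
print(increments_by_one)  # True
```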

tspopp commented 2 weeks ago

Based on the dumps I figured out how the checksum is generated. Moreover, I added error message parsing and passing. You might want to try MITM mode now and see if it works. Once you have removed the passthrough jumper, changed the configuration to constexpr EOperationMode OPERATION_MODE = EOperationMode::MITM and flashed, you may want to test the following items to see if this is good:

Be aware that overrides / controls are not yet supported for the next protocol. If the above items are working, we will add that functionality for sure :)

taloriko commented 2 weeks ago

The values I receive via MQTT seem correct at first glance. I can see the values on the HMI and control various components in test mode. In the super-secret menu, I can read errors.

Unfortunately, the controller restarts every 5–12 seconds.

Here are the new topics: [three screenshots]

tspopp commented 2 weeks ago

Actually, this is very good news, happy to get your heat pump fully supported soon :+1: I'll do the refactoring soon and resolve the crash you have. In the meantime you should go back to listener mode, because crashes during MITM might lead to protocol dropouts. Not sure if the heat pump likes that behavior or not :man_shrugging: You might want to proceed with completing the protocol; observing states in the main message is actually nice. These are the icons shown in the hmi controller (fan is on, heat element is on, external boiler is on...). But of course we can proceed to add things step by step after the main PR has been merged.

taloriko commented 2 weeks ago

Okay, I’ll switch back to LISTENER mode.

As time permits, I’ll continue analyzing the main message.

tspopp commented 2 days ago

I enabled overrides for the new protocol (MITM only). To test this, publish some values to the MQTT control topic. For example:

On aquamqtt/ctrl/operationMode submit the value BOOST or value ABSENCE or an empty string to remove the override. I implemented all the overrides which are available in the legacy mode, too. Even the PV override mode shall work now. See more here.

Edit: Or use the device which should appear in HomeAssistant (if you use HomeAssistant :grinning: )
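For testers without HomeAssistant, an override can also be published from a short script. This is a sketch, not the project's tooling: it uses `paho.mqtt.publish.single` from the paho-mqtt package, the broker hostname is a placeholder you have to replace, and `override_message` / `send_override` are hypothetical helper names. The topic layout follows the aquamqtt/ctrl/operationMode example above.

```python
from typing import Optional


def override_message(attribute: str, value: Optional[str] = None,
                     prefix: str = "aquamqtt") -> tuple[str, str]:
    """Build (topic, payload) for a control topic; an empty payload removes the override."""
    return f"{prefix}/ctrl/{attribute}", "" if value is None else value


def send_override(broker_host: str, attribute: str, value: Optional[str] = None) -> None:
    """Publish one override to the broker (requires `pip install paho-mqtt`)."""
    import paho.mqtt.publish as publish  # imported lazily so the helper above works without it

    topic, payload = override_message(attribute, value)
    publish.single(topic, payload, hostname=broker_host)


print(override_message("operationMode", "BOOST"))  # ('aquamqtt/ctrl/operationMode', 'BOOST')

# Example calls; "broker.local" is a placeholder hostname:
# send_override("broker.local", "operationMode", "BOOST")  # set the override
# send_override("broker.local", "operationMode")           # remove it again
```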

taloriko commented 2 days ago

No, I don’t use HomeAssistant. I’m using IP-Symcon instead. Unfortunately, it’s not as convenient to set up, but we’ll manage 😆.

I briefly tested with the setup from 4 PM.

MITM is working.

Unfortunately, I don’t have much time for testing this week, but I quickly switched $root/ctrl/operationMode between "BOOST" and "MAN ECO ON."

The mode was correctly changed, and as a result, the "overrides" topic was set to 1.

How should I proceed with testing? Should I check all overrides?

tspopp commented 2 days ago

Yeah, if overrides are working, we are almost complete. You may just use it for a while and see if there are any serious issues before we merge this to main. Of course you can try all of the overrides and see if they work as they should. Very nice :+1:

What's left is identifying more energy attributes. You may want to provide a long-running trace of the energy message while your heat pump is running / heating up. The idea is that we identify the byte locations where the value rises over time. Once identified, I will add something to publish them to MQTT as "unknownCounterValue1", "unknownCounterValue2" and so on. By visualizing those values you might get an idea of what they are.
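One way to look for such locations in a recorded trace is sketched below. It treats every byte position on its own and flags those whose value never drops and ends higher than it started. The sample rows are made up for illustration and are not real energy messages; `rising_positions` is a hypothetical helper.

```python
def rising_positions(rows: list[list[int]]) -> list[int]:
    """Return byte indices whose value never decreases and grows over the whole trace."""
    positions = []
    for i in range(min(len(r) for r in rows)):
        column = [r[i] for r in rows]
        never_drops = all(a <= b for a, b in zip(column, column[1:]))
        if never_drops and column[-1] > column[0]:
            positions.append(i)
    return positions


# Made-up rows, oldest first: index 1 and 2 rise, index 0 is constant, index 3 fluctuates.
samples = [
    [5, 0, 10, 7],
    [5, 1, 12, 3],
    [5, 1, 15, 9],
]
print(rising_positions(samples))  # [1, 2]
```

A counter that spans several bytes will make its low byte wrap around, so that byte would not be flagged by this simple check; the neighbouring higher byte would still show up.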

Silmo commented 6 hours ago

I have an Atlantic 270/V3. Attached are the logs for a heating run from 46 °C to 51 °C.

Let me know if you need anything else.

aquamqtt_hmi_debug_dec.csv
aquamqtt_hmi_debug_hex.csv
aquamqtt_main_debug_dec.csv
aquamqtt_main_debug_hex.csv
aquamqtt_energy_debug_dec.csv
aquamqtt_energy_debug_hex.csv