krahabb / meross_lan

Home Assistant integration for Meross devices
MIT License

MTS100 - MTS150 reporting Idle even when Heating #331

Closed giabett closed 2 weeks ago

giabett commented 9 months ago

No matter what you choose on the thermostat, it always shows IDLE, like here:

image

Why? How do I get rid of this 'idle'?

krahabb commented 9 months ago

Hello @giabett, I'd need more context here. Is it an MTS100 (or similar) TRV, or is it an MTS200 thermostat? Moreover, a diagnostic trace would be needed to actually check all the messages exchanged with the device.

At first sight, if it's one of the TRVs, it looks like it is 'internally' bugged, i.e. it is not correctly reporting its own internal state. I can say this because I too have 2 MTS100s configured the same way and paired to HA/meross_lan: one is correctly reporting 'Heating' (image), while the other, no matter what, keeps reporting 'Idle' (image).

The screenshots were taken with the pre-release version v4.4.0-alpha.1, but I remember this behavior showing up a long time ago. My guess is that one of the TRVs is slightly malfunctioning. The strange thing is that the 'bugged' valve is actually open and letting heat flow, so it is currently heating even though it reports 'Idle' (as if it were closed). Moreover, since I noticed this behavior in my own valves, I've double-checked their messages looking for bugs in the meross_lan decoding but, at the raw protocol level, the device really is reporting itself as idling.

Nevertheless, if you could manage to post a trace of the device, I could double-check what's going on. (To be sure the valve should be 'Heating', you have to set the target temperature higher than the ambient one.)

Last clarification:

giabett commented 9 months ago

They are MTS150s: two of them, plus the hub.

image

Even if the target temperature is not reached, I receive an IDLE indication. However, on the Meross app on the phone, I receive the correct indication (heating).

image

Both valves seem to be working fine. When the temperature is reached, they either close or open to maintain the right temperature.

krahabb commented 9 months ago

I'd really need the device diagnostic trace in order to inspect for possible bugs

giabett commented 9 months ago

msh300hk-1699353504 (1).csv this is the trace, thanks

krahabb commented 9 months ago

The trace itself states both the valves were 'idling' when the trace started but this could also be consistent with the fact that both the valves were 'in range' of the target temperature (at least at the time of the trace)

I do see a possible issue, though: the devices may not be getting 'polled' correctly.

This question arises because the software is behaving as if it is also configured with the Meross cloud profile, which in turn expects most of the updates to come from the cloud, but I don't see any messages coming in from there. It might be that the trace was too short and nothing happened on the devices in the meantime that needed to be notified through the cloud, but it could also mean something else is broken.

giabett commented 9 months ago

Hi, yes, the account is set up correctly. There was nothing selected in the configuration; I have now set it to auto.

image

but so far no changes

krahabb commented 9 months ago

At the moment I can't see where the issue could lie. As previously stated, I too have 2 MTS100s, both working OK, but one always reports idle no matter what, while the other correctly reports on/off (heating/idle) depending on the room and target temperatures. (I cannot see them in the app though, since they're only locally connected to HA.)

We can set up an experiment like this:

After collecting this first trace, it would be nice to have a second try with the same sequence, but this time please 'fix' the protocol to HTTP in the hub configuration entry before starting the new trace. That way we can test what happens when meross_lan uses only local communication to interact with the valves and doesn't wait for cloud updates.

giabett commented 9 months ago

msh300hk-1699429282.csv

This is the log following the suggested steps, in auto mode.

giabett commented 9 months ago

msh300hk-1699436703.csv

this is with HTTP protocol

krahabb commented 9 months ago

Thank you, very helpful! By comparing the 2 traces I can confirm the communication is correct both when using HTTP only and when using AUTO mode together with the cloud profile. In the end, everything works as expected but... the devices still don't report their heating state correctly and keep sending this value as 0, i.e. idling (to my knowledge).

It would be nice to have reports from other users too, in order to see whether this is a 'wide' issue or is just related to some devices not behaving as expected. I'm still pretty sure it is some kind of issue in the device itself. Also, as a side note, I've found something in the Meross FAQ suggesting that the valve may be incorrectly mounted, so that it either can't work or just fails to detect its own state. This could explain why, in my house, 1 of the 2 is reporting correctly while the other is not. Also, I'm using rechargeable batteries (NiMH), which are 'discouraged' since they don't offer the needed voltage. (This is confirmed by the fact that I have to recharge them very often.)

The only option left for meross_lan would be to set the state based on a 'guess': when room temp >= target the device is "Idling", while when room temp < target the device is "Heating", ignoring what the device itself is reporting in its messages. At that point the state would be somewhat cosmetic. I'm likely going to implement this kind of fix in the next release, but I'm still not very convinced this is a 'good' solution.
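
A minimal sketch of that temperature-based guess, for illustration only. The names `HvacAction` and `guess_hvac_action` are hypothetical and are not taken from meross_lan:

```python
from enum import Enum


class HvacAction(Enum):
    """Hypothetical state labels; not meross_lan's own types."""

    IDLE = "idle"
    HEATING = "heating"


def guess_hvac_action(room_temp: float, target_temp: float) -> HvacAction:
    """Guess the valve state from temperatures alone.

    The state reported by the device itself is deliberately ignored.
    """
    # room temp < target: assume the valve is open and heating
    if room_temp < target_temp:
        return HvacAction.HEATING
    # room temp >= target: assume the valve is closed (idling)
    return HvacAction.IDLE
```

The result is only as good as the two temperature readings, so it can disagree with what the valve is physically doing.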

giabett commented 9 months ago

Thanks, Krahabb, for your report and analysis.

What I don't understand is that the Meross App states the correct status. When it is heating, it correctly shows "Heating"; if the set temperature is equal to the room temperature, it shows nothing; if higher, it shows "Cooling". This suggests that installation issues can be ruled out.

As for why we only receive a 0 value, I'm not sure. However, there must be a discrepancy somewhere, as otherwise, the Meross App should consistently display 'Idle.'

Your workaround isn't a bad idea, though. When the room temperature is greater than or equal to the target temperature, the device is 'Idling.' When the room temperature is less than the target, the device is 'Heating,' disregarding what we get from the valve.

krahabb commented 9 months ago

Yes... I've also checked the HA component from @albertogeniola and it is implemented like this: heating when temp < target, idle when temp == target... maybe the app is doing the same ;) I'll go this route too... I'm just 'obsessed' by the fact that one of my valves correctly reports 'heating/idling'. I should re-pair those valves to the Meross app and check what's reported there but... too much work to do right now!

krahabb commented 9 months ago

@giabett, great news (maybe!): I've managed to get my MTS100 to report correctly just by replacing its battery. Even though it was apparently working well (with battery at 25% and 0%), it was not: in fact, after replacing it with a 'brand new, topped up' battery, it started reporting the correct status (heating). Also, the room temperature reading was a bit off (1.5 °C) before the replacement, so I guess these valves, even if they still work when the battery starts to drop off, are very sensitive to the voltage and behave a bit erratically when the battery starts draining.

giabett commented 8 months ago

I replaced the battery with a brand new Duracell, but the problem persists. The indication has changed with the new thermostat tab, and I believe that instead of 'target' it should now show 'heating', right?

image

davideantonelli commented 8 months ago

Same issue here: this morning it was working well, this afternoon IDLE.

krahabb commented 8 months ago

I also suspect the issue could arise if the valve is not properly mechanically paired (i.e. the adaptor screw is a bit loose or is somehow failing to make mechanical contact with the hydraulic system of the heater).

In fact, when I replaced the battery of my failing MTS100, the valve ran a mechanical initialization procedure (where it starts buzzing a bit, likely trying to sense the mechanical pairing).

This leads me to think that maybe the problem lies in the length of the hydraulic piston: if it's too long or too short, maybe the valve cannot sense that it has travelled far enough to actually open the hydraulic circuit.

giabett commented 8 months ago

This is not the case in my situation, as the Meross app on the phone is working perfectly and reporting the right status every time, such as idle, heating and cooling, when the temperature changes. If there were some mechanical or installation problem, the app would also be reporting it incorrectly.

njharrison commented 7 months ago

Hello, I just wanted to add a comment - I have 16 Meross valves in my house, a combination of 100s and 150s, and all of them show “Idle” when they should show “Heating”.

Happy to get traces if it’s helpful!

Thanks

Nick

krahabb commented 7 months ago

Hello @njharrison, I'm trying to get back to this issue, for which I still don't have a resolution. A standard trace could help.

njharrison commented 7 months ago

Hello - no worries. A quick trace where I tried adjusting the temperature of one TRV and set it back again. msh300hk-1706114180.csv

krahabb commented 7 months ago

Hello @njharrison, Thank you for sharing the trace. I cannot say I've found the issue but I have some hints/reasonings to share.

Anyway: your MTS(s) look like they are mainly working in schedule mode, and this could have a caveat (say, a bug): when you set the temperature in HA while the MTS is in schedule mode, meross_lan sends a message to set the temperature of the manual mode, but this does not actually change the current working setpoint (since that is controlled by the valve's schedule). The HA UI could nonetheless show your newly changed setpoint (since the command was accepted by the valve; it just didn't change the current schedule setpoint). So it might be that you see a setpoint in HA which is not the real current working setpoint of the valve. This is made somewhat worse by the fact that, with the Meross cloud profile in place, the software doesn't poll the valve anymore and the status update from the MTS could come at any time (or never).

This is speculation based on the reported behavior of the MTS200 (#369), since my (very old) MTS100v3 instead automatically switches out of schedule mode and into manual mode whenever you manually change the setpoint in HA, so what you see (as a setpoint) in HA is also the actual current setpoint in the valve. To see if this is the issue, you could try setting the protocol mode to HTTP (fixed, no auto) so that meross_lan keeps fully polling the valves and refreshes the HA UI with a consistent state from the valves.

The fact that meross_lan doesn't fully poll the state when the cloud MQTT is available is effectively an optimization, which could however lead to inconsistencies in the reported state for several reasons. (This has my attention, and the next release already has some more checks to avoid this extremely optimistic approach.)

Another, totally different line of reasoning is about the timing of your changes: in the trace you set the setpoint back and forth in quick succession (which, for the previous reason, might not have had an impact on the current working mode of the device), but the valve, due to its internal firmware behavior, doesn't change its working state for at least a few minutes (maybe 2 or 3). I usually experience this on mine:

Final (for now) consideration: I'm still strongly convinced the software is working OK in reporting the current idle/heating state of the MTS, so I could be very biased in this regard. But I also think the protocol AUTO mode, which leads to the aforementioned polling optimizations, might be the culprit. By setting the device mode to HTTP only, we should be able to detect whether the issue lies here instead.

krahabb commented 7 months ago

Hey @njharrison, I'm studying your trace and there's something funny in it: it appears as if a device (namely the one with "id" == "030011B6") is always repeated twice in the data. This in turn (being unexpected) raises an unhandled exception, which might then skip the correct parsing of messages. You should have notes of that in the HA log.

Also, you stated that you have 16 devices overall, but assuming the duplicated entry is not due to a real device having the same id as another one (so it is just some bug in the memory of the hub firmware), the hub is actually communicating real data for only 15 devices.

This is also supported by the fact that these 'duplicated entries' carry 0 values in one of the 2.

You might need to re-pair the 'offending' device in order to tidy up the hub. I don't know, though.
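
As an illustration of the duplicated-entry check described above, here is a minimal sketch. It assumes the hub payload can be viewed as a list of dicts each carrying an "id" key; the `room` field and both function names are hypothetical, and this is not meross_lan's parsing code:

```python
from collections import Counter
from typing import Any

Entry = dict[str, Any]


def find_duplicate_ids(entries: list[Entry]) -> list[str]:
    """Return the ids that appear more than once in the hub data."""
    counts = Counter(entry["id"] for entry in entries)
    return sorted(dev_id for dev_id, n in counts.items() if n > 1)


def _has_data(entry: Entry) -> bool:
    """True if any field other than "id" carries a non-zero value."""
    return any(value for key, value in entry.items() if key != "id")


def drop_zeroed_duplicates(entries: list[Entry]) -> list[Entry]:
    """Keep one entry per id, preferring a copy that carries real data."""
    best: dict[str, Entry] = {}
    for entry in entries:
        current = best.get(entry["id"])
        # Replace an earlier copy only if that copy was all zeros.
        if current is None or not _has_data(current):
            best[entry["id"]] = entry
    return list(best.values())
```

For example, given two entries with id "030011B6" where only one has non-zero values, `find_duplicate_ids` reports that id and `drop_zeroed_duplicates` keeps the populated copy.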

njharrison commented 4 weeks ago

Hi @krahabb!

Thanks for your reply back in January. I managed to get something working that ignores the Idle/Heating state and didn't fancy going back and changing everything!

I've since gone back and rebuilt everything "from scratch", including fully clearing out my Meross config.

I'm down to a single radiator thermostat and a single hub, and when I test it I still get the same behaviour. Attached is a diagnostic trace, if you manage to get a chance to look at it. The only unusual thing I can see are the messages:

"Appliance.Control.Multiple requests=3 (responses=3) expected size=1250 (actual=1295)" - what might explain the differences in the message sizes?

Thanks for your time and work!

Nick

home-assistant_meross_lan_2024-08-02T13-32-55.128Z.log

krahabb commented 4 weeks ago

Hello @njharrison, that log message is only relevant for debugging: the 'expected size' is just a guess of how big the response payload will be, used to avoid sending requests that might overflow the device's network stack. In my experience the msh300 has roughly 5000 bytes available to prepare the response. If the response is too big, it seems to have no other 'safety checks' and usually either doesn't respond (over MQTT) or just truncates the response (over HTTP). Since meross_lan might build requests that produce very large responses (especially for hubs), there's an algorithm that tries to estimate how big the response will be and, if needed, trims it by splitting the request into multiple ones. The algorithm doesn't know the exact size of the response; it just has some tables built on previous knowledge, which usually lead to a roughly correct estimate (but still an approximation).
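
A rough sketch of that kind of size-budgeted splitting, under stated assumptions: the request names, the per-request estimates and the fallback value are made-up placeholders, only the roughly 5000-byte budget comes from the comment above, and this is not the actual meross_lan algorithm:

```python
MAX_RESPONSE_BYTES = 5000  # rough msh300 budget mentioned in the comment above
DEFAULT_ESTIMATE = 500  # hypothetical fallback for requests not in the table


def split_requests(
    requests: list[str],
    estimates: dict[str, int],
    budget: int = MAX_RESPONSE_BYTES,
) -> list[list[str]]:
    """Greedily group requests so each batch's estimated response fits the budget."""
    batches: list[list[str]] = []
    current: list[str] = []
    current_size = 0
    for request in requests:
        size = estimates.get(request, DEFAULT_ESTIMATE)
        # Start a new batch when adding this request would exceed the budget.
        if current and current_size + size > budget:
            batches.append(current)
            current, current_size = [], 0
        current.append(request)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Because the sizes are only estimates, a batch can still come back larger than expected, which matches the 'expected size=1250 (actual=1295)' kind of log line quoted earlier.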

What would be relevant in the log message is the number of responses against the number of requests. If they don't match, it means the request was likely too big (or malformed in some way). This might actually happen from time to time, but it should cause no issue in 'perceived behavior', since meross_lan has some code to recover from missing replies: it just resends the requests for the missing responses. That's why this message is usually not reported when the logging level is set to normal/default.
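
The recovery idea can be sketched roughly as follows. The `send` callable, the function names and the retry limit are all hypothetical stand-ins chosen for illustration; meross_lan's real recovery code may look quite different:

```python
from typing import Any, Callable, Iterable


def missing_requests(requested: list[str], responded: Iterable[str]) -> list[str]:
    """Return the requests that did not get a reply, in their original order."""
    seen = set(responded)
    return [name for name in requested if name not in seen]


def send_with_retry(
    requested: list[str],
    send: Callable[[list[str]], dict[str, Any]],
    max_attempts: int = 3,
) -> tuple[dict[str, Any], list[str]]:
    """Send a batch and resend only whatever is still unanswered.

    `send` is a stand-in transport: it takes request names and returns a
    mapping of request name -> payload for the replies that came back.
    """
    results: dict[str, Any] = {}
    pending = list(requested)
    for _ in range(max_attempts):
        if not pending:
            break
        results.update(send(pending))
        pending = missing_requests(pending, results)
    return results, pending
```

The second element of the returned tuple lists anything still unanswered after the last attempt, so the caller can decide whether to log it.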

As for the 'heating/idle' state of the valve, I have to admit I had (almost) completely forgotten about it... time to finally implement a software fix like the one proposed here (even if I really don't like it).

njharrison commented 4 weeks ago

Thanks very much for the detailed explanation - I guess that is one possible cause (in my head, anyway) ruled out.

The temperature comparison is how I'm doing it at my end, but sometimes the three-minute delay between setting the target and the thermostat responding causes a problem. I assume that you'd do it at the "read" end, which would negate that?

Basically, I want to turn on the boiler and pump only when one or more radiator valves are open. There are some workarounds (e.g. always leaving one radiator valve open) but if they all close, the system is hot, and there's nowhere for the water to cool then my crappy old boiler gets confused and trips very occasionally.

Anyway, I'll keep an eye on it if you do implement your SW fix!

Thanks again for all the info and your hard work!
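
For readers with the same boiler use case, here is a minimal sketch of the "run the boiler only when at least one valve is probably open" check, with a settle time for the roughly three-minute valve reaction mentioned above. The `Valve` dataclass, its fields and the function names are hypothetical; this is neither Home Assistant nor meross_lan code:

```python
from dataclasses import dataclass

VALVE_REACTION_SECONDS = 180.0  # the roughly three-minute delay mentioned above


@dataclass
class Valve:
    name: str
    room_temp: float
    target_temp: float
    target_changed_at: float  # time (in seconds) of the last setpoint change


def valve_probably_open(
    valve: Valve, now: float, delay: float = VALVE_REACTION_SECONDS
) -> bool:
    """Guess that a valve is open: it wants heat and has had time to react."""
    wants_heat = valve.room_temp < valve.target_temp
    settled = now - valve.target_changed_at >= delay
    return wants_heat and settled


def boiler_should_run(
    valves: list[Valve], now: float, delay: float = VALVE_REACTION_SECONDS
) -> bool:
    """Run the boiler and pump only if at least one valve is probably open."""
    return any(valve_probably_open(valve, now, delay) for valve in valves)
```

Waiting out the settle time trades a slower boiler start for a lower chance of firing it while every valve is still closed.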

krahabb commented 2 weeks ago

The latest release, Moonlight.3.1, implements a 'software patch' for the issue. There's a new configuration switch on the device panel where you can enable this feature, so that the component uses current temp < target temp to signal the valve state (heating/idle). This is really a bad patch, to be honest, but I still can't figure out why some devices don't report the correct heating state.