nebulous / infinitude

Open control of Carrier/Bryant thermostats
MIT License
224 stars 50 forks source link

Thermostat reverts to old config after being changed by Carrier cloud API #184

Closed dragonflight closed 4 months ago

dragonflight commented 6 months ago

I really should have reopened #153, but it is a year old and not of much value, other that the problem has been around for a log time.

Nebulous, I understand your feelings on this topic, but once the problem is understood I believe the fix is quite simple. I believe I understand the problem (mostly) and more importantly understand a fix.

In the following I use configuration to mean the actual configuration stored in the cloud/tstat and msg to be the contents of the message stored in the tstat and in the state directory of the the infinitude app and/or the msg sent via tcp.

The problem arises when the configuration stored in the cloud and sent to the tstat via the config msg gets out of sync with the configuration stored in the tstat and sent to the cloud via system/config msg.

Infinitude uses the contents of the system/config msg as the definitive configuration of the system.

The problem arises when the cloud is changed (via the carrier/bryant app). The cloud send a config msg to the tstat which changes the the internal configuration of the tstat - so far OK, BUT the contents of the system/config msg hasn't been changed. After some time (1 or 5 min) infinitude sends a config message to the tstat using the OLD config in the stored system/config msg resetting the configuration of the tstat and by some "magic" causing a system/config message to be sent to the cloud resetting the configuration of the cloud.

So the problem is understood with the exception of the "magic" that causes the tstat upon reception of the system/config msg to trigger the sending of a system msg (and updating the the system/config), but not the config msg. However, thanks to my poor typing, I discovered by accident that if you send a parameter with a bad value in a config msg it will trigger a system msg.

The solution is after 1 minute, instead of sending the system/config msg one sends a message like

<config/mode>xyz</mode></config>

then the tstat will reply with a system msg with the updated tstat configuration. I have tested this and it seems to work. I would suggest shortening the 1m considerably - probably to the next status message to minimize the window for both the cloud and infinity to attempt a change at the same time.

As if this issue isn't long enough, to make the app actually useful it would be nice to modify the pass_reqs when connected to the app. I think it should be as simple as if a real status message has a ping time of <20 sec, pass_reqs=1 else pass_reqs=orig_pass_reqs.

I would have submitted these changes as a pull request, but I'm not really comfortable with even reading perl any more. If changes are made I am willing to test them.

dragonflight commented 6 months ago

Apologies to all that got the first version of my "comment" I used angle brackets around all the message names only to discover that meant they all disappeared once I had sent it DUH!!!! (as if it wasn't incomprehensible enough as is!!!!) It makes somewhat more sense now

nebulous commented 6 months ago

For reference here's the general sequence of events. Of note, the server/carrier app never pushes data to Infinitude or the thermostat - it is a polling-only operation. Just looking at it one can see how fragile it is though - with changes being possible at each of the three locations. I'd venture to guess and it's only a guess, that the issue is around step 6 - Infinitude knows there are changes and sets the appropriate flag, but when it goes thru the config cycle to get them, it gets them from the Infinitude store, not from Carrier.


sequenceDiagram
    Thermostat->>Infinitude: POST status
    Infinitude-->>Carrier: POST status(if PASS_REQS expired)
    Carrier-->>Infinitude: response boolean if changes are present from Carrier app
    Infinitude->>Thermostat: response boolean(either from Infinitude or from Carrier response)
    Thermostat->>Infinitude: if changeflag POST config
    Infinitude-->>Carrier: if PASS_REQS expired POST config xml
    Carrier-->>Infinitude: config xml
    Infinitude->>Thermostat: config.xml
dragonflight commented 6 months ago

Your picture is incomplete, you missed what happens next. The code sets a 1 minute timer (line 381) and then 1 minute later the tstat request the config again and Infinitude replies with a config msg from the system/config message store which is the old config, hence reverting the configuration back to before the server/carrier app changed it. Here is an update diagram assuming "Carrier" has changes

sequenceDiagram
    Thermostat->>Infinitude: POST status
    Infinitude-->>Carrier: POST status(if PASS_REQS expired)
    Carrier-->>Infinitude: response boolean changes are present
    Infinitude->>Thermostat: RESPONSE boolean changes are present AND open PASS_REQS for 2 min
    Thermostat->>Infinitude: GET config
    Infinitude-->>Carrier: GET config
    Carrier-->>Infinitude: RESPONSE config.xml
    Infinitude->>Thermostat: RESPONSE config.xml AND store config.xml
    Note over Thermostat,Carrier: Approx 1 min
    Thermostat->>Infinitude: POST status
    Infinitude->>Thermostat: RESPONSE boolean changes are present
    Thermostat->>Infinitude: GET config
    Infinitude->>Thermostat: RESPONSE config.xml from system/config store 

The difficulty in fixing this, (other than abandoning supporting the app :)) is that although the configurations as represented by infinitude store system/config and config and actually stored in the thermostat and stored in the app are the same, they are different versions and not lexically the same (and have some different properties).

Infinitude has made the decision to treat the system/config message as the canonical or definitive definition. Infinitive's api reads and modifies it. When the server/carrier app makes a change to its configuration and sends it to the tstat via the config message, that updates the tstat's config, but NOT infinitude's view of the configuration i.e. system/config.

The solution is to update Infinitude's view and the easiest way to do that is force the tstat to issue a system msg thereby updating Infinitude's store of system/config.

As mentioned earlier fortunately there is a way to effectively ask the tstat to send a system message. By requesting (through the status response) a config msg and sending

<config/mode>xyz</mode></config>

The tstat will respond with a system msg thereby updating infinitude's store of system/config.

I hope this is clearer than last time

nebulous commented 6 months ago

Thanks for continuing the diagram - it is indeed slightly clearer what you're trying to say, but I'd much rather solve the problem directly vs yet another weird bandaid in an already hacked up project.

I think what you're saying is that config is stored in two keys which get out of sync, and Infinitude's simplified key wins on subsequent calls. If that's the case, then it seems that the problem is around step 7 when infinitude stores the config response from the cloud service. If it's storing only the long config key that contains the systemid and not also storing the simplified key then we could store both at the same time and be done with it, no?

dragonflight commented 6 months ago

No, the problem is there are two distinct keys/messages. config where the data is in the response from carrier/app to the tstat (if we ignore Infinitude for the moment) and being from the Carrier is the latest version 1.48. system where the data is in the req from the tstat to carrier/app and is at version 1.41 (assuming running software 4.05. system has one and only one element and that is <config> which I have been calling system/config (I had problems with the markup stuff - so a little confusing)

In fact system/config elements are a subset of config elements with some changes in ordering. In my C implementation of Infinitude I had the structures created such that there was only a single instance of each common element, but I think that would be a MAJOR design change for this PERL version.

Unfortunately no matter what happens there is the problem of collisions even in the case of someone at the tstat at the same time a someone (or more realistically come program) is accessing Infinitude. One will overwrite the other. Other than that, I think sending the fake config message as a "request system config message" is a reasonably solid concept and less of a hack, and more filling a hole in the protocol (!?!?)

There is still the possibility of finding out why when the tstat receives the message based on system/config it almost always triggers a system msg, but I couldn't figure it out.

Unfortunately I have been sick for a long time now that limits the amount of time I can be upright and right now on top of that I seem to have some awful cold that I can't shake, so I'm not as eager as I might be to investigate.

That said I am hopeful that I will be able to (eventually) discover what currently causes the tstat to respond with the system and while still somewhat of a hack it would guarantee that anytime config, system/config would be updated.

Also, I don't actually use infinitude, I have my own version, but I have changed my own version to have the same problem (two separate data stores for config and system/config) and have implemented the solution I have suggested and so far it seems to work using the old app, the new app and my telnet interface

nebulous commented 6 months ago

ok I think I'm closer to following then. Returning an invalid config response to the thermostat(after allowing the stat to update with Carrier's supplied config) causes the thermostat to trigger a system post, which effectively forces the thermostat to do the version reconciliation rather than having Infinitude do the merge, or support multiple config schema versions.

dragonflight commented 6 months ago

Yes, so the message sequence looks like

sequenceDiagram
    Thermostat->>Infinitude: POST status
    Infinitude-->>Carrier: POST status(if PASS_REQS expired)
    Carrier-->>Infinitude: RESPONSE boolean changes are present
    Infinitude->>Thermostat: RESPONSE boolean changes are present AND open PASS_REQS for 2 min
    Thermostat->>Infinitude: GET config
    Infinitude-->>Carrier: GET config
    Carrier-->>Infinitude: RESPONSE config.xml
    Infinitude->>Thermostat: RESPONSE config.xml AND store config.xml
    Note over Thermostat,Carrier: Some other messages like POST notification and/or STATUS messages with request config set
    Thermostat->>Infinitude: POST status
    Infinitude->>Carrier: POST status
    Carrier-->>Infinitude: RESPONSE boolean changes ARE NOT present
    Infinitude->>Thermostat: RESPONSE boolean changes ARE present
    Thermostat->>Infinitude: GET config
    Infinitude->>Thermostat: RESPONSE fake config.xml with error
    Thermostat->>Infinitude: POST system
    Infinitude-->>Carrier: POST system
    Carrier-->>Infinitude: RESPONSE OK
    Infinitude->>Thermostat: RESPONSE OK

I think rather than wait for a timeout of 1 min it would be better to wait until there is a status message that does not have the config changes flag set. That way if there are two changes one after the other there is less chance trampling over the second one with the first one.

Just a reminder, that to really make infinitude with the app PASS_REQS needs to be manipulated as I suggested in the first comment (perhaps only with a flag to enable/disable behaviour for app).

Another issue with PASS_REQS is for the app to work one always has to transmit system messages. the app depends on the cloud always having the latest system/config from the tstat and from what I have observed it is only sent at tstat startup (or making a new connection to the internet) and when the user manipulates the tstat UI. As I have mentioned there is some "magic" yet to be discovered why the config message generated from the system/config store causes the system message to be sent, but it is indeed fortuitous.

EDIT: I just fixed the sequence diagram to indicate that the status message is passed on to the cloud/carrier (otherwise how would Infinitude wait for a status message without the changes flag set - DUH)

github-actions[bot] commented 4 months ago

This issue has not been active in a while. It will be automatically closed soon absent further activity.