ngardiner / TWCManager

Control power delivered by a Tesla Wall Charger using two wires screwed into its RS-485 terminals.
The Unlicense
130 stars 55 forks source link

MQTT setStatus(totalAmpsInUse) race condition #519

Open RichieB2B opened 1 year ago

RichieB2B commented 1 year ago

I'm experiencing the following: after charging is stopped, the MQTT topic TWC/all/totalAmpsInUse is never updated. It will be stuck at whatever Amps were used just before the charging stopped. This is a race condition with the MQTT rate limiting.

This code causes setStatus() to return if the MQTT message would violate the rate limiting: https://github.com/ngardiner/TWCManager/blob/93a560680e7a7feb9e5d908f1c0088c18a9ef799/lib/TWCManager/Status/MQTTStatus.py#L86-L88

However, if this was the 0A message for totalAmpsInUse it will be ignored and not change for a long time (until a new charge starts) because of this code: https://github.com/ngardiner/TWCManager/blob/93a560680e7a7feb9e5d908f1c0088c18a9ef799/lib/TWCManager/TWCSlave.py#L580-L587

So using an MQTT client I see this:

mosquitto_sub output ``` $ mosquitto_sub -v -t 'TWC/#' TWC/all/currentPolicy Track Green Energy TWC/all/maxAmpsForSlaves 0.95 TWC/all/totalAmpsInUse 16.77 TWC/config/minAmpsPerTWC 2 TWC/2666/ampsInUse 16.77 TWC/2666/chargerLoadInW 0 TWC/2666/ampsMax 0.0 TWC/2666/state 0 TWC/2666/power 0.0 TWC/2666/carsCharging 0 TWC/2666/lifetimekWh 49103 TWC/2666/voltagePhaseA 236 TWC/2666/voltagePhaseB 248 TWC/2666/voltagePhaseC 237 TWC/2666/lifetimekWh 49103 TWC/2666/voltagePhaseA 237 TWC/2666/voltagePhaseB 249 TWC/2666/voltagePhaseC 240 TWC/2666/chargerLoadInW 0 TWC/2666/carsCharging 0 TWC/all/currentPolicy Track Green Energy TWC/2666/ampsMax 0.0 TWC/2666/state 0 TWC/config/minAmpsPerTWC 2 TWC/all/maxAmpsForSlaves 0.95 TWC/2666/lifetimekWh 49103 TWC/2666/voltagePhaseA 238 TWC/2666/voltagePhaseB 249 TWC/2666/voltagePhaseC 237 TWC/2666/chargerLoadInW 0 TWC/2666/carsCharging 0 TWC/all/currentPolicy Track Green Energy [etc] ```

carsCharging = 0 but also totalAmpsInUse = 16.77 makes no sense.

This can be fixed by disabling rate limiting for the MQTT status module but I think the real fix is to always call setStatus(ampsInUse) for every receive_slave_heartbeat() in TWCSlave.py. That way the rate limiting will only be handled by the status modules. Trying to rate limit status messages in multiple places does not work properly.

ngardiner commented 1 year ago

Good points. I'm more leaning toward removing the rate limiting from the MQTT module, the reason for that view is that the logic of the value changed check is to change the event trigger from heartbeat received (which happens constantly) to value changed (which might not happen for long periods of time if the TWC is not in use), which should have removed the need for us to code bespoke rate limiting code per status module as we're only responding to actual changes in values.

I can't remember what the basis of needing this for the MQTT module specifically was - maybe it was due to the reconnect behaviour of the module - I can see a scenario where the constant connections and disconnections in response to minor changes in individual values caused high utilisation for MQTT brokers. If so, that was attempting to fix the symptom and not the cause.

ngardiner commented 1 year ago

And one last option (just in case we needed another option...) if having rate limiting for status modules is important would be to centralise rate limiting code within a function which all the status modules could then call to perform rate limiting. The important point being that the rate limiting code could have a more refined algorithm defined, eg rate limit where values are within a specified delta band (if numerical and value is not zero).

RichieB2B commented 1 year ago

Agreed, central rate limiting is preferred over adding it to all status modules. If rate limiting is needed at all.

I can change the PR to remove the rate limiting from the MQTT status module and leave the code in TWCSlave.py. This is an easy and acceptable fix IMHO.

RichieB2B commented 1 year ago

I've changed my point of view on this. Since every heartbeat already sends out updates for carsCharging, state, etc there is really no harm in also updating totalAmpsInUse and ampsInUse with every heartbeat as well. Even if their value did not change.

I have coded a simple MQTT Prometheus exporter which makes it easy and useful to detect stale MQTT topics. Having a topic only update when the value changes negates such detection.

Either we should update all status messages with every heartbeat or only when their values change but this should then be the case for all status topics IMHO. I'd rather see the former than the latter.