thorrak / tiltbridge

Tilt Hydrometer to WiFi Bridge
http://www.tiltbridge.com/
Other
61 stars 27 forks source link

TiltBridge MQTT data inconsistent in terms of time sequence #247

Closed fkroepfl closed 6 months ago

fkroepfl commented 7 months ago

As you can see in the attached screenshots, the data is obviously not being updated as specified in the TiltBridge settings.

You would expect the data to be updated every 30 seconds as set in TiltBridge, wouldn't you?

However, the age of the data increases occasionally up to hours and can probably increase even higher.

At the moment I have no idea how to analyze this in more detail to find out what is causing this behavior.

Is there an easy way to validate the transmitted MQTT data?

Bildschirmfoto 2024-02-06 um 19 56 04 Bildschirmfoto 2024-02-06 um 19 59 12
fkroepfl commented 7 months ago

Thanks to the latest HA version, you can download the sensor data history.

Exported are:

It is noticeable that the contents of "state", i.e. the values with which the HA then works, are inconsistent.

On the one hand, consecutive integers, then again without content or "unknown" and in between also values as they would be expected for the inclination sensors.

The history of this week's values for BLUE and RED can be found in the appendix.

Any ideas?

image homeassistant-datahistory-tilt-blue-and-red.csv

fkroepfl commented 7 months ago

The intervals between the timestamps are obviously between 00:00:03 seconds and 23:23:30 hours, although some of them correspond to the 30 second update rate set on Tiltbridge, most of them, i.e. not only the intervals but also the sensor data, are wrong or misinterpreted for whatever reason.

fkroepfl commented 7 months ago

I now have 2 HA installations in operation. Each with its own TiltBridge in separate networks.

Both data series from last night are inconsistent and also different from each other.

Does anyone have any idea what could be causing this?

image
fkroepfl commented 7 months ago

After I found the MQTT Explorer for HA, my further findings are as follows:

image
fkroepfl commented 7 months ago

After further investigation and a better understanding of what is going on here, it can be assumed that the values for temperature and specific gravity as well as the display are OK so far. (Hence the adaptation of the title).

However, the stream of MQTT messages to be received by TiltBridge repeatedly shows considerable gaps.

Comparing with previous message: + 1 line, - 1 line => „timeStamp" In this example there is a missing block of 1081 seconds, corresponding to 36 missing messages.

{
  "Color": "Red",
  "timeStamp": 24071,
  "timeStamp": 25152,
  "fermunits": "SG",
  "SG": "0.9990",
  "temp": "22.2",
  "tempunits": "C"
}

As can be seen in the "timeStamp" data history, the gaps occur several times within the observation period in a stable network environment and also occur differently for each Tilt under consideration (here RED or BLUE).

image

As far as updating the display in the Home Assistant dashboards is concerned, two situations overlapped.

a) The gaps in the MQTT transmission described here

and

b) The way HA displays the data, i.e. the timestamp in the display is only updated when the value itself changes. (Which happens less often when testing in water than in a real fermentation)

I hope that there is now enough information to find or eliminate the cause of the MQTT transmission gaps with this problem.

@thorrak: Regarding a possible optimization of the display in HA, e.g. by changing the MQTT autodiscovery template, a new issue would have to be opened, right?

fkroepfl commented 7 months ago

Issue title has been changed as this is independent of Home Asisstant.

thorrak commented 7 months ago

Thanks for all the hard work on debugging this. Like Grainfather, MQTT support is something that was added by someone else, and isn’t something that I use personally so tracking down issues like this is quite painful absent data like you’ve provided.

While I don’t know what exactly could be causing this off hand, I have an idea for something that might help. I’ll try to take a stab at this during this coming weekend and will let you know where I land.

fkroepfl commented 7 months ago

🤗

While I don’t know what exactly could be causing this off hand, I have an idea for something that might help. I’ll try to take a stab at this during this coming weekend and will let you know where I land.

One more aspect that occurs to me: can we be sure that TiltBridge is receiving/recording all data shared via BT without errors? If this should also play a role, then of course this cause cannot be detected or assigned on the MQTT broker. 😉

thorrak commented 7 months ago

While it's possible, I don't think that's the case here. TiltBridge caches data for several minutes, and the Tilt broadcasts about once a second, so the TiltBridge would have to be almost at the exact edge of the transmission range.

My guess is that it's similar to the Grainfather issue where there is an internal "data sender" that is persisted between requests, but would benefit from reinitialization. That's where I'm going to start when I get a chance to look at it this weekend.

fkroepfl commented 7 months ago

For your additional information: As I have already mentioned, I have two completely independent installations running in parallel. The following screenshot of the history of the last few days shows even bigger gaps in the second installation. Even though the really big gap only affects one tilt and the data from the other tilt is still being transferred during this period.

image

thorrak commented 7 months ago

I spent some time this morning rewriting the MQTT sender, in addition to upgrading to the latest version of the MQTT library. Although the changes that I made shouldn't have any impact on this issue, it's possible that in cleaning things up it got better -- and it's also possible that the library upgrade had an effect.

The latest firmware is now available as v1.2.2-beta2 if you want to give it a try.

fkroepfl commented 7 months ago

The latest firmware is now available as v1.2.2-beta2 if you want to give it a try.

Will be happy to try it out later today. 🤗

fkroepfl commented 7 months ago

TiltBridge: v1.2.2 [devel] (7884906) is now running on an ESP32_w/o_OLED board. (https://www.berrybase.at/en/esp32-nodemcu-development-board)

One question, because it has happened not only now, but several times before: It usually takes a few power cycles until the board with the TiltBridge firmware starts to transmit. I have noticed this with both boards. Possible causes? The boards themselves? The power supply used? Something else?

image

thorrak commented 7 months ago

How long are you waiting? In order to prevent everything from launching at once, there is a 65 second delay at startup before it will send the first datapoint. That begins after the initial startup processes run, so it could be up to ~90 seconds or so.

fkroepfl commented 7 months ago

How long are you waiting? In order to prevent everything from launching at once, there is a 65 second delay at startup before it will send the first datapoint. That begins after the initial startup processes run, so it could be up to ~90 seconds or so.

Valid question. I can't say at the moment. Sometimes it feels longer, but felt is not measured. 😉 I will pay close attention to the time in the upcoming reboots. 🤗

thorrak commented 7 months ago

If this firmware fixes this issue (#247) then when I push the changes for #248 I'll reorder the timing a bit. I think you're right in that it's unituitive that if MQTT can send every 30 seconds, it shouldn't be one of the later items to get triggered for the initial send. ;)

fkroepfl commented 7 months ago

As I watch the flow of MQTT messages, I also notice a change in the content ...

About TiltBridge: v1.2.2 [devel] (7884906)

{
  "Color": "Red",
  "timeStamp": 7217,
  "fermunits": "SG",
  "SG": "2\u0000.7\u00000",
  "Temp": "2\u0000.7",
  "tempunits": "C",
  "WoB": "2"
}

About TiltBridge: v1.2.1 [new_sender] (58a184c)

{
  "Color": "Red",
  "timeStamp": 96169,
  "fermunits": "SG",
  "SG": "1.0000",
  "Temp": "21.7",
  "tempunits": "C"
}
thorrak commented 7 months ago

The addition of "WoB" (weeks on battery) is something that another user submitted via a PR. The insertion of \u0000 into the temperature and SG strings on the other hand sounds like a bug. I'll take a look at that shortly and see if I can get that resolved.

fkroepfl commented 7 months ago

The addition of "WoB" (weeks on battery) is something that another user submitted via a PR.

I thought so ... 😉

The insertion of \u0000 into the temperature and SG strings on the other hand sounds like a bug. I'll take a look at that shortly and see if I can get that resolved.

I can only confirm this, because the content cannot be displayed by default.

thorrak commented 7 months ago

I just pushed out v1.2.2-beta3 which should contain the fix for the strange unicode injection you noted, as well as the changes discussed in #248.

If this works, please let me know and I'll get it merged.

fkroepfl commented 7 months ago

I just pushed out v1.2.2-beta3 which should contain the fix for the strange unicode injection you noted, as well as the changes discussed in #248.

Looks good at first glance, but there seems to be a device config issue with the device name for tilt "blue" in all 3 config messages ...

{
  "dev_cla": "temperature",
  "unit_of_meas": "°C",
  "ic": "mdi:thermometer-lines",
  "name": "Tilt Temperature - Blue",
  "val_tpl": "{{value_json.Temp}}",
  "uniq_id": "tiltbridge_tiltBlueT",
  "stat_t": "tiltbridge/tilt_Blue",
  "dev": {
>>. "name": "Tilt Red",  << should be BLUE, right? >> 
    "name": "Tilt Blue",
    "ids": "Blue",
    "mdl": "Tilt Hydrometer",
    "mf": "Baron Brew Equipment LLC",
    "sw": "1.2.2",
    "sa": "Brewery",
    "cu": "http://192.168.1.192/"
  },
  "json_attr_t": "tiltbridge/tilt_Blue",
  "json_attr_tpl": "{ \"Uptime\": \"{{ value_json.timeStamp }}\" }\n"
}

homeassistant/sensor/tiltbridge_tilt_Blue/temperature/config
thorrak commented 7 months ago

Ah, oops: https://github.com/thorrak/tiltbridge/blob/8f8e3babedae33b925778102a9bf6dc2840a1412/src/sendData.cpp#L747

The fix is compiling now (but admittedly - I wrote it from my phone on the train, so it’s possible that it won’t work). It will be ready momentarily as v1.2.2-beta4

fkroepfl commented 7 months ago

i.e. it was changed from a fixed string to a variable?

fkroepfl commented 7 months ago

The following user data message was the last message hours ago. It appears that another power reset is required on the second system. Sorry, but in this case it ended after around 425 messages.

{
  "Color": "Red",
  "timeStamp": 12722,
  "fermunits": "SG",
  "SG": "0.9990",
  "Temp": "27.8",
  "tempunits": "C",
  "WoB": "3"
}
thorrak commented 7 months ago

Just to close the loop if anyone finds this - it sounds per your comments in #256 that this is fixed with some of the recent changes, and thus will likely be resolved when PR #246 is merged. Going to link this to that PR and tag it as pending pull.

fkroepfl commented 6 months ago

About TiltBridge: v1.2.2 [no_std_strings] (202037d) For your information: A closer look at the data stream shows only small gaps in the message sequence. What is interesting, however, is the fact that sometimes only data from a single TILT is missing during the gap. image