pvvx / ATC_MiThermometer

Custom firmware for the Xiaomi Thermometers and Telink Flasher
https://github.com/pvvx/pvvx.github.io/tree/master/ATC_MiThermometer

[Feature request] Please make log averaging optional #358

Open joshuakraemer opened 1 year ago

joshuakraemer commented 1 year ago

At the moment, the log data is always stored as the average of n measurements. Instead, I would like to simply store every nth measurement without averaging. With averaging, it is impossible to compare or combine logged and published measurements, and it is also impossible to tell whether the temperature was above or below a given limit at a certain time point. Could you please introduce an option to enable/disable averaging? By the way, thank you very much for your work on this great firmware!

pvvx commented 1 year ago
  1. The available flash volume allows saving only 19632 measurements.
  2. Each recording requires additional battery current.
  3. Reading and transmitting the stored values also requires additional battery current. Frequent readouts of all records severely drain the battery. On a good BT adapter, reading all stored values takes seconds; on a poor ESP32, it takes minutes. Getting 50 measurements on an ESP32 takes about 12 seconds [screenshot: thermometer current, mA/ms], versus 1.2..1.5 seconds with a BT adapter or smartphone [screenshot: mA/ms scale]. A few full readouts over an ESP32 are enough to discharge the battery. The problem is the BLE processing speed of the ESP32: the thermometer has to retransmit packets repeatedly because the ESP32 cannot process them in time.

As a result, the proposed functionality is a niche case, since a hardware modification is desirable for it: connecting an additional power source. Under such conditions, it is not difficult to modify the source code yourself. But it's not for everyone...
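To put the flash figure in perspective, here is a back-of-the-envelope sketch, assuming the 19632-slot capacity quoted above and one stored record per logging interval (the helper name is hypothetical, not firmware API):

```python
# Rough capacity estimate for the on-flash log, assuming the 19632-slot
# figure quoted above and one stored record per logging interval.
SLOTS = 19632

def log_duration_days(interval_minutes: float) -> float:
    """Days until the 19632-entry log wraps around."""
    return SLOTS * interval_minutes / (60 * 24)

for interval in (1, 5, 30):
    print(f"{interval:>2} min interval -> {log_duration_days(interval):6.1f} days")
# 30 min interval -> about 409 days; 1 min interval -> only about 13.6 days
```

This is why a 1x (every-measurement) log fills the flash quickly, while a 30-minute interval lasts over a year.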

joshuakraemer commented 1 year ago

Thanks for your answer. I think I didn't explain my request correctly. I don't want to save all measurements. With a 30-minute logging interval, all measurements inside that interval are currently averaged. Instead, I want just the last of those measurements to be saved. The amount of data would be the same (one measurement every 30 minutes), but the saved value would represent the actual measurement at that time point instead of an average.
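The difference between the two behaviors can be sketched like this (illustrative helpers, not firmware code; with a falling temperature, the windowed average lags behind while the last-sample value matches what was broadcast):

```python
def downsample_average(samples, n):
    """One record per n measurements: mean of the window (current firmware behavior)."""
    return [sum(samples[i:i + n]) / n for i in range(0, len(samples) - n + 1, n)]

def downsample_last(samples, n):
    """One record per n measurements: last value of the window (requested behavior)."""
    return [samples[i + n - 1] for i in range(0, len(samples) - n + 1, n)]

temps = [25.0, 24.0, 20.0, 12.0, 6.0, 2.0]  # falling temperature, 1 sample/min
print(downsample_average(temps, 3))  # roughly [23.0, 6.67] -- lags the real curve
print(downsample_last(temps, 3))     # [20.0, 2.0] -- matches the broadcast values
```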

pvvx commented 1 year ago

Then this goal is lost:

It's also impossible to tell if a temperature was above or below a given limit at a certain timepoint.

As a result, there are many possible options, such as recording the average, minimum, and maximum for the period. All this variety is easier to implement in external software than by loading a weak chip and battery. The thermometers' main use case is operation in a "smart home" system with constant collection and processing of the received measurements. Any logging option can be implemented there, much more easily.
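For example, such per-period statistics are a few lines of server-side code (a sketch; the helper name and data shape are made up for illustration):

```python
from statistics import mean

def summarize(period_samples):
    """Per-period statistics computed server-side instead of on the chip
    (hypothetical helper, not part of the firmware)."""
    return {
        "avg": mean(period_samples),
        "min": min(period_samples),
        "max": max(period_samples),
    }

print(summarize([21.3, 21.7, 22.4, 20.9]))  # avg 21.575, min 20.9, max 22.4
```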

The reaction time of existing thermometers to changes in external temperature and humidity is quite long, which makes recording peak values difficult. After a change, minutes pass before the inner part of the printed circuit board carrying the sensor warms up or cools down. As a result, the obtained values are already averaged and lag behind the real ones.

The sensor has noise in its readings. Software averaging improves accuracy.
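The accuracy gain from averaging can be demonstrated with simulated noise (a sketch assuming Gaussian sensor noise of 0.1 °C, which is an illustrative figure, not a measured one; averaging 16 samples cuts the error by roughly sqrt(16) = 4):

```python
import random
import statistics

random.seed(1)
TRUE_T = 22.00  # "true" temperature; the noise sigma below is an assumption

def noisy_sample():
    """One raw reading with assumed Gaussian sensor noise of 0.1 degC."""
    return TRUE_T + random.gauss(0, 0.10)

# Mean absolute error of a single reading vs. an average of 16 readings.
single = [abs(noisy_sample() - TRUE_T) for _ in range(2000)]
avg16 = [abs(statistics.mean(noisy_sample() for _ in range(16)) - TRUE_T)
         for _ in range(2000)]

print(round(statistics.mean(single), 3))  # roughly 0.08
print(round(statistics.mean(avg16), 3))   # roughly 0.02, i.e. ~1/4 of the above
```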

joshuakraemer commented 1 year ago

Thanks, I understand all of those points. With more values averaged, you get higher accuracy but also higher lag. The choice of averaging is thus a tradeoff. With log averaging, the lag is too high for my use case. This is why I would prefer lower lag, accepting the loss in accuracy.

I've already implemented logging on a central server. The internal memory of the thermometers serves as a backup only. In case of a lost connection, I would like to read the logged values from the thermometer and add them to the central server database to fill the gaps. But because of the additional averaging in the thermometer's log, the data don't match when the temperature is changing (too much lag in the thermometer's log).
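A gap-filling merge of the two logs could look like this (a sketch; the dict-of-timestamps shape and the function name are assumptions for illustration, not the actual server schema):

```python
def fill_gaps(server_log, device_log):
    """Backfill missing timestamps in the server database from the
    thermometer's flash log. Both logs are {unix_timestamp: temperature}
    dicts here; this shape is an assumption for illustration."""
    merged = dict(device_log)
    merged.update(server_log)  # server data wins where both sources overlap
    return dict(sorted(merged.items()))

server = {0: 25.0, 60: 24.8, 300: 20.1}              # gap between 60 and 300
device = {0: 25.0, 120: 23.9, 180: 22.7, 240: 21.4}  # read from flash later
print(fill_gaps(server, device))
```

The mismatch described above arises because the device-side values in the gap are averages, not point measurements, so they cannot be mixed cleanly with the server's broadcast samples.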

pvvx commented 1 year ago

This is why I would prefer lower lag, accepting the loss in accuracy.

In case of a lost connection, I would like to read the logged values from the thermometer and add them to the central server database to fill the gaps.

?

[screenshot: log settings]

No averaging and a complete log.

Sampling one of several values is by no means a correct measurement.

pvvx commented 1 year ago

For clarity, here is how to build a version that measures two analog channels instead of the T-H sensor.

```c
/* Special DIY version - Voltage Logger:
 * Temperature 0..36.00 = ADC pin PB7 input 0..3.6V, LYWSD03MMC pcb mark "B1"
 * Humidity 0..36.00 = ADC pin PC4 input 0..3.6V, LYWSD03MMC pcb mark "P9"
 * Set DIY_ADC_TO_TH 1 */
#define DIY_ADC_TO_TH   1
```

Settings:

[screenshot: settings]

We feed a sine wave from the generator:

[plot: logged sine signal]

A triangular signal from the generator:

[plot: logged triangular signal]

Similarly, with the temperature and humidity sensor, measurements can be taken at intervals down to 63 ms. You just need to fit a more powerful battery or use a 3.3V source.

pvvx commented 1 year ago

In case of a lost connection

Getting the current values from the thermometer does not use a BLE connection. In the latest firmware versions, the connection interval had to be limited for connection stability with various BT adapters; as a result, consumption increases while connected. A connection to the thermometer is used for changing settings and for rare readouts of the log (say, once every 3 months or when changing the battery).

And:

This is a hardware limitation of all thermometer variants powered by CRxxx cells. Short consumption pulses with long pauses, as in BLE advertising mode, allow operation until the battery is completely depleted.

However, due to violations and neglect of the BLE specification for connections, external adapters require a connection start-up period with significant consumption: each time, they poll the entire descriptor table (which should have been kept in memory from the last connection) while setting a short polling interval. A CRxxx battery cannot supply that current for the duration of this start-up phase (until the thermometer sends the adapter a command to increase the connection interval and the adapter switches to it). At that initial moment of the connection, the battery voltage sags so much that the connection is terminated...

This does not happen if a correct program is used in the BLE chip on the receiving side. I cannot change the software in your adapter or in the OS. Not to mention the terrible implementation of BLE in Linux: Linux has never fully mastered Bluetooth 5.0, which was released in 2016.

joshuakraemer commented 1 year ago

To get the values from the thermometer, "BLE connection" is not used.

I'm sorry, I used the wrong terminology. By "lost connection" I meant a situation where the broadcast (advertising) messages are no longer received and logged by the server (for example, because of a power failure of the server), while the thermometer keeps working and can still save measurements to its internal flash memory.

No averaging and a complete log.

Unfortunately, a 1x logging interval doesn't work for me, because the flash memory will be full too soon.

Sampling one of several values is by no means a correct measurement.

I disagree. Averaging is not better than sampling; they are just two different methods with different tradeoffs, suitable for different use cases. In my case, averaging time-series data over several minutes is not correct. The following plot shows a test with the temperature changing from 25°C to 0°C and back to 25°C. Blue dots represent broadcast data (1 min measurement interval) and red dots represent data saved in internal memory (5 min logging interval). As you can see, because of the averaging, the data don't match.

[plot: broadcast (blue) vs. logged (red) temperature during the 25°C → 0°C → 25°C test]

I would really appreciate an option to enable/disable averaging, so the right logging method can be chosen according to the use case.

pvvx commented 1 year ago

The following plot shows a test with temperature changing

Here you need to apply a correction: subtract the time offset of the averaging window, and the graphs will match.
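The suggested correction amounts to re-centering each logged record: a trailing average of n samples represents a time about (n-1)/2 sample intervals earlier than the end of its window. A sketch, assuming the stored timestamp marks the end of the averaging window (the helper name is hypothetical):

```python
def shift_logged_timestamps(logged, n_avg, sample_interval_s):
    """Re-center records that are trailing averages of n_avg samples by
    shifting each timestamp back half the averaging window
    (hypothetical helper; assumes timestamps mark the window's end)."""
    offset = (n_avg - 1) * sample_interval_s / 2
    return {t - offset: v for t, v in logged.items()}

# 5-minute records, each the average of five 1-minute samples
logged = {300: 23.0, 600: 6.7}
print(shift_logged_timestamps(logged, 5, 60))  # {180.0: 23.0, 480.0: 6.7}
```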

(for example, in case of a power failure of the server)

Linux today has another problem: random or unexpected power outages. Flash drives (SD, eMMC, SSD, NVMe, ...) have no quick-disconnect feature, so in such cases there is a very high probability of bricking the drive through loss of its internal layout and firmware. This has only recently started to be addressed, and there is still no full implementation, since it is kernel-related. The only way out is to connect the logging devices to an uninterruptible power supply (UPS). And if this is a "smart home", then it must remain able to deliver notifications and analyze sensor data for the entire period external power is absent; otherwise, what is it for?

The measurement-logging function in the thermometers was introduced because of performance limitations of the systems and HDDs in use. Otherwise, the size of the database starts to exceed all reasonable limits, and searching it, as well as plotting, turns into a long wait. As a result, in Home Assistant, for example, you have to prune the database by time, say to one month. If data for a longer period needs to be restored, the measurement memory can be read from the thermometer itself. Another option is to use a thermometer without a permanent external receiver, letting it record logs that are read out as needed. Buggy, constantly hanging ESPHome setups are not considered here, since the thermometers themselves do not have these problems.

joshuakraemer commented 1 year ago

Thank you very much for your thoughts.

Here it is necessary to apply a correction - subtract the offset of the averaging window and the graphs will match.

This would just produce a nicer-looking chart. It still wouldn't be correct to combine data processed in different ways (broadcast data without averaging, logged data with averaging) and then process or analyze the combined data further.