Futureproof database queries

viktorChekhovoi commented 1 year ago

Currently we allow the user to enter any possible depth, and use the measurement interval in the database to pick the closest depth that actually has measurements there. However, if the measurement interval was changed in the future, it might cause some of the queries to output false data. Modifying the queries to use a different mechanism for picking depth will make the project more reliable long-term.

viktorChekhovoi commented 1 year ago

Another way to achieve this would be to query each measurement configuration separately. These configurations are stored in the database and contain general information about the DTS unit, including the measurement interval. Sending a separate query for each configuration can resolve this issue as well.

friesenh commented 1 year ago

It would also be helpful to talk with Bruce Duffy, the Technical Assistant in Physics and Astronomy and the maintainer of the database and this project, about these configuration concerns, as he is the one who brought our attention to this issue.

ababinet commented 1 year ago

Another way to achieve this would be to query each measurement configuration separately. These configurations are stored in the database and contain general information about the DTS unit, including the measurement interval. Sending a separate query for each configuration can resolve this issue as well.

Related to the approach of querying the configurations separately, Bruce offered some input into what that could look like: Greetings Alia (and Team),

First email from Bruce

There are ways to do this now without requiring a DB/code change.

But first I wanted to make sure you are aware of the fact that there can be time gaps in the measurements associated with a given dts_config. This can happen for two reasons:

1) An outage, as listed in the new measurement_outage table.

If you use psql and dump the measurement_outage table you will see a number of outages.

An important thing to note is that these measurement_outage rows are channel specific and the channels are specific to their parent dts_config. The most notable case of an outage that affected only one channel was when channel 3's fiber was physically compromised and produced bogus data:

2) Another source of time gaps in a given configuration's period of activity "dts_config swapping" where is that Jonathon or I might tell the XT_DTS device to use configuration A, then tell it to use configuration B, then tell the XT_DTS to use configuration A again.

Historically this has happened once, when I interrupted dts_config 'geothermal-config-2019-05-15' to run another dts_config with "experimental parameters" for Geo student Jake Gallant. I did not load that configuration's measurements to the database because I didn't want to confuse anyone, and instead logged it as another measurement_outage that shows up in the measurement_outage table as having an outage_type of 'other_config':

<SQL>
dts_db=> select * from measurement_outage where outage_type = 'other_config';
id | channel_id | outage_type  | start_datetime_utc  |  end_datetime_utc  
----+------------+--------------+---------------------+---------------------
73 |         15 | other_config | 2020-02-26 18:15:51 | 2020-02-27 19:33:21
74 |         16 | other_config | 2020-02-26 17:57:53 | 2020-02-27 19:45:22
(2 rows)
</SQL>

But in future we could switch between different configurations if, for example, the GEO dept took the XT_DTS on a field trip to measure a lake or a stream, and then returned it to the chapel basement and hooked it back up to the fiber optic cables there. In that scenario I would upload the data from that alternate configuration for another fiber configuration and NOT log it as an outage, but a time gap would still exist in the original configuration's measurements.

So to present the complete picture of dts_config measurement possibly segmented time spans requires finding each dts_config, each channel in that config, and each discrete measurement timespan in that channel.

I can imagine presenting this to the user something like this, where configs could be sorted by the time of the first measurement for any channel belonging to that config:

Config A measurement_interval_s (usually 1800 seconds (30 minutes) step_increment_m (usually 0.25 meters) channel 1 channel length in meters (start date, end date),(start date, end date),(start date, end date), channel 3 channel length in meters (start date, end date),(start date, end date)

Config B measurement_interval_s (usually 1800 seconds (30 minutes) step_increment_m (usually 0.25 meters) channel 1 channel length in meters (start date, end date),(start date, end date),(start date, end date), channel 3 channel length in meters (start date, end date),(start date, end date)

Config C measurement_interval_s (usually 1800 seconds (30 minutes) step_increment_m (usually 0.25 meters) channel 1 channel length in meters (start date, end date),(start date, end date),(start date, end date), channel 3 channel length in meters (start date, end date),(start date, end date)

And then let them select one or more channel time series (start date, end date) tuples on a per channel basis to search from.

Constructing a query to give you the per-config, per-channel timespans

In terms of how to assemble the information needed to present the above to a user I would combine the results from two queries per channel per dts_config.

The first would identify the temporal endpoints or "outer bounds" of the timespan covered by that channel. In other words, the earliest and latest measurement.datetime_utc values for each channel belonging to that dts_config.

The second query would identify the time gaps in each channel's measurements that exceed some time duration threshold. There are two approaches:

1) Implement the future proof solution that covers the 'dts_config switching' scenario described above and leverages the sql my student worker Kimberly Yip wrote to identify the outages that ended up in the measurement_outage table. See the attached file.

2) "Cheat" and use the measurement_outage table. This would work for now, but is not future-proofed against the "dts_config swapping" scenario I described above.

I'm being handwavy here because it's your project :)

Best,

Bruce

Second email from Bruce

Hi Alia and Team,

I've thought some more about tracking outages, specifically how to track outages between the times different configurations are running. In my initial response I basically implied/asserted that outages between different configuration runs should not be tracked in the measurement_outage table. That was certainly what I had in mind when I defined the 'measurement_outage' table to reference a particular channel from a particular configuration and configured it as a NOT NULL column.

Here's the measurement_outage table for reference:

CREATE TYPE dts_schema.outage AS ENUM ('unknown', 'other_config', 'power_outage', 'hardware_failure', 'fiber_failure', 'time_slip');

CREATE TABLE dts_schema.measurement_outage ( id serial primary key NOT NULL,

channel_id INT references dts_schema.channel(id) NOT NULL,   -- 1) If an outage spans dts_config timeframe
                                                                                                            -- then we must define one for each config.
                                                                                                            -- 2) if an outage affects both channels in a
                                                                                                            -- in an XT-DTS configuration (dts_config),
                                                                                                            -- then one record will be created for each
                                                                                                            -- channel.

outage_type outage,                                                                          -- From the enumerated type 'outage'

start_datetime_utc timestamp(0) without time zone NOT NULL,  -- The beginning of an outage time range

end_datetime_utc timestamp(0) without time zone NOT NULL     -- The end of an outage time range.
                                                                                                            -- NOTE: "Ongoing" is represented as any
                                                                                                            -- datetime value in the future such as
                                                                                                            -- "9001-01-01 00:00:00"

);

But I think what you guys REALLY want from the measurement_outages table is a way to track ANY outages, even downtimes between running different configurations or dealing with hardware failure/repair interludes. If that were true then it wouldn't be "cheating" to use the measurement_outage table to identify gaps in coverage (see "Approach 2" in my initial response above).

If that's the case then I would ALTER the table below to make the measurement_outage.channel_id column NULL-able so that an measurement_outage row could exist that is not tied to a configuration:

I would then redefine the existing measurement_outage rows that deal with changing configurations or hardware downtime to not have a channel_id value. That would apply to these rows (the id values may be different than given below):

dts_db=> select * from measurement_outage; id | channel_id | outage_type | start_datetime_utc | end_datetime_utc ----+------------+------------------+---------------------+--------------------- . . . 69 | 16 | unknown | 2019-08-01 17:51:35 | 2019-08-01 19:12:38 70 | 15 | time_slip | 2019-08-01 00:00:00 | 2019-11-23 23:59:59 71 | 16 | time_slip | 2019-08-01 00:00:00 | 2019-11-23 23:59:59 72 | 16 | fiber_failure | 2019-07-07 21:16:19 | 2020-10-21 23:53:25 73 | NULL | other_config | 2020-02-26 18:15:51 | 2020-02-27 19:33:21 74 | NULL | other_config | 2020-02-26 17:57:53 | 2020-02-27 19:45:22 75 | NULL | hardware_failure | 2022-09-04 21:22:02 | 2222-01-01 00:00:00 76 | NULL | hardware_failure | 2022-09-04 21:34:05 | 2222-01-01 00:00:00

This change would accurately record all outages in the system over all the configuration runs, but not make false assertions that a given outage happened during a particular configuration's run when it didn't like the two 'other_config' and two 'hardware_failure' rows currently do.

Does that make sense?

Related to all this is Jonathon Cooper recently received the repaired XT-DTS 19081. It has the same name 'XT-19081' but it's internal computer was entirely replaced and imaged with an upgraded OS and DTS software. So Jonathon and I defined a new configuration for it named 'geothermal-config-2023-01-13'. It's been taking data since 2023-01-13 @ 4:57PM Central Time US (UTC-6:00), and at some point I will start uploading that data to the database.

When I start uploading the new data with the new config I will also change the last two 'hardware_failure' measurement_outage rows above by changing the 'end_datetime_utc' value to be 31 minutes before the first measurement taken by the new XT-DTS.

Bruce measurement-time-gap-v3-with-verification.txt

acdalal / geothermal

Futureproof database queries #10

First email from Bruce

Second email from Bruce