PIC-IRIS / PH5

Library of PH5 clients, apis, and utilities

[BUG] ph5availability output for SmartSolo does not cover all expected times #499

Closed hrotman-pic closed 1 year ago

hrotman-pic commented 2 years ago

Describe the bug (overview)

Output from ph5availability for SmartSolo experiments is not consistent with expectations (i.e., continuous recording) or with the windows listed in the DAS table: some of those windows are skipped. For validated archives at the DMC, I ran queries on both the PIC production server and PH5WS. As far as I can determine, PH5WS availability matches the PIC production server ph5availability query to the level of seconds, but not necessarily to all fractions of a second.

I present multiple examples in this issue from three different SmartSolo experiments, using both PIC production server queries and PH5WS availability queries, and covering both the entire deployment of a station and a window of a few hours. To save space, the examples shown in the issue body are availability queries for one channel over a few hours of an example station's deployment. For the same reason, screenshots of the DAS table are limited in length and show lines that represent the maximum number of windows. Availability results for the entire deployment and all channels of each example station are attached as text files below. I have used some markdown options in an attempt to improve readability of this relatively long issue.

Environment (please complete the following information):

To Reproduce (Examples and Screenshots)

Experiment 21-001 (ZV.2020)

This experiment used both Fairfields and SmartSolos, and I have included a ph5availability example from one station of each type.

  1. Fairfield, Station 2001

    • PIC server: ph5availability -n master.ph5 --station 2001 -s 2020-09-30T04:00:00 -e 2020-09-30T08:00:00 -c GP1 -S -a 2 -f t
      • n s l c q sample-rate earliest latest

      • ZV 2001 -- GP1 1000.0 2020-09-30T04:00:00.000000Z 2020-09-30T08:00:00.000000Z

Here is a screenshot showing part of the DAS table for this station. 21-001_stn2001_DAStable

  2. SmartSolo, Station 2002

Here is a screenshot showing part of the SmartSolo DAS table. It shows 30 minute windows (see the sample count and sample rate columns) for part of the requested time period, including a 30 minute window starting at 05:53:45 that is not shown in the availability output. 21-001_stn2002_DAStable

Experiment 21-002 (9N.2020)

This experiment used exclusively SmartSolos, and redeployed some nodes to a second station. The examples used are from a node that was redeployed and both stations belong to the same DAS table.

  1. Initial deploy, Station 8162

The screenshot shows part of the DAS table; note there is a window starting at 07:44:18 that is not represented in the availability output. 21-002_stn8162_DAStable

  2. Redeployment, Station C7035 (id_s=7035)

    • PIC server: ph5availability -n master.ph5 --station C7035 -s 2020-11-14T16:00:00 -e 2020-11-14T20:00:00 -c GP1 -S -a 2 -f t
      • n s l c q sample-rate earliest latest

      • 9N C7035 -- GP1 1000.0 2020-11-14T16:00:00.000000Z 2020-11-14T16:00:00.031998Z
      • 9N C7035 -- GP1 1000.0 2020-11-14T16:00:00.032000Z 2020-11-14T16:30:00.032999Z
      • 9N C7035 -- GP1 1000.0 2020-11-14T17:00:00.033999Z 2020-11-14T17:30:00.034999Z
      • 9N C7035 -- GP1 1000.0 2020-11-14T17:30:00.035000Z 2020-11-14T18:30:00.036999Z
      • 9N C7035 -- GP1 1000.0 2020-11-14T19:00:00.038000Z 2020-11-14T20:00:00.000000Z

This screenshot shows there are windows starting at 16:30:00 and 18:30:00 that are not in the availability output. 21-002_stnC7035_DAStable
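The skipped windows can also be picked out of the text output programmatically. The following is a minimal sketch, not part of PH5: it parses the C7035 rows printed above and reports any jump between one row's latest time and the next row's earliest time that exceeds roughly one sample interval. The names `parse_rows` and `find_gaps` and the 1.5-sample tolerance are assumptions made for illustration.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"

# Rows copied from the C7035 availability output in this issue.
SAMPLE = """
n s l c q sample-rate earliest latest
9N C7035 -- GP1 1000.0 2020-11-14T16:00:00.000000Z 2020-11-14T16:00:00.031998Z
9N C7035 -- GP1 1000.0 2020-11-14T16:00:00.032000Z 2020-11-14T16:30:00.032999Z
9N C7035 -- GP1 1000.0 2020-11-14T17:00:00.033999Z 2020-11-14T17:30:00.034999Z
9N C7035 -- GP1 1000.0 2020-11-14T17:30:00.035000Z 2020-11-14T18:30:00.036999Z
9N C7035 -- GP1 1000.0 2020-11-14T19:00:00.038000Z 2020-11-14T20:00:00.000000Z
"""


def parse_rows(text):
    """Return (sample_rate, earliest, latest) tuples from availability text rows."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Data rows have 7 whitespace-separated fields (the quality column is
        # blank); the 8-field header line is skipped.
        if len(fields) != 7:
            continue
        rate = float(fields[4])
        earliest = datetime.strptime(fields[5], TIME_FORMAT)
        latest = datetime.strptime(fields[6], TIME_FORMAT)
        rows.append((rate, earliest, latest))
    return rows


def find_gaps(rows, tolerance_samples=1.5):
    """Return (gap_start, gap_end) pairs wider than the tolerance (hypothetical helper)."""
    gaps = []
    ordered = sorted(rows, key=lambda row: row[1])
    for (rate, _, prev_end), (_, next_start, _) in zip(ordered, ordered[1:]):
        if next_start - prev_end > timedelta(seconds=tolerance_samples / rate):
            gaps.append((prev_end, next_start))
    return gaps


if __name__ == "__main__":
    for gap_start, gap_end in find_gaps(parse_rows(SAMPLE)):
        print(gap_start.isoformat(), "->", gap_end.isoformat())
```

For the rows above, this reports two jumps of about 30 minutes each, beginning at 16:30:00 and 18:30:00, which line up with the DAS table windows missing from the availability output.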

Experiment 21-014 (ZV.2021)

This experiment used exclusively SmartSolos, and deployed almost all nodes to three different stations. The examples used are from a node that was deployed to three different stations, and all stations belong to the same DAS table.

  1. First deployment, Station 1002
    • PIC server: ph5availability -n master.ph5 --station 1002 -s 2021-05-13T12:00:00 -e 2021-05-13T16:00:00 -c GP1 -S -a 2 -f t
      • n s l c q sample-rate earliest latest

      • ZV 1002 -- GP1 2000.0 2021-05-13T12:00:00.000000Z 2021-05-13T12:00:00.000499Z
      • ZV 1002 -- GP1 2000.0 2021-05-13T12:30:00.000000Z 2021-05-13T13:00:00.000499Z
      • ZV 1002 -- GP1 2000.0 2021-05-13T13:30:00.000000Z 2021-05-13T14:00:00.000499Z
      • ZV 1002 -- GP1 2000.0 2021-05-13T14:30:00.000000Z 2021-05-13T15:00:00.000499Z
      • ZV 1002 -- GP1 2000.0 2021-05-13T15:30:00.000000Z 2021-05-13T16:00:00.000000Z

Screenshot showing part of the DAS table for this time period: 21-014_stn1002_DAStable

  2. Second deployment, Station 6602
    • PIC server: ph5availability -n master.ph5 --station 6602 -s 2021-05-18T06:00:00 -e 2021-05-18T10:00:00 -c GP1 -S -a 2 -f t
      • n s l c q sample-rate earliest latest

      • ZV 6602 -- GP1 2000.0 2021-05-18T06:00:00.000000Z 2021-05-18T06:00:00.000499Z
      • ZV 6602 -- GP1 2000.0 2021-05-18T06:30:00.000000Z 2021-05-18T07:00:00.000499Z
      • ZV 6602 -- GP1 2000.0 2021-05-18T07:30:00.000000Z 2021-05-18T08:00:00.000499Z
      • ZV 6602 -- GP1 2000.0 2021-05-18T08:30:00.000000Z 2021-05-18T09:00:00.000499Z
      • ZV 6602 -- GP1 2000.0 2021-05-18T09:30:00.000000Z 2021-05-18T10:00:00.000000Z

Screenshot showing part of the DAS table for this time period: 21-014_stn6602_DAStable

  3. Third deployment, Station 7076
    • PIC server: ph5availability -n master.ph5 --station 7076 -s 2021-05-20T10:00:00 -e 2021-05-20T14:00:00 -c GP1 -S -a 2 -f t
      • n s l c q sample-rate earliest latest

      • ZV 7076 -- GP1 2000.0 2021-05-20T10:00:00.000000Z 2021-05-20T10:04:22.000499Z
      • ZV 7076 -- GP1 2000.0 2021-05-20T10:34:22.000000Z 2021-05-20T11:04:22.000499Z
      • ZV 7076 -- GP1 2000.0 2021-05-20T11:34:22.000000Z 2021-05-20T12:04:22.000499Z
      • ZV 7076 -- GP1 2000.0 2021-05-20T12:34:22.000000Z 2021-05-20T13:04:22.000499Z
      • ZV 7076 -- GP1 2000.0 2021-05-20T13:34:22.000000Z 2021-05-20T14:00:00.000000Z

Screenshot showing part of the DAS table for this time period: 21-014_stn7076_DAStable

Expected behavior

ph5availability should display all timespans that are within the DAS tables and metadata and that fall within any time or other limits specified by the query.

Please let me know if providing additional information, such as stations from a different DAS table in one or more of the example experiments, would be helpful.

21-001_stn2001_availability.txt
21-001_stn2001_ph5ws-availability_2022-02-10T155151Z.txt
21-001_stn2002_availability.txt
21-001_stn2002_ph5ws-availability_2022-02-10T155311Z.txt
21-002_stn8162_availability.txt
21-002_stn8162_ph5ws-availability_2022-02-10T161530Z.txt
21-002_stnC7035_availability.txt
21-002_stnC7035_ph5ws-availability_2022-02-10T161704Z.txt
21-014_stn1002_availability.txt
21-014_stn6602_availability.txt
21-014_stn7076_availability.txt

hrotman-pic commented 2 years ago

As requested, here are examples from the rebuild of 21-002, starting with the short examples in the original issue. This rebuild was started in late July, using new SEGD files written out in July.

Station 8162: ph5availability -n master.ph5 --station 8162 -s 2020-11-08T06:00:00 -e 2020-11-08T10:00:00 -c GP1 -S -a 2 -f t

n s l c q sample-rate earliest latest

9N 8162 -- GP1 1000.0 2020-11-08T06:00:00.000000Z 2020-11-08T06:00:00.011999Z
9N 8162 -- GP1 1000.0 2020-11-08T06:00:00.012000Z 2020-11-08T07:30:00.014999Z
9N 8162 -- GP1 1000.0 2020-11-08T07:30:00.015000Z 2020-11-08T09:30:00.018999Z
9N 8162 -- GP1 1000.0 2020-11-08T09:30:00.019000Z 2020-11-08T10:00:00.000000Z

Station C7035: ph5availability -n master.ph5 --station C7035 -s 2020-11-14T16:00:00 -e 2020-11-14T20:00:00 -c GP1 -S -a 2 -f t

n s l c q sample-rate earliest latest

9N C7035 -- GP1 1000.0 2020-11-14T16:00:00.000000Z 2020-11-14T16:00:00.031999Z
9N C7035 -- GP1 1000.0 2020-11-14T16:00:00.032000Z 2020-11-14T17:30:00.034999Z
9N C7035 -- GP1 1000.0 2020-11-14T17:30:00.035000Z 2020-11-14T19:00:00.037999Z
9N C7035 -- GP1 1000.0 2020-11-14T19:00:00.038000Z 2020-11-14T20:00:00.000000Z

The availability windows are different lengths, but all the 30 minute windows are shown.
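To make "different lengths" concrete, the sketch below (not PH5 code) counts how many 30 minute windows each availability row in the two rebuild outputs appears to span. The helper name `windows_per_row` is hypothetical, and the 30 minute window length is taken from the DAS table description earlier in this issue.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"
WINDOW = timedelta(minutes=30)  # window length described for the DAS table

# (earliest, latest) pairs copied from the rebuild outputs in this comment.
ROWS_8162 = [
    ("2020-11-08T06:00:00.000000Z", "2020-11-08T06:00:00.011999Z"),
    ("2020-11-08T06:00:00.012000Z", "2020-11-08T07:30:00.014999Z"),
    ("2020-11-08T07:30:00.015000Z", "2020-11-08T09:30:00.018999Z"),
    ("2020-11-08T09:30:00.019000Z", "2020-11-08T10:00:00.000000Z"),
]
ROWS_C7035 = [
    ("2020-11-14T16:00:00.000000Z", "2020-11-14T16:00:00.031999Z"),
    ("2020-11-14T16:00:00.032000Z", "2020-11-14T17:30:00.034999Z"),
    ("2020-11-14T17:30:00.035000Z", "2020-11-14T19:00:00.037999Z"),
    ("2020-11-14T19:00:00.038000Z", "2020-11-14T20:00:00.000000Z"),
]


def windows_per_row(rows, window=WINDOW):
    """Round each row's time span to a whole number of windows."""
    counts = []
    for earliest, latest in rows:
        span = (datetime.strptime(latest, TIME_FORMAT)
                - datetime.strptime(earliest, TIME_FORMAT))
        counts.append(round(span / window))
    return counts


if __name__ == "__main__":
    print("8162: ", windows_per_row(ROWS_8162))
    print("C7035:", windows_per_row(ROWS_C7035))
```

The first row of each output is only a few milliseconds long and rounds to zero windows. The remaining rows span between one and four windows, so the grouping varies from row to row even though no time is missing.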

This was encouraging, and I requested full availability output. Here are some examples from station 8162 where availability is still not showing all expected times:

9N 8162 -- GP1 1000.0 2020-11-07T22:42:00.009000Z 2020-11-08T00:12:00.011999Z
9N 8162 -- GP1 1000.0 2020-11-08T00:30:00.000999Z 2020-11-08T01:00:00.001999Z

9N 8162 -- GP1 1000.0 2020-11-08T22:30:00.045000Z 2020-11-09T00:00:00.047999Z
9N 8162 -- GP1 1000.0 2020-11-09T00:30:00.000999Z 2020-11-09T01:00:00.001999Z

I checked some other stations and the existence of a gap at the day boundary appears to be consistent. However, there is no gap at the day boundary in the DAS tables, and ph5toms extracts data for these times:

msi -S *GP1*

DCC|2022,223
9N|8162||GP1|2020,312,18:12:00.000000|2020,313,00:12:00.011000||1000|21600012|||||||2022,223
9N|8162||GP1|2020,313,00:00:00.000000|2020,314,00:00:00.047000||1000|86400048|||||||2022,223
9N|8162||GP1|2020,314,00:00:00.000000|2020,315,00:00:00.047000||1000|86400048|||||||2022,223
9N|8162||GP1|2020,315,00:00:00.000000|2020,316,00:00:00.047000||1000|86400048|||||||2022,223
9N|8162||GP1|2020,316,00:00:00.000000|2020,316,16:29:59.999000||1000|59400000|||||||2022,223
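The msi -S listing can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the field positions are read off the lines quoted in this comment rather than from a format specification. It checks that each line's sample count matches its time span at the listed sample rate, and measures how far each line's end time runs past the start of the next line.

```python
from datetime import datetime, timedelta

SYNC_TIME = "%Y,%j,%H:%M:%S.%f"  # year, day of year, time of day

# Lines copied from the msi -S output quoted in this comment.
SYNC_LINES = """
DCC|2022,223
9N|8162||GP1|2020,312,18:12:00.000000|2020,313,00:12:00.011000||1000|21600012|||||||2022,223
9N|8162||GP1|2020,313,00:00:00.000000|2020,314,00:00:00.047000||1000|86400048|||||||2022,223
9N|8162||GP1|2020,314,00:00:00.000000|2020,315,00:00:00.047000||1000|86400048|||||||2022,223
9N|8162||GP1|2020,315,00:00:00.000000|2020,316,00:00:00.047000||1000|86400048|||||||2022,223
9N|8162||GP1|2020,316,00:00:00.000000|2020,316,16:29:59.999000||1000|59400000|||||||2022,223
"""


def parse_sync(text):
    """Return (start, end, sample_rate, sample_count) for each data line."""
    segments = []
    for line in text.strip().splitlines():
        fields = line.split("|")
        if len(fields) < 9 or fields[0] == "DCC":
            continue  # skip the header line
        start = datetime.strptime(fields[4], SYNC_TIME)
        end = datetime.strptime(fields[5], SYNC_TIME)
        segments.append((start, end, float(fields[7]), int(fields[8])))
    return segments


def expected_samples(start, end, rate):
    """Samples needed to cover start..end inclusive at the given rate."""
    microseconds = (end - start) // timedelta(microseconds=1)
    return round(microseconds * rate / 1e6) + 1


def overlaps(segments):
    """Time by which each segment's end passes the next segment's start."""
    ordered = sorted(segments)
    return [prev[1] - nxt[0] for prev, nxt in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    segments = parse_sync(SYNC_LINES)
    for start, end, rate, count in segments:
        print(start, end, count == expected_samples(start, end, rate))
    print([str(delta) for delta in overlaps(segments)])
```

On the quoted lines, every sample count is consistent with its time span. The first line runs 12 minutes 0.011 s past the start of the second, and each later day boundary overlaps by 0.047 s.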

I think the (commonly small) overlap at the SEGD day file boundaries is causing the missing time in ph5availability. I'm also not sure what to make of the fact that some 30 minute windows are combined in the availability output, but not always the same number of windows. It looks like a row can be a single window or 2-4 windows combined. That is probably(?) minor, and I'm not sure it's actually a problem.
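To illustrate the kind of handling that would avoid dropping time at an overlap, here is a generic interval merge that treats overlapping segments, and gaps of up to one sample interval, as continuous. This is only a sketch of the idea. It is not how ph5availability is implemented, and this issue does not establish where in the code the time is lost. The function name `merge_segments` is hypothetical, and the segment times are the first three msi lines quoted above.

```python
from datetime import datetime, timedelta


def merge_segments(segments, sample_rate):
    """Merge (start, end) pairs, joining overlaps and gaps of up to one sample."""
    tolerance = timedelta(seconds=1.0 / sample_rate)
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= tolerance:
            # Overlapping or contiguous: extend the current extent.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(pair) for pair in merged]


# Start and end times from the first three msi -S lines (1000 sps).
SEGMENTS = [
    (datetime(2020, 11, 7, 18, 12, 0), datetime(2020, 11, 8, 0, 12, 0, 11000)),
    (datetime(2020, 11, 8, 0, 0, 0), datetime(2020, 11, 9, 0, 0, 0, 47000)),
    (datetime(2020, 11, 9, 0, 0, 0), datetime(2020, 11, 10, 0, 0, 0, 47000)),
]

if __name__ == "__main__":
    for start, end in merge_segments(SEGMENTS, sample_rate=1000.0):
        print(start.isoformat(), "->", end.isoformat())
```

With this approach the three overlapping day segments collapse into a single extent, whereas a join that requires each segment to start exactly one sample after the previous one ends would not connect them.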

emilylmaher commented 1 year ago

Ran the following command and got the same output as Holly got previously.

ph5availability -n master.ph5 --station 8162 -s 2020-11-08T06:00:00 -e 2020-11-08T10:00:00 -c GP1 -S -a 2 -f t

9N 8162   -- GP1        1000.0 2020-11-08T06:00:00.000000Z 2020-11-08T06:00:00.011999Z
9N 8162   -- GP1        1000.0 2020-11-08T06:00:00.012000Z 2020-11-08T07:30:00.014999Z
9N 8162   -- GP1        1000.0 2020-11-08T07:30:00.015000Z 2020-11-08T09:30:00.018999Z
9N 8162   -- GP1        1000.0 2020-11-08T09:30:00.019000Z 2020-11-08T10:00:00.000000Z

Can Holly advise if this is expected behavior?