nasa / opera-sds-pcm

Observational Products for End-Users from Remote Sensing Analysis (OPERA)
Apache License 2.0

[Bug]: Unknown CSLC Query behavior when a frame has no sensing_datetimes in DISP-S1 DB #977

Open philipjyoon opened 3 weeks ago

philipjyoon commented 3 weeks ago

Checked for duplicates

Yes - I've already checked

Describe the bug

Traceback (most recent call last):
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 319, in <module>
    main()
  File "/home/ops/verdi/ops/opera-pcm/util/exec_util.py", line 35, in wrapper
    status = func(*args, **kwargs)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 56, in main
    run(sys.argv)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 99, in run
    results["query"] = run_query(args, token, es_conn, cmr, job_id, settings)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 137, in run_query
    return cmr_query.run_query(args, token, es_conn, cmr, job_id, settings)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/query.py", line 50, in run_query
    granules = self.query_cmr(args, token, cmr, settings, query_timerange, now)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/cslc/cslc_query.py", line 429, in query_cmr
    self.extend_additional_records(all_granules)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/cslc/cslc_query.py", line 80, in extend_additional_records
    parse_cslc_native_id(granule_id, self.burst_to_frames, self.disp_burst_map_hist))
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/cslc_utils.py", line 348, in parse_cslc_native_id
    acquisition_cycles[frame_id] = determine_acquisition_cycle_cslc(acquisition_dts, frame_id, frame_to_bursts)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/cslc_utils.py", line 172, in determine_acquisition_cycle_cslc
    day_index, seconds = sensing_time_day_index(acquisition_dts, frame_number, frame_to_bursts)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/cslc_utils.py", line 85, in sensing_time_day_index
    return (_calculate_sensing_time_day_index(sensing_time, frame.sensing_datetimes[0]))
IndexError: list index out of range

This error occurred because, when the current version of the DISP-S1 database json was created, the update tool did not find all the bursts for frame 28471 that the original claimed it had, so it did not mark any sensing times for that frame. This is the issue that ADT (Scott S) and I have been trying to resolve. The current version of the database json was created by a tool that I wrote; going forward, ADT will take back that responsibility (decided last Friday), provided we give them the entire list of CSLC files from CMR (which I'm working on but running into CMR issues; you've been CCed).

So once we have an updated DISP-S1 database json, this will be a non-issue.
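Until the updated database is available, a quick pre-check could list every frame affected by this condition. The following is only a minimal sketch, assuming the database contents have already been loaded into a plain dict that maps frame ID to its list of sensing times. That layout, the example values, and the name `frames_without_sensing_times` are hypothetical, not the actual DISP-S1 json schema or an existing helper in opera-sds-pcm.

```python
from typing import Dict, List


def frames_without_sensing_times(sensing_times_by_frame: Dict[int, List[str]]) -> List[int]:
    """Return the sorted frame IDs whose sensing-time list is empty."""
    return sorted(
        frame_id
        for frame_id, sensing_times in sensing_times_by_frame.items()
        if not sensing_times
    )


# Hypothetical in-memory data; the real database json layout may differ.
example = {
    28470: ["<some sensing time>"],
    28471: [],  # the frame from this issue: no sensing times recorded
}
print(frames_without_sensing_times(example))  # prints [28471]
```

Running a check like this right after the database file is generated would surface the problem before a query job hits it.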

What did you expect?

Actually not sure. A frame in the DISP-S1 database having no sensing_datetimes is an impossibility: if it were true, that frame would not belong in the database at all. Probably best to write out a meaningful error to the user, with no behavior change from current.
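One way the "meaningful error" could look is sketched below. This is not the project's code: `Frame`, `MissingSensingDatetimesError`, and `first_sensing_datetime` are hypothetical names made up for illustration, and the real frame object in `cslc_utils.py` may be shaped differently. The exception subclasses `IndexError`, so the job would still fail the same way and any existing handler would still catch it; only the message changes.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


class MissingSensingDatetimesError(IndexError):
    """Raised when a frame has no sensing datetimes (hypothetical name)."""


@dataclass
class Frame:
    # Hypothetical stand-in for the frame record read from the database.
    frame_id: int
    sensing_datetimes: List[datetime] = field(default_factory=list)


def first_sensing_datetime(frame: Frame) -> datetime:
    """Return the first sensing datetime, or fail with a descriptive error."""
    if not frame.sensing_datetimes:
        raise MissingSensingDatetimesError(
            f"Frame {frame.frame_id} has no sensing_datetimes in the DISP-S1 "
            "database; the database file is likely incomplete for this frame."
        )
    return frame.sensing_datetimes[0]


try:
    first_sensing_datetime(Frame(frame_id=28471))
except IndexError as err:  # still an IndexError, now with context
    print(err)
```

Keeping the exception type compatible is a deliberate choice here: it matches the "no behavior change" expectation while telling the operator which frame and which file to look at.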

Reproducible steps

.py', 'query', '--collection-shortname=OPERA_L2_CSLC-S1_V1', '--endpoint=OPS', '--start-date=2024-09-03T08:01:09Z', '--end-date=2024-09-03T09:01:09Z', '--release-version=3.1.0-rc.5.0', '--job-queue=opera-job_worker-cslc_data_download', '--chunk-size=1', '--k=5', '--m=1', '--grace-mins=120', '--max-revision=1000', '--temporal-start-date=2024-08-04T09:01:09Z', '--transfer-protocol=auto' using the database file created mid-August 2024

This error can be reduced to the following:
```
burst_id, acquisition_dts, acquisition_cycles, frame_ids = \
        cslc_utils.parse_cslc_native_id(
            "OPERA_L2_CSLC-S1_T107-227769-IW3_20240902T003138Z_20240903T073341Z_S1A_VV_v1.1", burst_to_frames,
            disp_burst_map_hist)
```

Environment

- Version of this software [e.g. vX.Y.Z]
- Operating System: [e.g. MacOSX with Docker Desktop vX.Y]
...