gracebato opened this issue 1 month ago
Here is the regenerated database, where I've removed the time periods for each frame that Grace identified as snowy: opera-disp-s1-consistent-burst-ids-2024-10-11-2016-07-01_to_2024-09-04-unnested.json
Spot-checking frame 835:
$ jq '."835"' < opera-disp-s1-consistent-burst-ids-2024-10-11-2016-07-01_to_2024-09-04-unnested.json | grep 2021
"2021-01-01T23:07:19",
"2021-01-13T23:07:19",
"2021-01-25T23:07:18",
"2021-03-02T23:07:17",
"2021-03-14T23:07:18",
"2021-03-26T23:07:18",
Looks like it's skipping over February 2021 correctly.
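For a slightly broader check than grepping a single year, here is a minimal Python sketch, assuming the unnested file simply maps frame IDs (as strings) to flat lists of ISO-8601 sensing datetimes, as the jq output above suggests:

```python
import json

DB_FILE = "opera-disp-s1-consistent-burst-ids-2024-10-11-2016-07-01_to_2024-09-04-unnested.json"
FRAME = "835"

with open(DB_FILE) as f:
    db = json.load(f)

# List any sensing datetimes that remain in the removed (snowy) February 2021 window.
feb_2021 = [t for t in db[FRAME] if t.startswith("2021-02")]
print(f"Frame {FRAME}: {len(feb_2021)} sensing datetime(s) left in 2021-02 -> {feb_2021}")
```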
opera-disp-s1-consistent-burst-ids-2024-10-11-2016-07-01_to_2024-09-04.json
@philipjyoon Here is the version structured as {"metadata": {...}, "data": {...the 'unnested' JSON version...}}
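A minimal sketch of reading either layout; the key names are taken from the description above, not from a documented schema:

```python
import json

def load_frame_db(path: str) -> dict:
    """Return the frame-ID -> sensing-datetimes mapping from either file layout.

    Assumption: the nested layout wraps the unnested mapping under a top-level
    "data" key next to "metadata", as described above.
    """
    with open(path) as f:
        obj = json.load(f)
    return obj["data"] if isinstance(obj, dict) and "data" in obj and "metadata" in obj else obj
```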
Will be using this batch_proc:
{
"enabled": true,
"label": "PST_Request_58",
"processing_mode": "historical",
"include_regions": "",
"exclude_regions": "",
"temporal": true,
"data_start_date": "2016-07-01T00:00:00",
"data_end_date": "2021-02-01T00:00:00",
"k": 15,
"m": 6,
"frames": [8622, 33065, 36542, 42779],
"wait_between_acq_cycles_mins": 5,
"job_type": "cslc_query_hist",
"provider_name": "ASF",
"job_queue": "opera-job_worker-cslc_data_query_hist",
"download_job_queue": "opera-job_worker-cslc_data_download_hist",
"chunk_size": 1
}
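Before enabling a batch_proc like the one above, a lightweight sanity check can catch obvious mistakes. The field names below come from the JSON above; the checks themselves (and the local filename) are just assumptions about what is worth verifying:

```python
import json
from datetime import datetime

def check_batch_proc(path: str) -> None:
    """Lightweight sanity checks on a batch_proc JSON before enabling it."""
    with open(path) as f:
        bp = json.load(f)
    start = datetime.fromisoformat(bp["data_start_date"])
    end = datetime.fromisoformat(bp["data_end_date"])
    assert start < end, "data_start_date must precede data_end_date"
    assert bp["k"] > 0 and bp["m"] > 0, "k and m must be positive"
    assert bp["frames"], "frames list must not be empty"
    print(f"{bp['label']}: {len(bp['frames'])} frame(s), {start.date()} to {end.date()}")

check_batch_proc("pst_request_58_batch_proc.json")  # hypothetical local filename
```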
Processing has started.
Thanks @scottstanie, both file formats are good on our side too.
Unfortunately, the unnested database file contained sensing dates for HH-polarization data for frame 8622, which caused the query job to fail.
We can find the offending sensing datetime by running the following:
(mozart) hysdsops@opera-pst-mozart-fwd:~/pst_requests/request_58$ python ~/mozart/ops/opera-pcm/tools/disp_s1_burst_db_tool.py validate 8622
...
Acquisition cycle 972 of sensing time 2019-03-15 22:50:55 is good
Acquisition cycle 984 of sensing time 2019-03-27 22:50:55 is good
Acquisition cycle 996 of sensing time 2019-04-08 22:50:55 is good
Acquisition cycle 1008 is missing 27 bursts: {'T033-068977-IW3', 'T033-068974-IW1', 'T033-068974-IW2', 'T033-068971-IW3', 'T033-068976-IW2', 'T033-068977-IW1', 'T033-068976-IW1', 'T033-068977-IW2', 'T033-068969-IW2', 'T033-068975-IW1', 'T033-068975-IW2', 'T033-068975-IW3', 'T033-068969-IW3', 'T033-068973-IW3', 'T033-068970-IW3', 'T033-068974-IW3', 'T033-068972-IW2', 'T033-068972-IW3', 'T033-068973-IW1', 'T033-068971-IW1', 'T033-068970-IW1', 'T033-068971-IW2', 'T033-068969-IW1', 'T033-068973-IW2', 'T033-068972-IW1', 'T033-068970-IW2', 'T033-068976-IW3'}
Granules for acquisition cycle 1008 found: []
Acquisition cycle 1020 of sensing time 2019-05-02 22:50:56 is good
Acquisition cycle 1032 of sensing time 2019-05-14 22:50:57 is good
Acquisition cycle 1044 of sensing time 2019-05-26 22:50:58 is good
We can see that for frame 8622 the data from acquisition cycle 1008 (one of whose burst IDs is T033-068977-IW3) indeed consists of HH polarization.
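As a side note, one rough way to confirm the polarization of granules returned by CMR is to inspect their granule IDs. This sketch assumes the CSLC-S1 granule ID carries the polarization as its own underscore-delimited token; adjust if the naming convention differs:

```python
def granule_polarization(granule_id: str) -> str:
    """Pull the polarization token out of a CSLC-S1 granule ID.

    Assumption: polarization appears as a standalone underscore-delimited
    token (e.g. "VV" or "HH") in the granule ID.
    """
    for token in granule_id.split("_"):
        if token in ("VV", "HH", "VH", "HV"):
            return token
    return "unknown"

# Hypothetical usage against granule IDs returned by a CMR query:
#   hh_only = [g for g in granule_ids if granule_polarization(g) == "HH"]
```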
So there is a mismatch between what the database file says and what is in CMR: we should be ignoring all HH data, but the database file does not in this case. Therefore, we need to fix the database file by hand and upload it to S3 so the query jobs can continue. There must be a bug in the code that generated the database file; once we fix that, we shouldn't see this issue again.
To do this, we take out the offending sensing datetime, which is 2019-04-20, and then use this new file to overwrite the file that's in S3.
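A minimal sketch of that hand edit, assuming the unnested layout shown above (frame ID -> list of ISO-8601 sensing datetimes); the output filename is only an example, since in practice we overwrite the copy in S3:

```python
import json

DB_FILE = "opera-disp-s1-consistent-burst-ids-2024-10-11-2016-07-01_to_2024-09-04-unnested.json"
FIXED_FILE = "fixed-" + DB_FILE  # example name only
FRAME = "8622"
BAD_DATE = "2019-04-20"

with open(DB_FILE) as f:
    db = json.load(f)

# Drop every sensing datetime for the frame that falls on the offending date.
before = len(db[FRAME])
db[FRAME] = [t for t in db[FRAME] if not t.startswith(BAD_DATE)]
print(f"Frame {FRAME}: removed {before - len(db[FRAME])} sensing datetime(s) on {BAD_DATE}")

with open(FIXED_FILE, "w") as f:
    json.dump(db, f, indent=2)
```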
Ideally we should rename this file to something else, but we're still refining our process and development and are also sensitive to processing time right now. We will keep the same file name so that we don't have to redeploy the settings.yaml file, which takes ~20 minutes and, more importantly, forces all existing jobs to restart and lose their progress.
Before and after this change to the historical database file (note the last sensing datetime of K-cycle 4):

Before:
python ~/mozart/ops/opera-pcm/tools/disp_s1_burst_db_tool.py frame 8622 --k=15
...
K-cycle 4 ['2018-11-03T22:50:58', ..., '2019-05-02T22:50:56']
...

After:
python ~/mozart/ops/opera-pcm/tools/disp_s1_burst_db_tool.py frame 8622 --k=15
...
K-cycle 4 ['2018-11-03T22:50:58', ..., '2019-05-14T22:50:57']
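Why the last date moves: a minimal sketch, assuming k-cycles are simply consecutive groups of k=15 sensing datetimes taken from the frame's list (which is consistent with the tool output above); this is an illustration, not the PCM implementation:

```python
def k_cycles(sensing_times: list[str], k: int = 15) -> list[list[str]]:
    """Group a frame's sensing datetimes into consecutive k-sized cycles."""
    return [sensing_times[i:i + k] for i in range(0, len(sensing_times), k)]

# Removing the 2019-04-20 entry shifts every later date up one slot, which is
# why K-cycle 4 now ends on 2019-05-14 instead of 2019-05-02.
```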
Finally, we have to submit a new daac_data_subscriber command to replace the failed job, with a new end date that covers the new last sensing datetime of K-cycle 4 (2019-05-14). After this, the historical processor will do the right thing going forward.
python data_subscriber/daac_data_subscriber.py query --collection-shortname=OPERA_L2_CSLC-S1_V1 --endpoint=OPS --start-date=2018-11-02T04:00:32Z --end-date=2019-05-15T05:00:35Z --release-version=3.1.0-rc.6.0 --job-queue=opera-job_worker-cslc_data_download_hist --chunk-size=1 --k=15 --m=6 --use-temporal --max-revision=1000 --processing-mode=historical --frame-id=8622 --transfer-protocol=auto
I forgot to perform one last step, which is to restart run_disp_s1_historical_processing.py. This application loads the database file once at the beginning of execution and keeps using that in-memory copy, so restarting it forces it to re-load the newly modified file.
@gracebato requested an extended run just for frame 42779. Because we've already partially run this frame, we need to create a new batch_proc with just that frame and, more importantly, carry over its frame_state so that we don't submit previously completed jobs again.
{
"enabled": true,
"label": "PST_Request_58_42279",
"processing_mode": "historical",
"include_regions": "",
"exclude_regions": "",
"temporal": true,
"data_start_date": "2016-07-01T00:00:00",
"data_end_date": "2024-01-01T00:00:00",
"k": 15,
"m": 6,
"frames": [42779],
"frame_states": {"42779": 15},
"wait_between_acq_cycles_mins": 5,
"job_type": "cslc_query_hist",
"provider_name": "ASF",
"job_queue": "opera-job_worker-cslc_data_query_hist",
"download_job_queue": "opera-job_worker-cslc_data_download_hist",
"chunk_size": 1
}
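A small consistency check on the new batch_proc, to confirm the carried-over frame_state entry actually refers to a frame listed in it. The check is an assumption about what is worth verifying, not a PCM requirement, and the filename is hypothetical:

```python
import json

def check_frame_states(path: str) -> None:
    """Confirm every frame_states key corresponds to a frame in the frames list."""
    with open(path) as f:
        bp = json.load(f)
    frames = {str(fr) for fr in bp.get("frames", [])}
    states = set(bp.get("frame_states", {}))
    assert states <= frames, f"frame_states for unknown frames: {states - frames}"
    print(f"{bp['label']}: frame_states OK for {sorted(states)}")

check_frame_states("pst_request_58_42779_batch_proc.json")  # hypothetical filename
```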
Venue
PST
Product
DISP-S1
SAS Version
No response
SDS Version
PCM version 3.1.0-rc.6.0
Input Data
Process the following frames with the blackout-winter-dates database using PCM version 3.1.0-rc.6.0. Use a similar config as https://github.com/nasa/opera-sds/issues/55#issuecomment-2400451423, i.e.:
Share Results
Additional Notes
No response