gracebato opened 1 month ago
Parameters should be similar to https://github.com/nasa/opera-sds/issues/53, e.g.
Date range: 20160701 - 20240905
k=15
m=5
@gracebato just to confirm: You'd like these products delivered to ASF UAT, correct? #53 did not request that.
Also, does it make a difference if we process these on our INT venue instead of PST? The difference is that if we process on PST we will keep the products in PST S3 forever, whereas INT S3 is wiped clean on every deployment. So the question is: do these products need to be archived in PST S3 forever, or is delivering to ASF UAT sufficient?
EDIT: After speaking with @LucaCinquini we decided that we will process on the PST venue and deliver to ASF UAT.
We want to make sure that all frames process at least 4 years' worth of data, so I'll process 2016-2021, covering 5 years. If any of the frames still don't have 4 years' worth of data, which is possible, I can extend the time range for those specific frames until that point.
This is a bit of manual work but not too bad. I can predetermine which frames would not have 4 years' worth of data in the first 5 calendar years using the historical database.
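A minimal sketch of that pre-check, assuming the historical database can be exported as a JSON mapping of frame ID to its list of sensing datetimes (the file name and structure here are hypothetical):

```python
# Sketch: flag frames whose sensing dates within 2016-07-01..2021-02-01 span
# less than 4 years. The input file format is an assumption, not the actual
# historical database schema.
import json
from datetime import datetime, timedelta

START = datetime(2016, 7, 1)
END = datetime(2021, 2, 1)
MIN_SPAN = timedelta(days=4 * 365)

# Hypothetical export: {"8622": ["2016-07-06T00:30:00", ...], ...}
with open("frame_sensing_dates.json") as f:
    frame_dates = json.load(f)

for frame, dates in frame_dates.items():
    parsed = sorted(datetime.fromisoformat(d) for d in dates)
    in_window = [d for d in parsed if START <= d <= END]
    if not in_window or in_window[-1] - in_window[0] < MIN_SPAN:
        print(f"frame {frame}: less than 4 years of data in the window; extend its end date")
```

Any frame flagged this way would get its own extended end date, per the plan above.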
Hi @philipjyoon, all DISP-S1 products go to UAT going forward, so request https://github.com/nasa/opera-sds/issues/53 would also go to UAT. Thanks.
This request will be executed in 2 variations with one dependency. @gracebato please correct me if the understanding is incorrect:
1. Using 3.1.0-rc.6.0 (the latest version as of today), process these frames for at least 4 years. Product version is still v0.6.
2. Once 3.1.0-rc.7.0 is released next week, first run the 3 frames in request #53 using product version v0.7 for 2016-2024.
Will use the following batch_proc for Variation 1:
{
"enabled": true,
"label": "PST_Request_55",
"processing_mode": "historical",
"include_regions": "",
"exclude_regions": "",
"temporal": true,
"data_start_date": "2016-07-01T00:00:00",
"data_end_date": "2021-02-01T00:00:00",
"k": 15,
"m": 6,
"frames": [8622, 9156, 12640, 18903, 28486, 33039, 33065, 36542, 42779],
"wait_between_acq_cycles_mins": 10,
"job_type": "cslc_query_hist",
"provider_name": "ASF",
"job_queue": "opera-job_worker-cslc_data_query_hist",
"download_job_queue": "opera-job_worker-cslc_data_download_hist",
"chunk_size": 1
}
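As a side note, a quick way to sanity-check a batch_proc like the one above before enabling it. This is only a sketch: the checks are illustrative rather than the PCM's own validation, and batch_proc.json is assumed to be a local copy of the config.

```python
# Sketch: parse the batch_proc and confirm the time range covers at least
# 4 years of data, per the requirement stated earlier in this thread.
import json
from datetime import datetime

with open("batch_proc.json") as f:   # assumed local copy of the config above
    proc = json.load(f)

start = datetime.fromisoformat(proc["data_start_date"])
end = datetime.fromisoformat(proc["data_end_date"])
span_years = (end - start).days / 365.25

print(f"frames: {proc['frames']}")
print(f"k={proc['k']}, m={proc['m']}, span={span_years:.1f} years")
assert span_years >= 4, "date range covers less than 4 years"
```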
~65% complete as of now.
f28486 has finished. f33039 is taking by far the longest... it's currently only 35% complete, processing around 2018 right now.
80% complete. The rate is about 1% per hour
86% complete. There was some sort of JPL-wide network issue between last night and this morning. It seems to have just resolved and we've resumed processing.
frame_completion_percentages ['33039: 62%', '9156: 82%', '8622: 94%', '28486: 100%', '36542: 90%', '18903: 88%', '33065: 89%', '12640: 99%', '42779: 87%']
last_processed_datetimes {'33039': '2019-08-07T04:30:10', '9156': '2020-08-01T02:07:32', '8622': '2020-11-04T22:51:11', '28486': '2021-01-21T00:36:31', '36542': '2020-10-07T01:59:19', '18903': '2020-09-26T13:51:31', '33065': '2020-10-12T04:39:58', '12640': '2021-01-04T23:28:50', '42779': '2020-09-26T16:13:06'}
progress_percentage 86%
Logging into the SCIFLO verdi machines and manually killing the CloudWatch agent service, which uses up one whole CPU core. This frees up that core for the actual DISP-S1 processing. The next OPERA PCM release will have a fix for this CloudWatch agent inefficiency:
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a stop
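To avoid doing that host by host, something along these lines could push the same stop command to every SCIFLO verdi instance. This is a sketch only; the host names and SSH access are assumptions.

```python
# Sketch: stop the CloudWatch agent on each SCIFLO verdi host over SSH.
# Host names are placeholders; in practice they would come from the venue's inventory.
import subprocess

HOSTS = [
    "verdi-sciflo-1.example.jpl.nasa.gov",
    "verdi-sciflo-2.example.jpl.nasa.gov",
]
STOP_CMD = "sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a stop"

for host in HOSTS:
    print(f"stopping CloudWatch agent on {host}")
    subprocess.run(["ssh", host, STOP_CMD], check=True)
```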
Processing is complete. Here is the listing of all products:
Due to an operator error while working around a database file issue (which is now fixed going forward), we need to reprocess the last 4 runs of frame 8622, starting with the query job that generated the following batch ids:
f8622_a1032 f8622_a1020 f8622_a996 f8622_a984 f8622_a972 f8622_a960 f8622_a948 f8622_a936 f8622_a924 f8622_a912 f8622_a900 f8622_a888 f8622_a864 f8622_a852 f8622_a840
To do this, we will need to perform the following actions:
1. Delete the Compressed CSLC records in grq_1_l2_cslc_s1_compressed that were generated by the SCIFLO runs that we are going to re-run.
2. Create a new batch_proc with "frame_states": {"8622": 60}; data_start_date can be the original start date in 2016. This way the processing will happen with sensing dates 61 to 75 in the frame 8622 series:
{
"enabled": true,
"label": "PST_Request_55_partial_8622",
"processing_mode": "historical",
"include_regions": "",
"exclude_regions": "",
"temporal": true,
"data_start_date": "2016-07-01T00:00:00",
"data_end_date": "2021-02-01T00:00:00",
"k": 15,
"m": 6,
"frames": [8622],
"frame_states": {"8622": 60},
"wait_between_acq_cycles_mins": 5,
"job_type": "cslc_query_hist",
"provider_name": "ASF",
"job_queue": "opera-job_worker-cslc_data_query_hist",
"download_job_queue": "opera-job_worker-cslc_data_download_hist",
"chunk_size": 1
}
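For clarity, a small sketch of the arithmetic behind that frame_states setting, assuming frame_state counts the sensing dates already processed and k is the number of sensing dates consumed per run:

```python
# Sketch: with frame_state = 60 and k = 15, the next run consumes
# sensing dates 61 through 75 of the frame 8622 series.
def next_run_range(frame_state: int, k: int) -> tuple[int, int]:
    """Return the 1-based sensing-date indices covered by the next run."""
    return frame_state + 1, frame_state + k

print(next_run_range(60, 15))   # (61, 75)
```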
Reprocessing of the last 4 runs of frame 8622 has started.
Had to stop and restart because I hadn't deleted the compressed CSLCs from those 4 incorrect runs.
We can use Tosca to delete unwanted Compressed CSLC records. In this case we want to delete all C-CSLC products that have the reference date 20181103T000000Z.
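Before deleting anything in Tosca, it can help to list the candidate records directly from the grq_1_l2_cslc_s1_compressed index. This is only a sketch: the Elasticsearch endpoint URL is a placeholder, and the query simply matches the reference-date string anywhere in the document rather than assuming a specific metadata field name.

```python
# Sketch: list Compressed CSLC documents containing the reference date
# 20181103T000000Z so they can be verified before deletion in Tosca.
import requests

GRQ_ES_URL = "https://grq-es.example.jpl.nasa.gov:9200"   # placeholder endpoint
query = {
    # Broad match on the reference-date string; narrow to the real metadata
    # field once confirmed in the index mapping.
    "query": {"query_string": {"query": "\"20181103T000000Z\""}},
    "size": 1000,
}
resp = requests.post(f"{GRQ_ES_URL}/grq_1_l2_cslc_s1_compressed/_search", json=query, timeout=30)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_id"])
```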
Reprocessing of the last 4 runs of frame 8622 was successful. Below is the corrected listing of all products from this run.
Processing started on 10-28. Still processing as of 10-30.
Venue
PST
Product
DISP-S1
SAS Version
No response
SDS Version
No response
Input Data
High Priority Frames:
Share Results
Additional Notes
F11116 and F08882 were already processed in: https://github.com/nasa/opera-sds/issues/53