nasa / opera-sds-pcm

Observational Products for End-Users from Remote Sensing Analysis (OPERA)
Apache License 2.0

Add more debugging information for DISP-S1 triggering jobs #798

Open hhlee445 opened 7 months ago

hhlee445 commented 7 months ago

Checked for duplicates

Yes - I've already checked

Alternatives considered

No - I haven't considered

Related problems

No response

Describe the feature request

We want to add the following information to the logs of the CSLC query/download jobs that trigger DISP-S1 sciflo jobs.

- For a given query, what inputs were seen. We have this in GRQ, but I think having that in the logs would also be useful, especially if it’s easily searchable. E.g. for the FWD queries, this would show the inputs with revision IDs in the query window.
- For a given query, which frames were considered (based on the inputs seen from the previous point).
- Also list which frames were triggered for PGE jobs, along with the reason (e.g. 100% of bursts found, or X% of bursts found and we reached the wait time threshold).
- Also list which frames were NOT triggered for PGE jobs, along with the reason. Reasons would include things like: “only x% of bursts found (less than the X% threshold)”, or “x% of bursts found (surpassing the X% threshold, but <100%), but waiting n more hours (out of N)”, etc.
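To make the requested per-frame reasons concrete, here is a minimal Python sketch of how a trigger decision and its reason string could be produced and logged. Everything in it is an assumption for illustration: the names `TriggerDecision`, `decide_trigger`, and `log_decisions`, the logger name, and the exact decision order are hypothetical and are not taken from the opera-sds-pcm code. The rules only mirror the example reasons listed above.

```python
import logging
from dataclasses import dataclass

# Hypothetical logger name; the real job code may use a different one.
logger = logging.getLogger("disp_s1_trigger_sketch")


@dataclass
class TriggerDecision:
    frame_id: int
    triggered: bool
    reason: str


def decide_trigger(frame_id, bursts_found, bursts_expected,
                   threshold_pct, hours_waited, max_wait_hours):
    """Return a decision plus a human-readable reason (assumed rules)."""
    pct = 100.0 * bursts_found / bursts_expected

    # All bursts present: trigger immediately.
    if bursts_found >= bursts_expected:
        return TriggerDecision(frame_id, True, "100% of bursts found")

    # Below the threshold: never trigger.
    if pct < threshold_pct:
        return TriggerDecision(
            frame_id, False,
            f"only {pct:.1f}% of bursts found "
            f"(less than the {threshold_pct}% threshold)")

    # Above the threshold but incomplete: trigger only after the wait expires.
    if hours_waited >= max_wait_hours:
        return TriggerDecision(
            frame_id, True,
            f"{pct:.1f}% of bursts found, and the wait time threshold "
            f"of {max_wait_hours} hours was reached")

    remaining = max_wait_hours - hours_waited
    return TriggerDecision(
        frame_id, False,
        f"{pct:.1f}% of bursts found (surpassing the {threshold_pct}% "
        f"threshold, but <100%), waiting {remaining} more hours "
        f"(out of {max_wait_hours})")


def log_decisions(decisions):
    """Emit one searchable key=value line per frame."""
    for d in decisions:
        status = "TRIGGERED" if d.triggered else "NOT_TRIGGERED"
        logger.info("frame=%s status=%s reason=%s",
                    d.frame_id, status, d.reason)
```

Writing one `key=value` line per frame keeps the output easy to grep, for example by `status=NOT_TRIGGERED` or by a specific `frame=` value.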

philipjyoon commented 3 months ago

@sjlewis-jpl Do we also want the list of CSLC granules found for "k" purposes? For example, if we are processing with k=15, we will end up getting 27 * 15 CSLC granules from CMR. This would happen for every frame that we decide to trigger. Or do we want to suppress granule information for k purposes and print out granule information solely for the ones that were being considered for triggering?

sjlewis-jpl commented 3 months ago

Very good question Phil. We definitely want the list used when considering triggering, which I expect will allow us to more effectively debug issues with the trigger logic.

Getting all k dates for each burst recorded in the log... I'm not opposed to it, but it's not obvious to me how we will use that information. If you think it's useful, can we record it on a separate line or place from the list used to consider triggering? I think having easy access to those used for triggering will be important.
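One way to keep the two lists apart, sketched here in Python with the standard `logging` module, is to route them through separate child loggers and put the k-history granules at a lower level. The logger names `cslc_query.trigger` and `cslc_query.k_history` and the function `log_granules` are hypothetical, chosen only for this example, and do not come from the existing code.

```python
import logging

# Hypothetical logger names: one for granules considered for triggering,
# one for the granules retrieved only to satisfy the k-date history.
trigger_log = logging.getLogger("cslc_query.trigger")
k_log = logging.getLogger("cslc_query.k_history")


def log_granules(frame_id, trigger_granules, k_granules):
    """Log trigger candidates at INFO and k-history granules at DEBUG."""
    trigger_log.info("frame=%s trigger_candidates=%d",
                     frame_id, len(trigger_granules))
    for granule_id in trigger_granules:
        trigger_log.info("frame=%s granule=%s", frame_id, granule_id)

    # DEBUG keeps these out of the default output; they appear only when
    # the k_history logger (or a parent) is set to DEBUG.
    k_log.debug("frame=%s k_history_granules=%d", frame_id, len(k_granules))
    for granule_id in k_granules:
        k_log.debug("frame=%s k_granule=%s", frame_id, granule_id)
```

With the parent `cslc_query` logger at `INFO`, only the trigger list is written. Setting `cslc_query.k_history` to `DEBUG` adds the k granules on their own clearly labeled lines.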

philipjyoon commented 3 months ago

Steven, in the current state of the code, the lower-level CMR code prints out every granule returned from CMR. The k granules are retrieved from CMR the same way all other granules are retrieved, so they are currently being output to the logs as well, and the result looks very busy.

I've attached the current state of the logs; this run uses k=4. Let me know what you think.

philipjyoon commented 3 months ago

DISP-S1-query-log-k4.txt