NASA-IMPACT / hls-lpdaac-reconciliation

Add granule and file count details to log messages from the response handler #6

Open chuckwondo opened 2 weeks ago

chuckwondo commented 2 weeks ago

A recent reconciliation report notification from LPDAAC contained this message body:

{'HLSL30___2.0': 'Sent: 74160. Total diff: 0 = missing: 0 + failed: 0 + other: 0', 'HLSS30___2.0': 'Sent: 115773. Total diff: 42 = missing: 0 + failed: 42 + other: 0'}

Discrepencies found comparing the report at lp-prod-reconciliation/submissions/reconciliation_reports/2024305/HLS_reconcile_2024305_2.0.rpt with our database. Report available at lp-prod-reconciliation/reports/HLS_reconcile_2024305_2.0.json.

The corresponding log messages written by the response lambda were as follows:

Subject: Rec-Report HLS lp-prod HLS_reconcile_2024305_2.0.rpt
Reading report from s3://lp-prod-reconciliation/reports/HLS_reconcile_2024305_2.0.json
0 missing from HLSL30___2.0
2 missing from HLSS30___2.0
Processing summary: {'HLSL30___2.0': {}, 'HLSS30___2.0': {<Status.TRIGGERED: 'triggered'>: ['HLS.S30.T60HTG.2024303T222539.v2.0', 'HLS.S30.T60HUB.2024303T222539.v2.0']}}

The log messages do not make it clear that we have correctly handled the response, which can lead to unnecessary investigation to determine the state of affairs.

To avoid that wasted investigation time, we'd like to enhance the log messages by replacing each "X missing from ..." message with "X granule (Y file) differences in ...", where X is the number of differing granules and Y is the number of differing files.

Currently, X is the number of granules, so the 2 "missing" from S30 are really 2 differing granules, which correspond to the 42 "Total diff" reported in the notification message. The current log messages do not make that clear at all.

The proposed change would not only let us readily see whether Y matches the total diff per collection, but also eliminate confusion over the term "missing" in the log messages, which does not correspond to the missing count given in the notification message.

chuckwondo commented 2 weeks ago

The current message is printed from the first line of the function process_collection within src/hls_lpdaac_reconciliation/response/index.py.

It might make sense to change the log messages in a different function because in process_collection we no longer have the full list of files, so we cannot determine Y (as described in the initial comment).

We need access to the full report, which is available within either the handler or the process_report function.
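
As a rough illustration of the idea, here is a minimal sketch of how both counts could be computed while the full report is still in hand. The report shape used here (a mapping of collection name to a mapping of granule ID to a list of file names) and the function name difference_messages are assumptions made for the example only; the actual report JSON and the code in index.py may look quite different.

```python
from typing import Mapping, Sequence

# ASSUMED report shape, for illustration only (not the actual LPDAAC format):
#   {collection_name: {granule_id: [file_name, ...]}}
Report = Mapping[str, Mapping[str, Sequence[str]]]


def difference_messages(report: Report) -> list[str]:
    """Build one 'X granule (Y file) differences in ...' message per collection.

    Hypothetical helper: X is the number of differing granules and Y is the
    total number of differing files across those granules.
    """
    messages = []
    for collection, granules in report.items():
        n_granules = len(granules)
        n_files = sum(len(files) for files in granules.values())
        messages.append(
            f"{n_granules} granule ({n_files} file) differences in {collection}"
        )
    return messages


if __name__ == "__main__":
    # Made-up sample data; the granule IDs and file names are placeholders.
    sample = {
        "HLSL30___2.0": {},
        "HLSS30___2.0": {
            "granule-a": ["a.B01.tif", "a.B02.tif"],
            "granule-b": ["b.B01.tif"],
        },
    }
    for message in difference_messages(sample):
        print(message)
```

Computing the message where the full report is available (rather than in process_collection) keeps Y easy to derive, and it could then be compared directly against the "Total diff" value in the notification.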