Open frankinspace opened 3 months ago
@ymchenjpl discovered that the old SNS response topic was removed during deployment of bignbit 0.1.1, causing no responses to be received from GIBS.
Took corrective action in ops:
Manually created svc-pobit-podaac-ops-cumulus-gibs-response-topic in OPS AWS Console. Confirmed permissions are set that allow GIBS to publish to that topic.
Subscribed svc-bignbit-podaac-ops-cumulus-gibs-response-queue to the manually created svc-pobit-podaac-ops-cumulus-gibs-response-topic.
This should restore bignbit operations now.
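For reference, a minimal sketch of the kind of SNS access policy that lets an external account publish to the manually created topic. The account IDs, the region, and the helper name `build_publish_policy` are placeholders for illustration, not the values or tooling used in OPS, and granting access via an account-root principal is an assumption about how GIBS is authorized.

```python
import json


def build_publish_policy(topic_arn: str, publisher_account_id: str) -> dict:
    """Build an SNS access policy that lets one AWS account publish to a topic."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowExternalPublish",
                "Effect": "Allow",
                # Assumption: the publisher is identified by its account root principal.
                "Principal": {"AWS": f"arn:aws:iam::{publisher_account_id}:root"},
                "Action": "SNS:Publish",
                "Resource": topic_arn,
            }
        ],
    }


# Topic name is taken from this issue; region and account IDs are placeholders.
TOPIC_NAME = "svc-pobit-podaac-ops-cumulus-gibs-response-topic"
topic_arn = f"arn:aws:sns:us-west-2:111111111111:{TOPIC_NAME}"
policy = build_publish_policy(topic_arn, "222222222222")

print(json.dumps(policy, indent=2))
```

The printed JSON can be compared against the access policy shown for the topic in the console to check that the publish permission targets the intended topic.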
Then, next steps would be:
Update GIBS ICD to change the response topic from svc-pobit-podaac-ops-cumulus-gibs-response-topic
to the new svc-bignbit-podaac-ops-cumulus-gibs-response-topic
Once GIBS updates to the new topic and we confirm we are still receiving the responses, remove the manually created svc-pobit-podaac-ops-cumulus-gibs-response-topic
GIBS ops has reported:
"Last successful pull from our side was 7/31/24 at 06:52:26 (local time EDT). And nothing since."
Still have not received any responses from GIBS after re-establishing the correct response topic, which indicates there is another problem going on.
May need to consider rolling back bignbit update.
Plan is to roll back to big v0.3.3 and pobit v0.4.1 in UAT and retry sending an OPERA granule to GIBS UAT. If that works, we can also roll back ops; if it doesn't, we will need further debugging.
Rollback(PR) big v0.3.3 & pobit v0.4.1 https://github.jpl.nasa.gov/podaac/cumulus-deploy-tf/pull/360
PO.DAAC has re-deployed UAT venue with the v0.3.3 BIG and v0.4.1 POBIT components as a dry-run for fixing the OPS venue.
3 OPERA OPERA_L3_DSWX-HLS_V1 granules were sent to GIBS UAT:
OPERA_L3_DSWx-HLS_T01FBE_20240727T215911Z_20240803T144035Z_S2A_30_v1.0
OPERA_L3_DSWx-HLS_T01KAA_20240730T220003Z_20240804T020052Z_L8_30_v1.0
OPERA_L3_DSWx-HLS_T01KBU_20240731T221939Z_20240802T034200Z_S2B_30_v1.0
GIBS confirmed they processed the following in UAT:
OPERA_L3_DSWx-HLS_T01KBU_001063_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213
OPERA_L3_DSWx-HLS_T01KBU_002063_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213
OPERA_L3_DSWx-HLS_T01KAA_320064_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212
OPERA_L3_DSWx-HLS_T01KBU_002064_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213
OPERA_L3_DSWx-HLS_T01KAA_001064_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212
OPERA_L3_DSWx-HLS_T01FBE_319036_20240727T215911Z_20240803T144035Z_S2A_30_v1.0_BROWSE_2024209
OPERA_L3_DSWx-HLS_T01KAA_001065_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212
OPERA_L3_DSWx-HLS_T01FBE_320036_20240727T215911Z_20240803T144035Z_S2A_30_v1.0_BROWSE_2024209
OPERA_L3_DSWx-HLS_T01KAA_320065_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212
OPERA_L3_DSWx-HLS_T01KBU_001064_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213
OPERA_L3_DSWx-HLS_T01FBE_001036_20240727T215911Z_20240803T144035Z_S2A_30_v1.0_BROWSE_2024209
And PO.DAAC confirmed responses were received for the 3 granules in UAT. Will gain consensus and apply the roll-back to ops.
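The browse image names GIBS reported can be matched back to the 3 granules that were sent. The sketch below assumes, based only on the names listed in this issue, that a browse name is the granule name with an extra 6-digit segment after the tile ID and a `_BROWSE_<YYYYDDD>` suffix, and that the suffix is the acquisition date as a day of year. The regex and the helper `granule_from_browse` are illustrative, not part of bignbit.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed naming pattern, inferred from the names in this issue.
BROWSE_RE = re.compile(
    r"^(?P<prefix>OPERA_L3_DSWx-HLS_T[0-9A-Z]{5})_(?P<segment>\d{6})_"
    r"(?P<rest>.+)_BROWSE_(?P<day>\d{7})$"
)


def granule_from_browse(name: str) -> str:
    """Return the granule name a browse image belongs to, or raise ValueError."""
    m = BROWSE_RE.match(name)
    if m is None:
        raise ValueError(f"unexpected browse name: {name}")
    # The suffix should be the acquisition date expressed as year + day of year.
    acquired = datetime.strptime(m["rest"][:8], "%Y%m%d").date()
    if datetime.strptime(m["day"], "%Y%j").date() != acquired:
        raise ValueError(f"day-of-year mismatch: {name}")
    return f"{m['prefix']}_{m['rest']}"


SENT = [
    "OPERA_L3_DSWx-HLS_T01FBE_20240727T215911Z_20240803T144035Z_S2A_30_v1.0",
    "OPERA_L3_DSWx-HLS_T01KAA_20240730T220003Z_20240804T020052Z_L8_30_v1.0",
    "OPERA_L3_DSWx-HLS_T01KBU_20240731T221939Z_20240802T034200Z_S2B_30_v1.0",
]
PROCESSED = [
    "OPERA_L3_DSWx-HLS_T01KBU_001063_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213",
    "OPERA_L3_DSWx-HLS_T01KBU_002063_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213",
    "OPERA_L3_DSWx-HLS_T01KAA_320064_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212",
    "OPERA_L3_DSWx-HLS_T01KBU_002064_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213",
    "OPERA_L3_DSWx-HLS_T01KAA_001064_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212",
    "OPERA_L3_DSWx-HLS_T01FBE_319036_20240727T215911Z_20240803T144035Z_S2A_30_v1.0_BROWSE_2024209",
    "OPERA_L3_DSWx-HLS_T01KAA_001065_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212",
    "OPERA_L3_DSWx-HLS_T01FBE_320036_20240727T215911Z_20240803T144035Z_S2A_30_v1.0_BROWSE_2024209",
    "OPERA_L3_DSWx-HLS_T01KAA_320065_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212",
    "OPERA_L3_DSWx-HLS_T01KBU_001064_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213",
    "OPERA_L3_DSWx-HLS_T01FBE_001036_20240727T215911Z_20240803T144035Z_S2A_30_v1.0_BROWSE_2024209",
]

# Count processed browse images per granule and list any sent granule with none.
counts = Counter(granule_from_browse(name) for name in PROCESSED)
missing = [g for g in SENT if g not in counts]

for granule in SENT:
    print(granule, counts[granule])
print("missing:", missing)
```

Under these assumptions, every processed browse image maps to one of the 3 sent granules and no sent granule is missing.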
Roll back was applied in OPS. Confirmed success of OPERA_L3_DSWx-HLS_T01FBE_20240727T215911Z_20240803T144035Z_S2A_30_v1.0 in OPS with GIBS. The count of responses returned from GIBS increased from the 0 it was showing previously.
Deployed to ops on 8/6/24.
Issue was in GITC configuration, fix in place in UAT. Testing with bignbit 0.1.1 in 24.3 IP sprint via https://github.com/podaac/bignbit/issues/4
Starting with the release of bignbit 0.1.1 on 2024-07-31, GIBS delivery of browse images has been interrupted, resulting in no browse images being available through Worldview for this collection.