NASA-IMPACT / csdap-cumulus

SmallSat Cumulus Deployment

Ingest WV02_Pan_L1B granules into Production from MCP #349

Open jsrikish opened 3 months ago

jsrikish commented 3 months ago

Ingest granules in collection WV02_Pan_L1B to CBA Prod by discovering/ingesting from MCP account.

Acceptance criteria

To determine how many granules have been processed, first enter the Docker container:

DOTENV=.env.cba-prod make bash

In the container, run the following:

DEBUG=1 cumulus granules list -? collectionId=WV02_Pan_L1B___1 --limit=0 -? status=completed

(Note: due to a Cumulus bug, the status is sometimes not updated properly. If the completed count looks off, run the following to get the total count and the count for each status, then compare the numbers.)

DEBUG=1 cumulus granules list -? collectionId=WV02_Pan_L1B___1 --limit=0
DEBUG=1 cumulus granules list -? collectionId=WV02_Pan_L1B___1 --limit=0 -? status=queued
DEBUG=1 cumulus granules list -? collectionId=WV02_Pan_L1B___1 --limit=0 -? status=running
DEBUG=1 cumulus granules list -? collectionId=WV02_Pan_L1B___1 --limit=0 -? status=completed
DEBUG=1 cumulus granules list -? collectionId=WV02_Pan_L1B___1 --limit=0 -? status=failed

You should see output similar to the following:

...
RESPONSE: {
  statusCode: 200,
  body: '{"meta":{"name":"cumulus-api","stack":"cumulus-prod","table":"granule","limit":0,"page":1,"count":8592},"results":[]}',
  headers: {
    'x-powered-by': 'Express',
    'access-control-allow-origin': '*',
    'strict-transport-security': 'max-age=31536000; includeSubDomains',
    'content-type': 'application/json; charset=utf-8',
    'content-length': '114',
    etag: 'W/"72-O2wUXhu+Q9J1hqdDrb0fcsZeFHo"',
    date: 'Fri, 01 Dec 2023 21:29:19 GMT',
    connection: 'close'
  },
  isBase64Encoded: false
}
[]

In particular, look at the value of body and, within it, locate the value of "count" (8592 in the output above). This count should match the Earthdata Search granule count obtained in the very first step.
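For anyone who would rather not read the count by eye, here is a minimal sketch that pulls "count" out of the captured DEBUG output. It assumes the output has been saved as text and that the body is printed on one line as a single-quoted JSON string, as in the sample above. The function names `extract_count` and `unaccounted` are hypothetical helpers, not part of the cumulus CLI.

```python
import json
import re


def extract_count(debug_output: str) -> int:
    """Return meta.count from the RESPONSE body in captured DEBUG=1 output."""
    # Assumption: the body appears on a single line as  body: '{...}',
    # With --limit=0 the results list is empty, so the JSON holds no quotes
    # that the debug printer would need to escape.
    match = re.search(r"body: '(\{.*\})',?\s*$", debug_output, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no RESPONSE body found in output")
    return json.loads(match.group(1))["meta"]["count"]


def unaccounted(total: int, counts_by_status: dict) -> int:
    """Return how many granules are not covered by the per-status counts."""
    return total - sum(counts_by_status.values())


# Trimmed copy of the sample output shown above.
sample = """RESPONSE: {
  statusCode: 200,
  body: '{"meta":{"name":"cumulus-api","stack":"cumulus-prod","table":"granule","limit":0,"page":1,"count":8592},"results":[]}',
  isBase64Encoded: false
}"""

print(extract_count(sample))  # 8592
```

Running `extract_count` on the output of each per-status command and passing the results to `unaccounted` together with the unfiltered total gives a quick way to see whether the status counts add up.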

jsrikish commented 3 weeks ago

Summary of my discussion with Abdelhak, Madhu, Vishal, Brad and Helen:

- There were some SM2A restore errors, and as a result cmr.json was not created. Vishal, Brad and Abdelhak fixed those errors.
- I changed the rule to ingest just 2009 (data starts in Nov).
- Earthdata shows 4995 granules; the DB table has 5019 granules.
- Some granules were copied over to S3 before I got to CSDA (late 2021 and early 2022) without checksums being calculated, and as a result their DB entries are missing.
- Total number of objects in S3 for 2009/1B/P1BS: 46,006. About half of them (22,097) have DB entries.
- Checksum calculation with DB insertion still has to be done for the other half (23,909). These numbers are not exactly divisible by 4 to compute the number of granules, because some granules have an extra (.rename) object, which is irrelevant.
- Helen checked the metadata tags for 2009 and gave us the go-ahead to ingest the rest.
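As a quick sanity check on the figures in this comment, the arithmetic can be reproduced as follows. This is only a sketch using the numbers quoted above; the divisor of four objects per granule is taken from the comment's remark about divisibility by 4, and the variable names are made up for illustration.

```python
# Figures quoted in the comment above.
total_objects = 46_006       # objects in S3 for 2009/1B/P1BS
with_db_entries = 22_097     # objects that already have DB entries
without_db_entries = 23_909  # objects still needing checksum + DB insertion
earthdata_granules = 4_995
db_granules = 5_019

# The two halves should add up to the S3 total.
halves_match = with_db_entries + without_db_entries == total_objects

# Assumption from the comment: a granule normally consists of 4 objects,
# so a non-zero remainder points at extra objects such as .rename files.
remainders = {
    name: value % 4
    for name, value in [
        ("total", total_objects),
        ("with_db_entries", with_db_entries),
        ("without_db_entries", without_db_entries),
    ]
}

# Gap between the DB table and Earthdata granule counts.
granule_gap = db_granules - earthdata_granules

print(halves_match, remainders, granule_gap)
```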