NASA-PDS / nucleus

Nucleus is a software platform used to create workflows for the Planetary Data System (PDS).
https://nasa-pds.github.io/nucleus
Apache License 2.0

Validate and Load all PDS4 MESSENGER data products with Nucleus #54

Open jordanpadams opened 1 year ago

jordanpadams commented 1 year ago

💡 Description

tloubrieu-jpl commented 1 year ago

@ramesh-maddegoda is focusing on MSGRMDS_4001 and MESSDEM_1001, which need to be loaded into the registry first so that they can be used in ticket https://github.com/NASA-PDS/search-api-notebook/issues/24

tloubrieu-jpl commented 1 year ago

Blocked because AWS Airflow is unavailable on NGAP

tloubrieu-jpl commented 11 months ago

Unblocked now that Ramesh is working on MCP. He is now testing the ECS task called by the Nucleus workflow.

jordanpadams commented 11 months ago

Status: @ramesh-maddegoda working on improving Terraform deployments

tloubrieu-jpl commented 11 months ago

@ramesh-maddegoda is deploying everything needed on MCP, from scratch.

tloubrieu-jpl commented 8 months ago

@ramesh-maddegoda will test nucleus to validate its robustness with a bigger dataset.

ramesh-maddegoda commented 7 months ago

Some of the files in s3://asc-pds-messenger failed to copy to the PDS Nucleus staging bucket because of a permission issue.

aws s3 cp s3://asc-pds-messenger/MSGRMDS_8001/RTM/MDIS_RTM_N01/2013_228/MDIS_RTM_N01_006974_4644396_1.IMG s3://pds-nucleus-staging/messenger-data/MSGRMDS_8001/RTM/MDIS_RTM_N01/2013_228/
copy failed: s3://asc-pds-messenger/MSGRMDS_8001/RTM/MDIS_RTM_N01/2013_228/MDIS_RTM_N01_006974_4644396_1.IMG to s3://pds-nucleus-staging/messenger-data/MSGRMDS_8001/RTM/MDIS_RTM_N01/2013_228/MDIS_RTM_N01_006974_4644396_1.IMG An error occurred (AccessDenied) when calling the GetObjectTagging operation: Access Denied
tloubrieu-jpl commented 7 months ago

A new parameter makes it possible to copy all the metadata.
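The thread does not name the parameter. One candidate, offered here purely as an assumption, is the AWS CLI v2 option `--copy-props metadata-directive`, which copies object metadata but not tags, so the copy should not need `GetObjectTagging` on the source bucket. A minimal sketch of wrapping that call (the helper names `build_copy_command` and `copy_object` are hypothetical):

```python
import subprocess
from typing import List


def build_copy_command(source: str, destination: str) -> List[str]:
    """Build an `aws s3 cp` command line that copies metadata but skips tags.

    Assumption: AWS CLI v2 is installed; `--copy-props` is not available in v1.
    """
    return [
        "aws", "s3", "cp", source, destination,
        "--copy-props", "metadata-directive",
    ]


def copy_object(source: str, destination: str) -> None:
    """Run the copy and raise CalledProcessError if the CLI exits non-zero."""
    subprocess.run(build_copy_command(source, destination), check=True)
```

Whether this matches the parameter actually used in the Nucleus deployment is not confirmed by the thread.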

tloubrieu-jpl commented 7 months ago

@ramesh-maddegoda identified a bug while doing that test. The Lambda reading the DataSync report is now taking more than 15 minutes. Now there will be a single Lambda call per report.

tloubrieu-jpl commented 7 months ago

The upgrade worked on a small dataset and @ramesh-maddegoda is now testing on the MESSENGER dataset.

tloubrieu-jpl commented 6 months ago

Now Ramesh is loading data to the registry on JPL AWS. Last step for this task.

nutjob4life commented 6 months ago

20,000 processed! 8 directories ran! Found 2 errors:

tloubrieu-jpl commented 6 months ago

@ramesh-maddegoda is experimenting with SQS to send new records to the MySQL database and avoid the timeout he was experiencing with direct insertion.
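As a rough illustration of the idea (not the actual Nucleus code), the producer side could group records into SQS batches so that a separate consumer performs the database inserts at its own pace. The record fields, the queue URL, and the function names below are hypothetical. The only AWS facts relied on are that `SendMessageBatch` takes at most 10 entries per call and that each entry needs an `Id` unique within the batch.

```python
import json
from typing import Dict, Iterable, Iterator, List

SQS_MAX_BATCH = 10  # SendMessageBatch accepts at most 10 entries per call


def to_batches(records: Iterable[Dict], size: int = SQS_MAX_BATCH) -> Iterator[List[Dict]]:
    """Group records into lists of SendMessageBatch entries."""
    batch: List[Dict] = []
    for index, record in enumerate(records):
        # The running index is unique across the whole stream, hence within a batch.
        batch.append({"Id": str(index), "MessageBody": json.dumps(record, sort_keys=True)})
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


def send_records(queue_url: str, records: Iterable[Dict]) -> int:
    """Send all records to the queue and return how many SQS accepted."""
    import boto3  # imported lazily so to_batches works without AWS access

    sqs = boto3.client("sqs")
    sent = 0
    for entries in to_batches(records):
        response = sqs.send_message_batch(QueueUrl=queue_url, Entries=entries)
        sent += len(response.get("Successful", []))
    return sent
```

A consumer Lambda triggered by the queue would then insert each message into the database, so no single invocation has to handle a whole report.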

tloubrieu-jpl commented 6 months ago

SQS now mostly works, but another Lambda timed out.

tloubrieu-jpl commented 6 months ago

We now integrate the copy from S3 to EFS as a Nucleus step in the DAGs. We are dropping DataSync, which carries a risk of overlapping copies and complicates removing files from EFS.

@ramesh-maddegoda will also write a note on the wiki about a future design that does not need EFS at all.
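For readers unfamiliar with the approach, an S3-to-EFS copy step might look roughly like the sketch below, which a DAG task could call. This is an illustration under assumptions, not the Nucleus implementation: the mount point `/mnt/data` and both function names are hypothetical.

```python
import os
from pathlib import PurePosixPath


def efs_destination(key: str, efs_root: str = "/mnt/data") -> str:
    """Map an S3 object key to a path under the EFS mount, keeping the key layout."""
    parts = PurePosixPath(key).parts
    # Reject keys that would escape the mount point.
    if key.startswith("/") or ".." in parts:
        raise ValueError(f"unsafe object key: {key!r}")
    return str(PurePosixPath(efs_root, *parts))


def copy_prefix_to_efs(bucket: str, prefix: str, efs_root: str = "/mnt/data") -> int:
    """Download every object under the prefix to EFS and return the file count."""
    import boto3  # imported lazily so efs_destination works without AWS access

    s3 = boto3.client("s3")
    copied = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip zero-byte "folder" placeholder objects
            target = efs_destination(key, efs_root)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            s3.download_file(bucket, key, target)
            copied += 1
    return copied
```

Because each task copies only the prefix it is about to process, the same task (or a later one) knows exactly which files to delete from EFS afterwards.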

jordanpadams commented 5 months ago

This work has been paused as we focus on Catalina Sky Survey. Will move to B15.0 release plan to complete work.

jordanpadams commented 3 months ago

📆 05/2024 status: Delayed several sprints due to delays in https://github.com/NASA-PDS/nucleus/issues/93. This is an operations activity. No impact on build.

jordanpadams commented 2 months ago

📆 06/2024 status: Delayed several sprints due to delays in https://github.com/NASA-PDS/nucleus/issues/93. This is an operations activity. No impact on build.

jordanpadams commented 1 month ago

📆 07/2024 status: Delayed several sprints due to delays in https://github.com/NASA-PDS/nucleus/issues/93. This is an operations activity. No impact on build.

jordanpadams commented 1 week ago

📆 08/2024 status: Delayed several sprints due to delays in https://github.com/NASA-PDS/nucleus/issues/93. This is an operations activity. No impact on build. Will most likely be deferred to B15.1.