NASA-PDS / nucleus

Nucleus is a software platform used to create workflows for the Planetary Data System (PDS).
https://nasa-pds.github.io/nucleus
Apache License 2.0

Prepare product labels as batches to be processed by Nucleus #75

Closed ramesh-maddegoda closed 10 months ago

ramesh-maddegoda commented 11 months ago

Initially it was decided to present PDS products to Nucleus one at a time for processing. However, when processing larger data sets with hundreds or thousands of PDS product labels, this can launch a large number of ECS tasks and Docker containers, which is both time-consuming and resource-intensive.

We need a way to create batches of PDS product labels and present those batches to Nucleus as manifest files for processing.
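
A minimal sketch of the batching idea, for illustration only and not the Nucleus implementation. It assumes a manifest is a plain-text file with one product label path per line. The function names (`chunk`, `write_manifests`), the `batch_NNNNN.manifest` file naming, and the default batch size are all hypothetical.

```python
from pathlib import Path
from typing import Iterable, Iterator, List


def chunk(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def write_manifests(
    label_paths: Iterable[str],
    out_dir: Path,
    batch_size: int = 10,  # assumed default, adjust to the workload
) -> List[Path]:
    """Write one manifest file per batch and return the manifest paths.

    Assumption: one label path per line. Check the format that the
    downstream tools actually accept before relying on this layout.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    manifests: List[Path] = []
    # Sort so that the batch contents are reproducible between runs.
    for index, batch in enumerate(chunk(sorted(label_paths), batch_size)):
        manifest = out_dir / f"batch_{index:05d}.manifest"  # hypothetical naming
        manifest.write_text("\n".join(batch) + "\n", encoding="utf-8")
        manifests.append(manifest)
    return manifests
```

With this layout, each manifest could be handed to a single task instead of starting one task per product label, so 1,000 labels in batches of 10 would need 100 tasks rather than 1,000.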

tloubrieu-jpl commented 11 months ago

@ramesh-maddegoda now works with batches of 10 products, which works for validate and harvest on MESSENGER data.

@ramesh-maddegoda will use the manifest format supported by both validate and harvest to pass a list of products.

tloubrieu-jpl commented 10 months ago

Done, but the PR is not ready yet.

ramesh-maddegoda commented 10 months ago

The pull request is available at https://github.com/NASA-PDS/nucleus/pull/78