Open mace-space opened 11 months ago
@mace-space hello and thank you for your submission! Unfortunately, there are a number of errors which must be addressed before this can be posted for the NSSDCA. I am attaching the validation report for your review; please resubmit the updated delivery package after addressing the multiple instances of the 5 errors. Thank you!
Validation report: cassini_uvis_solarocc_beckerjarmak2023_v1.0_20231221-validate.txt
As an additional note, I've noticed that you're using an older version of Validate and highly recommend upgrading to the latest version as it has the latest features and bug-fixes. Thank you!
Thanks, @c-suh
I updated Validate to the latest version, which I'm glad to have done as it spotted bugs that the older version of Validate missed (and I have re-processed the bundle to correct those table offset byte count errors).
However, after re-running pds-deep-registry-archive, the AIP and SIP remain invalid. Looks like issue #155
The AIP and SIP labels reference an incomplete bundle LIDVID: cassini_uvis_solarocc_beckerjarmak2023::1.1, resulting in errors:
FAIL: file:/Volumes/pdsdata-admin/data_sandbox/deep_registry/test/cassini_uvis_solarocc_beckerjarmak2023_v1.1_20240201_aip_v1.0.xml
ERROR [error.label.schematron] line 27, 25: The number of colons found in lidvid_reference: (2) is inconsistent with the number expected: (5:7).
...
...
FAIL: file:/Volumes/pdsdata-admin/data_sandbox/deep_registry/test/cassini_uvis_solarocc_beckerjarmak2023_v1.1_20240201_sip_v1.0.xml
ERROR [error.label.schematron] line 77, 25: The number of colons found in lidvid_reference: (2) is inconsistent with the number expected: (5:7).
...
...
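For reference, the colon count reported in these errors can be reproduced by hand. This is a minimal sketch, assuming the `(5:7)` in the message means five to seven colons inclusive; the helper name is hypothetical and this is not how Validate itself implements the rule.

```python
def colon_count_ok(lidvid: str, low: int = 5, high: int = 7) -> bool:
    """Return True if the number of colons in the reference is within [low, high].

    The 5..7 range is an assumption read off the error message above.
    """
    return low <= lidvid.count(":") <= high


# The reference written into the AIP/SIP labels (missing the urn:nasa:pds: prefix)
incomplete = "cassini_uvis_solarocc_beckerjarmak2023::1.1"
# The full bundle LIDVID
complete = "urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.1"

print(incomplete.count(":"), colon_count_ok(incomplete))  # 2 False
print(complete.count(":"), colon_count_ok(complete))      # 5 True
```

The incomplete reference has only the two colons of the `::` separator, which matches the `(2)` in the error.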
@mace-space that is a great find on the deep archive issue! I concur and hope that the issue will be resolved soon. I will try to notify you here once it is. Thank you!
@c-suh see updated package here: Archive.zip
@jordanpadams and @mace-space this set has been posted for NSSDCA processing! From tomorrow, you can check the status at https://nssdc.gsfc.nasa.gov/psi/ReportPDS4.jsp using the SIP LID below:
SIP LID:
Thanks! @C-Suh I checked the status and SIP LIDVID: urn:nasa:pds:system_bundle:product_sip_deep_archive:cassini_uvis_solarocc_beckerjarmak2023_v1.0_20240215::1.0 failed because
Bundle located at https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023//bundle.xml does not match checksum in manifest
I think this is because I updated the bundle (to fix the issue detected by the updated version of Validate) while the pds-deep-registry-archive tool was being patched, and therefore the URL (https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023) points to v1.1 (rather than v1.0) of the bundle.
Shall I try again using the updated URL for v1.0 (https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023_v1.0/)? Then separately run pds-deep-registry-archive and the steps outlined in the PDS Delivery Checklist on v1.1 of the bundle? Is the process the same when registering another version of the bundle to the deep archive, or is there a different process to register an updated bundle?
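To illustrate why the check fails when the URL now serves v1.1, here is a minimal sketch of a manifest checksum comparison. It assumes the manifest records an MD5 hex digest (the algorithm is a parameter in case that assumption is wrong), and the label contents are placeholder bytes, not the real bundle.xml.

```python
import hashlib


def matches_manifest(content: bytes, manifest_checksum: str, algorithm: str = "md5") -> bool:
    """Compare the digest of the served file against the checksum in the manifest."""
    digest = hashlib.new(algorithm, content).hexdigest()
    return digest == manifest_checksum.strip().lower()


# Placeholder stand-ins for the two versions of bundle.xml (hypothetical content)
v1_0_label = b"<Product_Bundle>version 1.0</Product_Bundle>"
v1_1_label = b"<Product_Bundle>version 1.1</Product_Bundle>"

# Checksum recorded when the SIP for v1.0 was generated
recorded = hashlib.md5(v1_0_label).hexdigest()

print(matches_manifest(v1_0_label, recorded))  # True
print(matches_manifest(v1_1_label, recorded))  # False: the URL now serves v1.1
```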
Hi @mace-space! Please hold off on re-running the deep-registry-archive tool until a new, non-dev version is released (e.g., higher than v1.1.4).
I believe the process is the same when registering another version of the bundle. When creating this new bundle, however, be sure to increment the version in the VID wherever applicable.
To make sure I'm understanding correctly, would you confirm or correct the following bullet points? Thank you!
Thanks, @c-suh! I will hold off re-running pds-deep-registry-archive until there's a new non-dev version, and will make sure I increment the version in the VID when it comes to registering v1.1.
I'll respond to your points above in bold inline here:
collection_data.csv, collection_data.xml, bundle.xml have also been updated.
... harvest and registry-manager) and also submitted to NSSDCA (via pds-deep-registry-archive). Also, I wanted to double check whether that was what you meant or are you referring to another process of submission?
... harvest and registry-manager) but has not yet been submitted to the NSSDCA
Thanks again for all your help
@mace-space as long as the latest versions with latest paths of each bundle are loaded into the next-gen registry, you should be able to just run pds-deep-registry-archive with each of their applicable LIDVIDs, and get the 2 accurate SIP packages:
$ pds-deep-registry-archive --site PDS_RNG urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.1
$ pds-deep-registry-archive --site PDS_RNG urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0
@mace-space also, you should be able to upgrade your Deep Archive software and continue delivering SIP packages. Let us know if you run into any additional issues.
Thanks, here's the delivery for both v1.0 and v1.1 of urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023:
Delivering Node: Ring-Moon Systems
NSSDCA Delivery Package: cassini_uvis_solarocc_beckerjarmak2023_v1.0_v1.1_NSSDCA_20240228.tar.gz (for both v1.0 and v1.1)
NOTE: There were invalid urls in cassini_uvis_solarocc_beckerjarmak2023_v1.0_20240228_sip_v1.0.tab (https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023//), which I corrected to https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023_v1.0//
NOTE: As described previously, v1.0 fails validation because of table offset byte count errors that were only flagged by v3.4.1 of Validate (passed validation using older version of Validate). Would you still want v1.0 included, despite it failing validation with v3.4.1?
Let me know if you have any questions or concerns
@jordanpadams and @smclaughlin7, passing Mia's question to you:
NOTE: As described previously, v1.0 fails validation because of table offset byte count errors that were only flagged by v3.4.1 of Validate (passed validation using older version of Validate). Would you still want v1.0 included, despite it failing validation with v3.4.1?
The validation report in case it might be helpful.
In the meantime, @mace-space, the v1.1 set has been posted for NSSDCA processing! From tomorrow, you can check the status at https://nssdc.gsfc.nasa.gov/psi/ReportPDS4.jsp using the SIP LID below:
SIP LID:
NOTE: As described previously, v1.0 fails validation because of table offset byte count errors that were only flagged by v3.4.1 of Validate (passed validation using older version of Validate). Would you still want v1.0 included, despite it failing validation with v3.4.1?
@mace-space I would say yes. If the data went online, it should go to the NSSDCA to ensure provenance of the data in the archive, even if it had some issues.
Note: since posting of v1.0 is to ensure provenance of the data, I am ignoring both errors found in the node's validation report (error.table.missing_LF) and in the EN's validation report (error.label.filesize_mismatch).
@mace-space the v1.0 set has also been posted for NSSDCA processing! From tomorrow, you can check the status at https://nssdc.gsfc.nasa.gov/psi/ReportPDS4.jsp using the SIP LID below:
SIP LID:
Thanks! v1.1 is in Pre-Ingest stage (some remarks about Context_Area, context products but seems to be progressing OK).
However, v1.0 is still reporting an error:
SIP LIDVID: urn:nasa:pds:system_bundle:product_sip_deep_archive:cassini_uvis_solarocc_beckerjarmak2023_v1.0_20240228::1.0
Node: PDS_RNG
Received: 2024-03-09 Failed: 2024-03-09
Remarks: Manifest checksum calculated does not match manifest checksum in SIP.
I think I need to do a similar thing as for #490's Vgr2 NSSDCA submission and re-load the data into the registry with the correct URL (https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023_v1.0 for v1.0 of this bundle), and then re-run the deep archive software?
When I run:
curl -u username 'https://search-rms-prod-etcetcetc.us-west-2.es.amazonaws.com/registry/_search?q={_id:"urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0"}' | json_pp
it lists ops:Label_File_Info/ops:file_ref and ops:Data_File_Info/ops:file_ref with the v1.1 URL (https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023) instead of the v1.0 URL (https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023_v1.0)
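The same check can be scripted against the JSON that the query returns. This is a minimal sketch: the two field names come from the output described above, the `hits.hits[]._source` layout is the usual Elasticsearch search response shape, and the sample response and function name are hypothetical.

```python
def stale_file_refs(response: dict, expected_base: str) -> list[str]:
    """Return file_ref URLs that do not live under the expected base URL."""
    fields = (
        "ops:Label_File_Info/ops:file_ref",
        "ops:Data_File_Info/ops:file_ref",
    )
    stale = []
    for hit in response.get("hits", {}).get("hits", []):
        source = hit.get("_source", {})
        for field in fields:
            value = source.get(field, [])
            # A field may hold a single string or a list of strings
            refs = [value] if isinstance(value, str) else value
            stale.extend(r for r in refs if not r.startswith(expected_base + "/"))
    return stale


# Hypothetical, trimmed-down response for the v1.0 bundle LIDVID
base = "https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023"
sample = {
    "hits": {
        "hits": [
            {
                "_id": "urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0",
                "_source": {
                    "ops:Label_File_Info/ops:file_ref": [base + "/bundle.xml"],
                    "ops:Data_File_Info/ops:file_ref": [base + "_v1.0/readme.txt"],
                },
            }
        ]
    }
}

stale = stale_file_refs(sample, base + "_v1.0")
print(stale)  # only the label URL that still points at the unversioned directory
```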
@mace-space correct, as you so neatly recapped above and did for Vgr2 in #490. Thank you!
Please find v1.0 with corrected URL: cassini_uvis_solarocc_beckerjarmak2023_v1.0_NSSDCA_20240313.tar.gz
@jordanpadams and @c-suh: I wonder if there might be a larger issue here.
Whenever we archive what is the current version at the time, the URL includes the bundle name with no version number appended (e.g., pds4/bundles/cooldata). However, when that version is superseded, the new current version takes that same URL, while the previous version now has the same URL with its version number appended (e.g., pds4/bundles/cooldata_v1.0). This reflects how we have always managed versioning under PDS3.
Will this always require that we re-ingest any bundle at the time that it is superseded? If so, should we change our practice, so that this isn't required? Or could EN tools change so that this is no longer required? Do other nodes do things differently?
One solution might be that pds4/bundles/cooldata_v1.0 already exists even when it is the current version, and either that or pds4/bundles/cooldata is an alias pointing to the other. Please let us know what you think.
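As one possible reading of the alias idea, here is a minimal sketch that uses a symbolic link for the unversioned name. The function name and the choice of symlinks are assumptions for illustration; a web server redirect or rewrite rule would serve the same purpose.

```python
import os
import tempfile


def publish_with_alias(root: str, bundle: str, version: str) -> str:
    """Create <bundle>_v<version> and point the unversioned name at it."""
    versioned = os.path.join(root, f"{bundle}_v{version}")
    alias = os.path.join(root, bundle)
    os.makedirs(versioned, exist_ok=True)
    # Re-point the alias if an earlier version already claimed it
    if os.path.islink(alias):
        os.remove(alias)
    os.symlink(os.path.basename(versioned), alias, target_is_directory=True)
    return versioned


with tempfile.TemporaryDirectory() as root:
    publish_with_alias(root, "cooldata", "1.0")
    publish_with_alias(root, "cooldata", "2.0")
    listing = sorted(os.listdir(root))
    target = os.readlink(os.path.join(root, "cooldata"))

print(listing)  # ['cooldata', 'cooldata_v1.0', 'cooldata_v2.0']
print(target)   # cooldata_v2.0
```

With this layout the versioned path of v1.0 never changes, so nothing already loaded into the registry under that path would need to be re-ingested when v2.0 arrives.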
@matthewtiscareno a few other nodes encounter this issue as well, and there is a new requirement for the registry to provide some sort of utility to allow a node to update the data path to a file, versus requiring a reload of the products to get the correct paths. https://github.com/NASA-PDS/registry/issues/266. No matter what, it will require some sort of operational intervention to know the file paths have changed, and update the paths in the registry.
From an efficiency perspective, it would be much easier to just put the data online as pds4/bundles/cooldata_v1.0 and pds4/bundles/cooldata_v2.0 from the start, and then just load the data as the new versions come online and that is it. This would require no manual intervention or movement of files, and would decrease overhead over time. That being said, we understand that some nodes prefer "clean" archive directories that include only the latest versions of data products. So we will need to implement some sort of utility. We also hope to avoid the need to do this down the road by providing some web app using the registry to drive "directory views" of pages, so we can hide those old versions from users unless they want to see them.
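To make the "update the data path" idea concrete, here is a minimal sketch of the rewrite such a utility would need to apply to each stored URL when a version is superseded. The function name is hypothetical and this is not the design of the utility tracked in registry#266; the example URL reuses the cooldata placeholder from above.

```python
def superseded_url(url: str, bundle: str, version: str) -> str:
    """Rewrite an unversioned bundle URL to its versioned location."""
    current = f"/pds4/bundles/{bundle}/"
    archived = f"/pds4/bundles/{bundle}_v{version}/"
    # URLs that are already versioned do not contain `current`, so they pass through
    return url.replace(current, archived, 1)


old = "https://pds-rings.seti.org/pds4/bundles/cooldata/bundle.xml"
new = superseded_url(old, "cooldata", "1.0")
print(new)  # https://pds-rings.seti.org/pds4/bundles/cooldata_v1.0/bundle.xml
print(superseded_url(new, "cooldata", "1.0") == new)  # True: applying it twice is harmless
```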
Happy to talk more about this or we can discuss at the SWG on Wednesday.
@mace-space the corrected package from your comment has a validate error. Please review this report for details. Thank you!
Thanks @c-suh. Sorry to have missed this. It appears that the validate error may be due to extra slashes in the filepaths (field 2) from record 3 onwards and this is causing validate to interpret it as a null field. Do you know what might be causing the additional slash?
urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0 /bundle.xml
urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0 /readme.txt
urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023:data::2.0 //collection_data.csv
urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023:data::2.0 //collection_data.xml
urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023:data:uvis_euv_2006_257_solar_time_series_ingress::1.1 //uvis_euv_2006_257_solar_time_series_ingress.xml
...
(I had to delete a lot of whitespace between field 1 and 2 to get it to display here)
I ran the pds-deep-registry-archive tool in the same manner as for other bundles previously submitted to NSSDCA, but I'm wondering if I somehow introduced this error? I'm using v1.1.5 of pds-deep-registry-archive.
It also appears that the VIDs are wrong: 2.0 and 1.1, instead of 1.0.
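A quick way to list the affected records is sketched below. It assumes each record has two whitespace-separated fields (LIDVID, then file path), as in the excerpt above; the real file's delimiter may differ, and the function name is hypothetical.

```python
def rows_with_extra_slash(rows):
    """Return (lidvid, path) pairs whose path starts with a doubled slash."""
    flagged = []
    for row in rows:
        parts = row.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        lidvid, path = parts
        if path.startswith("//"):
            flagged.append((lidvid, path))
    return flagged


rows = [
    "urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0 /bundle.xml",
    "urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0 /readme.txt",
    "urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023:data::2.0 //collection_data.csv",
    "urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023:data::2.0 //collection_data.xml",
    "urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023:data:uvis_euv_2006_257_solar_time_series_ingress::1.1 //uvis_euv_2006_257_solar_time_series_ingress.xml",
]

flagged = rows_with_extra_slash(rows)
for lidvid, path in flagged:
    print(lidvid, path)
```

On the five records quoted above this flags the last three, consistent with the error starting at record 3.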
@mace-space apologies here. This is another bug in our software. We are investigating and will get back to you here.
$ pds-deep-registry-archive -s PDS_RNG urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.1
$ pds-deep-registry-archive --site PDS_RNG urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0
Thanks for looking into this
Blocked by https://github.com/NASA-PDS/deep-archive/issues/164, which is blocked by https://github.com/NASA-PDS/registry/issues/185
@jordanpadams: Does this mean that we should simply stand by until EN resolves these issues?
@matthewtiscareno unfortunately yes. The fix is in work, but until we have a working API up and running, end users can’t really run pds-deep-registry-archive
@c-suh here is an updated SIP package: cassini_package.zip
@mace-space the 2 sets provided by Jordan have been posted for NSSDCA processing! From tomorrow, you can check the status at https://nssdc.gsfc.nasa.gov/psi/ReportPDS4.jsp using the SIP LIDs below:
SIP LIDs: urn:nasa:pds:system_bundle:product_sip_deep_archive:cassini_uvis_solarocc_beckerjarmak2023_v1.0_20240903 urn:nasa:pds:system_bundle:product_sip_deep_archive:cassini_uvis_solarocc_beckerjarmak2023_v1.1_20240903
Discipline Node Information
Delivering Node: Ring-Moon Systems
NSSDCA Delivery Package: cassini_uvis_solarocc_beckerjarmak2023_v1.0_NSSDCA_20231221.tar.gz
Validation report: cassini_uvis_solarocc_beckerjarmak2023_validate.log
Engineering Node Process
See the internal EN process at https://pds-engineering.jpl.nasa.gov/content/nssdca_interface_process