NASA-PDS / operations

Tickets for the PDSEN Operations Team

NSSDCA Delivery: Apollo & Misc. Bundles #60

Closed ARDWhite closed 3 years ago

ARDWhite commented 3 years ago

Overview

This delivery consists of a total of 37 bundle products for Apollo, Kaguya, Magellan, MER, Messenger, Phoenix, Pioneer, and laboratory data sets.

It was agreed with Catherine Suh via personal communication that the deliveries could be merged in a single issue.

Discipline Node Information

Delivering Node: Geosciences Node

NSSDCA Delivery Package:

a12side_ccig_raw_arcsav_v1.0_20210218.zip a12sws_raw_arcsav_v1.0_20210218.zip a15_17_hfe_concatenated_v1.0_20210218.zip a15hfe_calibrated_arcsav_v1.0_20210218_aip.zip a15hfe_raw_arcsav_v1.0_20210218.zip a15oms_v1.0_20210218.zip a15photosupportdata_v1.0_20210218.zip a15side_ccig_raw_arcsav_v1.0_20210218.zip a16lsm_raw_arcsav_v1.0_20210218.zip a16oms_v1.0_20210218.zip a16photosupportdata_v1.0_20210218.zip a17fuvs_v1.0_20210218.zip a17hfe_calibrated_arcsav_v1.0_20210218.zip a17hfe_raw_arcsav_v1.0_20210218.zip a17leam_raw_arcsav_v1.0_20210218.zip a17leam_raw_worktape_v1.0_20210218.zip a17leamcal_v1.0_20210218.zip a17lsg_raw_arcsav_v1.0_20210218.zip a17photosupportdata_v1.1_20210219.zip a17sep_v1.0_20210219.zip apollo_seismic_event_catalog_v1.0_20210219.zip apollodoc_v1.0_20210219.zip asurpif_photos_amboycrater_v1.0_20210219.zip carbonate_refractive_indices_v1.0_20210219.zip kaguya_grs_spectra_v1.1_20210218.zip lab_shocked_feldspars_v2.0_20210219.zip magellan_stereo_topography_v1.0_20210218.zip mer_cs_target_list_v1.1_20210219.zip mer_documentation_v1.0_20210218.zip mer_pancam_photometry_v2.0_20210218.zip phx_tega_derived_v1.1_20210219.zip pioneer89cdd_v1.1_20210219.zip pioneerdoc_v1.0_20210219.zip relab_v2.0_20210219.zip ruff_pdart14_mtes_v1.0_20210219.zip trang2017_mercury_space_weathering_v1.0_20210218.zip trang2020_moon_space_weathering_v1.0_20210218.zip


Engineering Node Process

See the internal EN process at https://pds-engineering.jpl.nasa.gov/content/nssdca_interface_process

c-suh commented 3 years ago

@ARDWhite, I've noticed that a17hfe_raw_arcsav_v1.0_20210208.zip contains an AIP xml file of the same name but the rest of the contents (AIP xml, SIP tab and xml, checksum and transfer manifests) are labeled as calibrated. How would you like this handled?

c-suh commented 3 years ago

@ARDWhite 22 of the 23 bundles have been submitted to the NSSDCA and are currently in processing.

Using the SIP LID(s) below, you can check the status of this submission at https://nssdc.gsfc.nasa.gov/psi/ReportPDS4.jsp.

SIP LID(s): urn:nasa:pds:system_bundle:product_sip_deep_archive:a15hfe_calibrated_arcsav_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15hfe_raw_arcsav_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15oms_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15photosupportdata_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15side_ccig_raw_arcsav_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:a16lsm_raw_arcsav_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:a16oms_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:a16photosupportdata_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17fuvs_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17lsg_raw_arcsav_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17photosupportdata_v1.1_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17sep_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:apollodoc_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:duxbury_pdart14_mariner69_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:kaguya_grs_spectra_v1.1_20210205 urn:nasa:pds:system_bundle:product_sip_deep_archive:lab_shocked_feldspars_v2.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:magellan_stereo_topography_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:phx_tega_derived_v1.1_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:pioneer89cdd_v1.1_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:pioneerdoc_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:trang2017_mercury_space_weathering_v1.0_20210208 urn:nasa:pds:system_bundle:product_sip_deep_archive:trang2020_moon_space_weathering_v1.0_20210208

ARDWhite commented 3 years ago

@c-suh Thank you for catching this. There were meant to be two bundles for A17 HFE raw and A17 HFE calibrated data, but it seems the files got mixed up together. Attached are the two distinct bundles in their proper forms:

a17hfe_calibrated_arcsav_v1.0_20210208.zip

a17hfe_raw_arcsav_v1.0_20210208.zip

smclaughlin7 commented 3 years ago

Hi Andrew, PDS Operator:

The NSSDCA's front-end system rejected all 25 SIPs submitted earlier this week because URLs in the SIP manifests (sip.tab) are malformed. In particular, the last subdirectory in the file path is repeated, which causes a File Not Found error. For example, the first record in https://pds.nasa.gov/data/pds4/manifests/2021/a12side_ccig_raw_arcsav_v1.0_20210208_sip_v1.0.tab specifies this malformed URL:

https://pds-geosciences.wustl.edu/lunar/urn-nasa-pds-a12side_ccig_raw_arcsav/urn-nasa-pds-a12side_ccig_raw_arcsav/bundle.xml
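As a rough illustration of the malformation described above, the following Python sketch flags a URL whose last directory is repeated. The function name `has_repeated_last_directory` is invented for this example and is not part of any PDS or NSSDCA tool. It assumes the final path segment is a file name.

```python
from urllib.parse import urlsplit


def has_repeated_last_directory(url: str) -> bool:
    """Return True if the last two directory segments of the URL path are identical."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    directories = segments[:-1]  # assumption: the last segment is a file name
    return len(directories) >= 2 and directories[-1] == directories[-2]


# URL quoted in the comment above (repeated subdirectory)
bad = (
    "https://pds-geosciences.wustl.edu/lunar/"
    "urn-nasa-pds-a12side_ccig_raw_arcsav/"
    "urn-nasa-pds-a12side_ccig_raw_arcsav/bundle.xml"
)
# Same URL with the repeated subdirectory removed
good = (
    "https://pds-geosciences.wustl.edu/lunar/"
    "urn-nasa-pds-a12side_ccig_raw_arcsav/bundle.xml"
)

print(has_repeated_last_directory(bad))
print(has_repeated_last_directory(good))
```

A check like this could be run over every URL in a SIP manifest before posting, though it would also flag archives that legitimately nest a directory inside one of the same name.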

I listed the failed SIP LIDVIDs below. Please regenerate a new version of the SIP product and post it in the repository. Our (NSSDCA) automated SIP Submitter will scan the repository at 11:59 pm PST, detecting new SIPs and entering them into our submission queue. (You no longer need to use our Planetary Submission Interface, https://nssdc.gsfc.nasa.gov/psi/index.jsp, to send SIPs to us!)

Thanks!

urn:nasa:pds:system_bundle:product_sip_deep_archive:a12side_ccig_raw_arcsav_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:trang2020_moon_space_weathering_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:trang2017_mercury_space_weathering_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:pioneerdoc_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:pioneer89cdd_v1.1_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:phx_tega_derived_v1.1_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:magellan_stereo_topography_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:lab_shocked_feldspars_v2.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:kaguya_grs_spectra_v1.1_20210205::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:duxbury_pdart14_mariner69_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:apollodoc_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17sep_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17photosupportdata_v1.1_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17lsg_raw_arcsav_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17fuvs_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a16photosupportdata_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a16oms_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a16lsm_raw_arcsav_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15_17_hfe_concatenated_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15side_ccig_raw_arcsav_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15photosupportdata_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15oms_v1.0_20210208::1.0 
urn:nasa:pds:system_bundle:product_sip_deep_archive:a15hfe_raw_arcsav_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15hfe_calibrated_arcsav_v1.0_20210208::1.0 urn:nasa:pds:system_bundle:product_sip_deep_archive:a12sws_raw_arcsav_v1.0_20210208::1.0

smclaughlin7 commented 3 years ago

Hi Andrew, Yesterday Jennifer Ward told me this SIP/bundle is being revised and should not have been submitted to the NSSDCA (it failed our front-end checks, so no worries): urn:nasa:pds:system_bundle:product_sip_deep_archive:duxbury_pdart14_mariner69_v1.0_20210208::1.0. Perhaps coordinate the submission of this bundle with her? Thanks!

ARDWhite commented 3 years ago

@smclaughlin7 Yes, the Duxbury dataset was taken off the list and should not be processed. Jenn and I are now working to correct some miscommunications on our end. @c-suh Thank you for alerting me. I will regenerate the SIPs promptly. My apologies for the confusion.

smclaughlin7 commented 3 years ago

Great! Thanks Andrew, Catherine.

ARDWhite commented 3 years ago

@c-suh @smclaughlin7 After consulting with Jenn, I have compiled the full list of 37 bundles, with repaired URLs.

I have edited the original post in this issue to include the proper bundles. Please inform me if a different method is preferable to place them online.

Thank you!

c-suh commented 3 years ago

@ARDWhite, sorry for the delay! I will hopefully have these posted today but will let you know if any problems surface.

Confirming that the AIPs/SIPs from https://github.com/NASA-PDS/pdsen-operations/issues/57, https://github.com/NASA-PDS/pdsen-operations/issues/56, and https://github.com/NASA-PDS/pdsen-operations/issues/55 have been folded into here?

c-suh commented 3 years ago

@ARDWhite, 37 sets have been submitted to the NSSDCA and are currently in processing.

Using the SIP LID(s) below, you can check the status of this submission at https://nssdc.gsfc.nasa.gov/psi/ReportPDS4.jsp.

SIP LID(s): urn:nasa:pds:system_bundle:product_sip_deep_archive:a12side_ccig_raw_arcsav_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a12sws_raw_arcsav_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15_17_hfe_concatenated_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15hfe_calibrated_arcsav_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15hfe_raw_arcsav_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15oms_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15photosupportdata_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a15side_ccig_raw_arcsav_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a16lsm_raw_arcsav_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a16oms_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a16photosupportdata_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17fuvs_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17hfe_calibrated_arcsav_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17hfe_raw_arcsav_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17leam_raw_arcsav_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17leam_raw_worktape_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17leamcal_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17lsg_raw_arcsav_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17photosupportdata_v1.1_20210219 urn:nasa:pds:system_bundle:product_sip_deep_archive:a17sep_v1.0_20210219 urn:nasa:pds:system_bundle:product_sip_deep_archive:apollo_seismic_event_catalog_v1.0_20210219 urn:nasa:pds:system_bundle:product_sip_deep_archive:apollodoc_v1.0_20210219 urn:nasa:pds:system_bundle:product_sip_deep_archive:asurpif_photos_amboycrater_v1.0_20210219 
urn:nasa:pds:system_bundle:product_sip_deep_archive:carbonate_refractive_indices_v1.0_20210219 urn:nasa:pds:system_bundle:product_sip_deep_archive:kaguya_grs_spectra_v1.1_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:lab_shocked_feldspars_v2.0_20210219 urn:nasa:pds:system_bundle:product_sip_deep_archive:magellan_stereo_topography_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:mer_cs_target_list_v1.1_20210219 urn:nasa:pds:system_bundle:product_sip_deep_archive:mer_documentation_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:mer_pancam_photometry_v2.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:phx_tega_derived_v1.1_20210219 urn:nasa:pds:system_bundle:product_sip_deep_archive:pioneer89cdd_v1.1_20210219 urn:nasa:pds:system_bundle:product_sip_deep_archive:pioneerdoc_v1.0_20210219 urn:nasa:pds:system_bundle:product_sip_deep_archive:relab_v2.0_20210219 urn:nasa:pds:system_bundle:product_sip_deep_archive:ruff_pdart14_mtes_v1.0_20210219 urn:nasa:pds:system_bundle:product_sip_deep_archive:trang2017_mercury_space_weathering_v1.0_20210218 urn:nasa:pds:system_bundle:product_sip_deep_archive:trang2020_moon_space_weathering_v1.0_20210218

smclaughlin7 commented 3 years ago

@jordanpadams @c-suh @ARDWhite @elawsgh

This SIP failed the NSSDCA's front-end processing because the bundle product contains smart quotes: urn:nasa:pds:system_bundle:product_sip_deep_archive:kaguya_grs_spectra_v1.1_20210218.

Details: In bundle product https://pds-geosciences.wustl.edu/lunar/urn-nasa-pds-kaguya_grs_spectra/bundle_kaguya_derived.xml, the element Product_Bundle.Identification_Area.Citation_Information.description contains the smart-quoted string “Kaguya”, but the PDS schema restricts the data type to UTF8_Text_Preserved, which I believe only handles straight double quotes (""). Of course this is my simple interpretation, so could someone at EN who better understands the standards, the schema, and UTF-8 encoding please check this? Much appreciated!

Also Validation Tool (v1.24.0, 2020-09-08) does not appear to check that the description element contains only UTF-8 characters. Should it?

Possible Resolution: If PDS4 standards do not allow smart quotes in the description element, then GEO should replace those characters with simple straight double quotes, then regenerate a new SIP for the Kaguya bundle and resubmit that SIP to us.
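For anyone wanting to locate such characters before regenerating a SIP, here is a minimal Python sketch. It only finds and replaces curly quotes in a string; it does not settle whether PDS4 permits them, since these characters can themselves be encoded in UTF-8. The helper names and the sample sentence are illustrative assumptions, not taken from the bundle label.

```python
# Map of curly ("smart") quotes to their straight equivalents
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}


def find_smart_quotes(text: str) -> list[tuple[int, str]]:
    """Return (index, character) pairs for every curly quote in the text."""
    return [(i, ch) for i, ch in enumerate(text) if ch in SMART_QUOTES]


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with straight ones, leaving everything else untouched."""
    return text.translate(str.maketrans(SMART_QUOTES))


# Hypothetical description text, used only to exercise the helpers
sample = "Derived \u201cKaguya\u201d spectra"
print(find_smart_quotes(sample))
print(straighten_quotes(sample))
```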

smclaughlin7 commented 3 years ago

@jordanpadams @c-suh @ARDWhite @elawsgh

Regarding the Possible Solution for SIP urn:nasa:pds:system_bundle:product_sip_deep_archive:kaguya_grs_spectra_v1.1_20210218, when regenerating a new SIP, please use "https" instead of "http" for the file URLs specified in the SIP manifest table. Thanks!

ARDWhite commented 3 years ago

@smclaughlin7

I will regenerate a new SIP for the Kaguya bundle with the changes to URLs and characters, and upload here.

smclaughlin7 commented 3 years ago

@ARDWhite @c-suh @jordanpadams

Thanks! Andrew could you please regenerate all 37 SIPs for this delivery, changing all file URLs in the SIP manifest tables (sip.tab) from "http" instead of "https" and incrementing the version ID (VID) of the SIP product? (The Kaguya bundle is included in this set.)

All 37 SIPs failed our ingest because of checksum mismatches between the files downloaded via http and the values specified in the SIP manifest tables, which appear to contain the correct checksums.

Hopefully this will be the last time you need to remake those SIPs!

ARDWhite commented 3 years ago

@smclaughlin7 Will do.

smclaughlin7 commented 3 years ago

@ARDWhite

Thanks! Oops - I had a typo in my comment from 4 hours ago. The URLs in the SIP manifest tables need to be changed from "http" to "https".

ARDWhite commented 3 years ago

@smclaughlin7

Thank you for confirming. I am making those changes now.

Does the VID need to be incremented in every instance in the SIP XMLs, i.e., under logical_identifier and/or Modification_Detail?

smclaughlin7 commented 3 years ago

@ARDWhite @c-suh

Ah, I see that you've been changing the logical_identifier (LID) in the SIP XMLs when regenerating new ones, e.g.:

urn:nasa:pds:system_bundle:product_sip_deep_archive:a12side_ccig_raw_arcsav_v1.0_20210208 and urn:nasa:pds:system_bundle:product_sip_deep_archive:a12side_ccig_raw_arcsav_v1.0_20210218

That is acceptable because it makes the SIP's LIDVID unique, which is what the NSSDCA (and PDS4) requires.

The SIP is a PDS4 product, and every PDS4 product must have a unique product identifier, aka LIDVID. Therefore every SIP must have a unique LIDVID (a combination of logical_identifier and version_id). When submitting a revised SIP to the NSSDCA, we require the SIP's LIDVID -- that is, logical_identifier + version_id -- to be unique. So changing the date string in the SIP XML's logical_identifier would result in a new, unique SIP LIDVID. (You can keep the VID at 1.0 as long as the LID changes.)
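The uniqueness rule can be sketched in a few lines of Python. This is an illustration only: `make_lidvid` and `register` are hypothetical names, and the `::` separator is taken from the LIDVIDs listed earlier in this thread.

```python
def make_lidvid(lid: str, vid: str) -> str:
    """Combine a logical identifier and a version ID into a LIDVID."""
    return f"{lid}::{vid}"


def register(lidvid: str, seen: set[str]) -> bool:
    """Record a LIDVID; return False if it was already recorded."""
    if lidvid in seen:
        return False
    seen.add(lidvid)
    return True


base = "urn:nasa:pds:system_bundle:product_sip_deep_archive:a12side_ccig_raw_arcsav_v1.0_"
seen: set[str] = set()

first = make_lidvid(base + "20210208", "1.0")        # original submission
new_date = make_lidvid(base + "20210218", "1.0")     # new date string, same VID
new_version = make_lidvid(base + "20210218", "2.0")  # same date string, new VID

for lidvid in (first, new_date, new_version, new_version):
    print(lidvid, register(lidvid, seen))
```

Changing either the date string in the LID or the VID produces a LIDVID that has not been recorded, while resubmitting an unchanged LIDVID does not.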

The NSSDCA does not look at the version_id in Modification_History.Modification_Detail. Modification_History is for PDS's informational purposes only.

Let me know if you have more questions, and I'll try to help.

ARDWhite commented 3 years ago

@smclaughlin7 @c-suh

Thank you for the detailed explanation. I have changed "http" to "https" in the SIP TABs and version_ID to 2.0 in the SIP XMLs for each of these bundles, and repackaged them with the addendum "_v2.0" to distinguish them.

Please let me know if there is anything still missing.

pioneer89cdd_v1.1_20210219_v2.0.zip pioneerdoc_v1.0_20210219_v2.0.zip relab_v2.0_20210219_v2.0.zip ruff_pdart14_mtes_v1.0_20210219_v2.0.zip trang2017_mercury_space_weathering_v1.0_20210218_v2.0.zip trang2020_moon_space_weathering_v1.0_20210218_v2.0.zip a12side_ccig_raw_arcsav_v1.0_20210218_v2.0.zip a12sws_raw_arcsav_v1.0_20210218_v2.0.zip a15_17_hfe_concatenated_v1.0_20210218_v2.0.zip a15hfe_calibrated_arcsav_v1.0_20210218_v2.0.zip a15hfe_raw_arcsav_v1.0_20210218_v2.0.zip a15oms_v1.0_20210218_v2.0.zip a15photosupportdata_v1.0_20210218_v2.0.zip a15side_ccig_raw_arcsav_v1.0_20210218_v2.0.zip a16lsm_raw_arcsav_v1.0_20210218_v2.0.zip a16oms_v1.0_20210218_v2.0.zip a16photosupportdata_v1.0_20210218_v2.0.zip a17fuvs_v1.0_20210218_v2.0.zip a17hfe_calibrated_arcsav_v1.0_20210218_v2.0.zip a17hfe_raw_arcsav_v1.0_20210218_v2.0.zip a17leam_raw_arcsav_v1.0_20210218_v2.0.zip a17leam_raw_worktape_v1.0_20210218_v2.0.zip a17leamcal_v1.0_20210218_v2.0.zip a17lsg_raw_arcsav_v1.0_20210218_v2.0.zip a17photosupportdata_v1.1_20210219_v2.0.zip a17sep_v1.0_20210219_v2.0.zip apollo_seismic_event_catalog_v1.0_20210219_v2.0.zip apollodoc_v1.0_20210219_v2.0.zip asurpif_photos_amboycrater_v1.0_20210219_v2.0.zip carbonate_refractive_indices_v1.0_20210219_v2.0.zip kaguya_grs_spectra_v1.1_20210218_v2.0.zip lab_shocked_feldspars_v2.0_20210219_v2.0.zip magellan_stereo_topography_v1.0_20210218_v2.0.zip mer_cs_target_list_v1.1_20210219_v2.0.zip mer_documentation_v1.0_20210218_v2.0.zip mer_pancam_photometry_v2.0_20210218_v2.0.zip phx_tega_derived_v1.1_20210219_v2.0.zip

smclaughlin7 commented 3 years ago

@ARDWhite @c-suh Thank you for remaking all 37 SIPs.

Catherine: I told Andrew it's OK to increment the SIP's version ID (VID) while keeping the same timestamp string in its logical identifier (LID). I forgot to ask you first if EN recommended a specific formation convention for a SIP's LID+VID and timestamp usage. I hope I did not give Andrew incorrect info!

c-suh commented 3 years ago

Hi @smclaughlin7. I think as long as the LIDVID is unique, this is fine. However, I don't know if this is preferred and would defer to @jordanpadams on this. I do not know what would warrant a bump in the version, and any change to a part of the LIDVID would make it uniquely identifiable, so a different timestamp (of creation?) might suffice.

jordanpadams commented 3 years ago

@smclaughlin7 @c-suh the LIDVID formation is not relevant for us. the only reason we include a timestamp in the LID is so that we always ensure unique identifiers when running pds-deep-archive instead of trying to update everything manually.

smclaughlin7 commented 3 years ago

@jordanpadams @c-suh Good to know the LIDVID formation is not relevant for PDS but that you include a timestamp (and the SIP VID "v2.0" as @ARDWhite did) to ensure unique identifiers when running pds-deep-archive. Thanks,

c-suh commented 3 years ago

@ARDWhite @jordanpadams @smclaughlin7 if you compare the original zip filenames to this latest batch, it appears that the version number has not been incrementally increased but rather that a new version number has been appended. i.e.

The files inside the zips do not have this second version number nor have the original version numbers been altered in any way. I did not catch this until after posting the files in https://pds.nasa.gov/data/pds4/manifests/2021/; consequently, the existing 37 AIP/SIP sets have been replaced instead of a new batch having been added. How should this be handled?

jordanpadams commented 3 years ago

@c-suh we can take care of this internally.

@ARDWhite in the future, I recommend re-running pds-deep-archive instead of manually updating labels, wherever possible. There are lots of little things that the software takes into account that can cause issues down the road.

smclaughlin7 commented 3 years ago

@ARDWhite Definitely re-run pds-deep-archive instead of manually updating labels whenever possible as @jordanpadams recommends.

One of the little things the software takes care of, for example, is recomputing the checksum for the SIP manifest table and inserting that value in the SIP XML. The NSSDCA compares that checksum to the one it computes upon downloading the SIP manifest table. We reject the SIP if there's a mismatch.
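A short Python sketch of that comparison shows why a hand-edited manifest fails it. It assumes the checksum is MD5, which this thread does not state, and the manifest row is invented for the example; the function names are hypothetical.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the data as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()


def manifest_matches_label(manifest_bytes: bytes, label_checksum: str) -> bool:
    """Compare a freshly computed manifest checksum with the value stored in the label."""
    return md5_hex(manifest_bytes) == label_checksum.strip().lower()


# Hypothetical one-row manifest; the column layout is an assumption
original = b"0123abcd\tMD5\thttp://example.invalid/bundle/bundle.xml\turn:nasa:pds:example::1.0\n"
label_checksum = md5_hex(original)  # value recorded when the label was generated

# Editing the URLs by hand changes the file contents, and therefore its checksum
edited = original.replace(b"http://", b"https://")

print(manifest_matches_label(original, label_checksum))
print(manifest_matches_label(edited, label_checksum))
```

Unless the stored checksum is recomputed after the edit, the second comparison fails, which is the kind of mismatch that leads to a rejected SIP.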

smclaughlin7 commented 3 years ago

@ARDWhite @c-suh The NSSDCA rejected all 37 SIPs submitted yesterday because the checksum in the SIP xml was not changed. That is the checksum for the SIP manifest table, which definitely changed because the URLs in that file were changed from http to https. Also, we noticed the filenames for the revised SIP product include the old string for v1.0 instead of v2.0. As @jordanpadams recommended, it's best to re-run the pds-deep-archive tool to make all new SIP/AIP products for the 37 bundles. You might want to revert to SIP/AIP version (VID) 1.0 and use a timestamp of today's date in the SIP/AIP LID for the re-run. Just a suggestion, though.

ARDWhite commented 3 years ago

@ARDWhite in the future, I recommend re-running pds-deep-archive instead of manually updating labels, wherever possible. There are lots of little things that the software takes into account that can cause issues down the road.

That was my original inclination, so I was surprised by the suggestion of alternatives. I will reassemble the 37 bundles with pds-deep-archive, and check that they comply with the additional specifications noted in this thread. They should be ready later this week.

smclaughlin7 commented 3 years ago

@ARDWhite Great! Thanks for the update and your patience. @jordanpadams @c-suh

c-suh commented 3 years ago

@smclaughlin7, in my zeal to apply this new step of updating files' modify dates per comment https://github.com/NASA-PDS/pdsen-operations/issues/71#issuecomment-795365433, I did so to the above AIP/SIP sets, and I assume that your automator will attempt to process these again. Apologies, and thank you for your patience as well!

smclaughlin7 commented 3 years ago

@c-suh No worries. You're in luck! Our automator did not pull over the re-posted SIPs because they still had the same LIDVIDs (VIDs were still 2.0) as the previous submission. Our front-end records new SIP LIDVIDs. When it sees the same LIDVID again, it will not pull over that SIP.

ARDWhite commented 3 years ago

@c-suh @smclaughlin7

I am assembling the remade 37 data bundles to be compliant with everything listed over the past month. I'm very sorry for the delays and thankful for your patient explanations.

I have made sure each SIP TAB uses "https:" in its URLs. However, there are some instances of "http:" URLs in files that are not SIP TABs. Just so I am absolutely clear: should these also be changed to "https:" URLs, or should they remain as "http:" URLs where they are written?

smclaughlin7 commented 3 years ago

@ARDWhite @c-suh Catherine should confirm, but my understanding is that the two other TAB files that the Deep Archive Tool generates, CHECKSUM_MANIFEST and TRANSFER_MANIFEST, should only use the path+filename (never the URL) for each file listed.

The NSSDCA only uses the SIP TAB and SIP XML files for a delivery. We need the SIP TAB to specify the URL to each file in order to download it. And we need those URLs to use "https:" (just like the online archive at Geosciences). Otherwise, when we download a file over "http:" and compute its checksum on the fly, the result does not match the checksum given in the SIP TAB.

The other files that the Deep Archive tool generates, such as AIP TAB/XML, are used by PDS for internal purposes.
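A quick way to confirm that a SIP TAB contains only "https:" URLs is sketched below in Python. The tab-separated column layout in the sample (checksum, checksum type, URL, LIDVID) is an assumption made for this example, and `non_https_rows` is a hypothetical helper rather than part of the Deep Archive tool.

```python
import csv
import io


def non_https_rows(tab_text: str) -> list[int]:
    """Return 1-based row numbers whose fields include a plain http:// URL."""
    flagged = []
    reader = csv.reader(io.StringIO(tab_text), delimiter="\t")
    for row_number, fields in enumerate(reader, start=1):
        if any(field.startswith("http://") for field in fields):
            flagged.append(row_number)
    return flagged


# Hypothetical two-row manifest; the values are placeholders
sample_tab = (
    "0123abcd\tMD5\thttp://example.invalid/bundle/bundle.xml\turn:nasa:pds:example::1.0\n"
    "4567ef01\tMD5\thttps://example.invalid/bundle/readme.txt\turn:nasa:pds:example:readme::1.0\n"
)

print(non_https_rows(sample_tab))
```

The check only reports rows; regenerating the SIP with the tool remains the way to fix them, since the label checksum depends on the manifest contents.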

jordanpadams commented 3 years ago

@ARDWhite I hate to be a pain here, but to fix all these issues, would you mind re-running the software on the appropriate bundles instead of trying to manually update these products? we went through a lot of this headache of all the breadcrumbs you have to follow when something changes. sorry again, but hopefully that will make this next iteration straightforward. per the https notes, the only thing that needs to update here is the input to the -b flag should include a URL that starts with HTTPS, e.g. https://pds-geosciences.wustl.edu/path/to/bundle

  -b BUNDLE_BASE_URL, --bundle-base-url BUNDLE_BASE_URL
                        Base URL for Node data archive. This URL will be prepended to the bundle
                        directory to form URLs to the products. For example, if we are generating
                        a SIP for mission_bundle/LADEE_Bundle_1101.xml, and bundle-base-url is
                        https://atmos.nmsu.edu/PDS/data/PDS4/LADEE/, the URL in the SIP will be ht
                        tps://atmos.nmsu.edu/PDS/data/PDS4/LADEE/mission_bundle/LADEE_Bundle_1101.
                        xml.
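To make the prepending described in the help text concrete, here is a small Python sketch. It is not the pds-deep-archive implementation; `product_url` is a hypothetical helper, and the first pair of inputs is the one quoted in the help text.

```python
def product_url(bundle_base_url: str, relative_path: str) -> str:
    """Join a base URL and a bundle-relative path with exactly one slash between them."""
    return bundle_base_url.rstrip("/") + "/" + relative_path.lstrip("/")


# Inputs from the help text: base URL plus the bundle-relative file path
url = product_url(
    "https://atmos.nmsu.edu/PDS/data/PDS4/LADEE/",
    "mission_bundle/LADEE_Bundle_1101.xml",
)
print(url)

# Hypothetical case: the base URL already ends with the bundle directory,
# so prepending it repeats that directory in the result
doubled = product_url(
    "https://example.invalid/lunar/my_bundle/",
    "my_bundle/bundle.xml",
)
print(doubled)
```

The second call shows one way a repeated directory could arise from the inputs alone. Whether that was the cause of the malformed URLs reported earlier in this thread is not confirmed here.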

@smclaughlin7 can we maybe iterate over this via email? we can update our docs to request nodes use HTTPS, but this issue between http and https is fairly common nowadays in software, so I feel like we should look into ways NSSDCA could update their ingest software to be robust in their handling of this use case.

jordanpadams commented 3 years ago

This SIP failed the NSSDCA's front-end processing because the bundle product contains smart quotes: urn:nasa:pds:system_bundle:product_sip_deep_archive:kaguya_grs_spectra_v1.1_20210218.

Details: In bundle product https://pds-geosciences.wustl.edu/lunar/urn-nasa-pds-kaguya_grs_spectra/bundle_kaguya_derived.xml, the element Product_Bundle.Identification_Area.Citation_Information.description contains the smart-quoted string “Kaguya”, but the PDS schema restricts the data type to UTF8_Text_Preserved, which I believe only handles straight double quotes (""). Of course this is my simple interpretation, so could someone at EN who better understands the standards, the schema, and UTF-8 encoding please check this? Much appreciated!

Also Validation Tool (v1.24.0, 2020-09-08) does not appear to check that the description element contains only UTF-8 characters. Should it?

Possible Resolution: If PDS4 standards do not allow smart quotes in the description element, then GEO should replace those characters with simple straight double quotes, then regenerate a new SIP for the Kaguya bundle and resubmit that SIP to us.

@smclaughlin7 @ARDWhite do we have a copy of the original version of https://pds-geosciences.wustl.edu/lunar/urn-nasa-pds-kaguya_grs_spectra/bundle_kaguya_derived.xml that had an invalid character in it? we are having trouble replicating this issue.

smclaughlin7 commented 3 years ago

@jordanpadams @ARDWhite I squirreled away a copy of the original version with smart double-quotes around the word 'Kaguya' on line 22. Here you go: bundle_kaguya_derived_HasSmartQuotes.xml.txt (I renamed the file; that's it.)

ARDWhite commented 3 years ago

@ARDWhite I hate to be a pain here, but to fix all these issues, would you mind re-running the software on the appropriate bundles instead of trying to manually update these products? we went through a lot of this headache of all the breadcrumbs you have to follow when something changes. sorry again, but hopefully that will make this next iteration straightforward. per the https notes, the only thing that needs to update here is the input to the -b flag should include a URL that starts with HTTPS, e.g. https://pds-geosciences.wustl.edu/path/to/bundle

@jordanpadams Not a pain at all! That was my plan all along - I only tried manually updating the files because I thought that was what @smclaughlin7 was suggesting. Very sorry for the resulting delay.

I have regenerated all 37 data bundles with pds-deep-archive. All of them should be time-stamped March 15 with version_ID = 1.0 to generate unique LIDVIDs. The smart quotes in the Kaguya bundle were changed to more basic characters before running pds-deep-archive. I input HTTPS URLs into pds-deep-archive, and there should be few to no instances of HTTP URLs in any of the files.

a12side_ccig_raw_arcsav_v1.0_20210315.zip a12sws_raw_arcsav_v1.0_20210315.zip a15_17_hfe_concatenated_v1.0_20210315.zip a15hfe_calibrated_arcsav_v1.0_20210315.zip a15hfe_raw_arcsav_v1.0_20210315.zip a15oms_v1.0_20210315.zip a15photosupportdata_v1.0_20210315.zip a15side_ccig_raw_arcsav_v1.0_20210315.zip a16lsm_raw_arcsav_v1.0_20210315.zip a16oms_v1.0_20210315.zip a16photosupportdata_v1.0_20210315.zip a17fuvs_v1.0_20210315.zip a17hfe_calibrated_arcsav_v1.0_20210315.zip a17hfe_raw_arcsav_v1.0_20210315.zip a17leam_raw_arcsav_v1.0_20210315.zip a17leam_raw_worktape_v1.0_20210315.zip a17leamcal_v1.0_20210315.zip a17lsg_raw_arcsav_v1.0_20210315.zip a17photosupportdata_v1.1_20210315.zip a17sep_v1.0_20210315.zip apollo_seismic_event_catalog_v1.0_20210315.zip apollodoc_v1.0_20210315.zip asurpif_photos_amboycrater_v1.0_20210315.zip carbonate_refractive_indices_v1.0_20210315.zip kaguya_grs_spectra_v1.1_20210315.zip lab_shocked_feldspars_v2.0_20210315.zip magellan_stereo_topography_v1.0_20210315.zip mer_cs_target_list_v1.1_20210315.zip mer_documentation_v1.0_20210315.zip mer_pancam_photometry_v2.0_20210315.zip phx_tega_derived_v1.1_20210315.zip pioneer89cdd_v1.1_20210315.zip pioneerdoc_v1.0_20210315.zip relab_v2.0_20210315.zip ruff_pdart14_mtes_v1.0_20210315.zip trang2017_mercury_space_weathering_v1.0_20210315.zip trang2020_moon_space_weathering_v1.0_20210315.zip

As before, please inform me if there is anything wrong with any part of this delivery.

c-suh commented 3 years ago

@ARDWhite fingers crossed! These sets have been posted for NSSDCA processing. Beginning tomorrow, you can check the status at https://nssdc.gsfc.nasa.gov/psi/ReportPDS4.jsp using the SIP LIDs below:

SIP LIDs:
urn:nasa:pds:system_bundle:product_sip_deep_archive:a12side_ccig_raw_arcsav_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a12sws_raw_arcsav_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a15_17_hfe_concatenated_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a15hfe_calibrated_arcsav_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a15hfe_raw_arcsav_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a15oms_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a15photosupportdata_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a15side_ccig_raw_arcsav_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a16lsm_raw_arcsav_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a16oms_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a16photosupportdata_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a17fuvs_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a17hfe_calibrated_arcsav_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a17hfe_raw_arcsav_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a17leam_raw_arcsav_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a17leam_raw_worktape_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a17leamcal_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a17lsg_raw_arcsav_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a17photosupportdata_v1.1_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:a17sep_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:apollo_seismic_event_catalog_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:apollodoc_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:asurpif_photos_amboycrater_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:carbonate_refractive_indices_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:kaguya_grs_spectra_v1.1_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:lab_shocked_feldspars_v2.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:magellan_stereo_topography_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:mer_cs_target_list_v1.1_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:mer_documentation_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:mer_pancam_photometry_v2.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:phx_tega_derived_v1.1_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:pioneer89cdd_v1.1_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:pioneerdoc_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:relab_v2.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:ruff_pdart14_mtes_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:trang2017_mercury_space_weathering_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:trang2020_moon_space_weathering_v1.0_20210315
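
For anyone tracking these by script, here is a minimal sketch that splits a SIP LID into bundle name, version, and date. The pattern (`<bundle>_v<major>.<minor>_<YYYYMMDD>` after the common prefix) is inferred only from the LIDs posted in this thread, not from a PDS specification, and the names `SipLid` and `parse_sip_lid` are hypothetical.

```python
import re
from typing import NamedTuple


class SipLid(NamedTuple):
    bundle: str   # e.g. "a15_17_hfe_concatenated"
    version: str  # e.g. "1.0"
    date: str     # e.g. "20210315" (date stamp embedded in the LID)


# Prefix shared by every SIP LID listed in this thread.
SIP_PREFIX = "urn:nasa:pds:system_bundle:product_sip_deep_archive:"

# Assumed tail layout: <bundle>_v<major>.<minor>_<YYYYMMDD>
SIP_TAIL = re.compile(r"^(?P<bundle>.+)_v(?P<version>\d+\.\d+)_(?P<date>\d{8})$")


def parse_sip_lid(lid: str) -> SipLid:
    """Split a SIP LID into its bundle name, version, and date stamp."""
    lid = lid.strip()
    if not lid.startswith(SIP_PREFIX):
        raise ValueError(f"not a deep-archive SIP LID: {lid!r}")
    match = SIP_TAIL.match(lid[len(SIP_PREFIX):])
    if match is None:
        raise ValueError(f"unexpected SIP LID layout: {lid!r}")
    return SipLid(match["bundle"], match["version"], match["date"])
```

Grouping the parsed results by `bundle` makes it easier to see which date stamp belongs to the latest resubmission of a given bundle.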

c-suh commented 3 years ago

Status for all 37: pre-ingest

c-suh commented 3 years ago

Majority are in pre-ingest. The following 9 sets have been archived:

c-suh commented 3 years ago

@ARDWhite all sets have been archived successfully!

c-suh commented 3 years ago

@ARDWhite my apologies. Gross mistake on my part - the following 16 sets have not actually been archived:

urn:nasa:pds:system_bundle:product_sip_deep_archive:apollo_seismic_event_catalog_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:apollodoc_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:asurpif_photos_amboycrater_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:carbonate_refractive_indices_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:kaguya_grs_spectra_v1.1_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:lab_shocked_feldspars_v2.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:magellan_stereo_topography_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:mer_cs_target_list_v1.1_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:mer_documentation_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:mer_pancam_photometry_v2.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:phx_tega_derived_v1.1_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:pioneer89cdd_v1.1_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:pioneerdoc_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:relab_v2.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:ruff_pdart14_mtes_v1.0_20210315
urn:nasa:pds:system_bundle:product_sip_deep_archive:trang2017_mercury_space_weathering_v1.0_20210315

smclaughlin7 commented 3 years ago

@c-suh @jordanpadams @ARDWhite For this SIP

urn:nasa:pds:system_bundle:product_sip_deep_archive:ruff_pdart14_mtes_v1.0_20210315

our ingest process threw an error on the schema inventory collection product at https://pds-geosciences.wustl.edu/mer/urn-nasa-pds-ruff_pdart14_mtes/xml_schema/ because the inventory file has an extension of ".txt" instead of the expected ".csv". The Standards and DPH do not explicitly prohibit ".txt" or ".tab" for inventory files, but all the examples and text use ".csv".

Request: Could Geosciences please:

  1. change the extension of https://pds-geosciences.wustl.edu/mer/urn-nasa-pds-ruff_pdart14_mtes/xml_schema/collection_xmlschema_inventory.txt to ".csv"
  2. update that file name in the collection product label https://pds-geosciences.wustl.edu/mer/urn-nasa-pds-ruff_pdart14_mtes/xml_schema/collection_xmlschema.xml, then
  3. run the Deep Archive software to make a new SIP and submit it?

Thanks!
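
Steps 1 and 2 above could be sketched roughly as follows. This is only an illustration, assuming the collection label points at the inventory through a `<file_name>` element containing the bare file name; the function name `rename_inventory_to_csv` is hypothetical. It does not run the Deep Archive software (step 3).

```python
from pathlib import Path


def rename_inventory_to_csv(label_path: Path, inventory_path: Path) -> Path:
    """Rename an inventory file to .csv and update its name in the label.

    Assumption: the label references the inventory as
    <file_name>NAME</file_name>, with no surrounding whitespace.
    """
    new_path = inventory_path.with_suffix(".csv")
    old_tag = f"<file_name>{inventory_path.name}</file_name>"
    new_tag = f"<file_name>{new_path.name}</file_name>"

    label_text = label_path.read_text(encoding="utf-8")
    if old_tag not in label_text:
        # Refuse to rename if the label would be left pointing at nothing.
        raise ValueError(f"{label_path} does not reference {inventory_path.name}")

    inventory_path.rename(new_path)
    label_path.write_text(label_text.replace(old_tag, new_tag), encoding="utf-8")
    return new_path
```

The rename leaves the inventory contents untouched, but the label itself changes, so anything derived from the label (for example, checksums recorded elsewhere) would need regenerating before the new SIP is built.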

ARDWhite commented 3 years ago

@c-suh @smclaughlin7 @jordanpadams

No problem. Here is the new SIP after making those changes to the schema files.

ruff_pdart14_mtes_v1.0_20210407.zip

c-suh commented 3 years ago

@ARDWhite thank you for the quick turnaround! @smclaughlin7 the files have been posted where your automator will pick them up tonight. Fingers crossed!

smclaughlin7 commented 3 years ago

@ARDWhite @c-suh Thanks for your rapid response!

smclaughlin7 commented 3 years ago

@ARDWhite @c-suh @jordanpadams Well this is embarrassing! There is a second inventory table in the ruff_pdart14_mtes_v1.0 bundle that has “.txt” for its extension instead of “.csv”: https://pds-geosciences.wustl.edu/mer/urn-nasa-pds-ruff_pdart14_mtes/context/. I apologize for missing this last week when I asked you to fix the schema collection product!

Request: Could Geosciences please:

Thanks for your patience!
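
A quick scan of the whole bundle before remaking a SIP might catch every such file in one pass. The sketch below assumes inventory files follow the `collection*inventory*` naming seen in the schema collection above; other bundles may name them differently, and `find_non_csv_inventories` is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def find_non_csv_inventories(bundle_root: Path) -> List[Path]:
    """List collection inventory files whose extension is not .csv.

    Assumption: inventory file names match "collection*inventory*".
    """
    return sorted(
        path
        for path in bundle_root.rglob("collection*inventory*")
        if path.is_file() and path.suffix.lower() != ".csv"
    )
```

An empty result only means that no file matching the assumed naming pattern has a non-".csv" extension.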

ARDWhite commented 3 years ago

@smclaughlin7 @c-suh @jordanpadams

Remade ruff_pdart14_mtes_v1.0 SIP: ruff_pdart14_mtes_v1.0_20210413.zip

c-suh commented 3 years ago

@smclaughlin7 new files have been posted and your automator should pick them up tonight!

c-suh commented 3 years ago

The 16 sets listed above have, with the exception of ruff_pdart14_mtes_v1.0, been archived!

urn:nasa:pds:system_bundle:product_sip_deep_archive:ruff_pdart14_mtes_v1.0_20210407 has been resubmitted and is in Pre-Ingest

urn:nasa:pds:system_bundle:product_sip_deep_archive:ruff_pdart14_mtes_v1.0_20210413 has been resubmitted