NASA-PDS / deep-archive

PDS Open Archival Information System (OAIS) utilities, including Submission Information Package (SIP) and Archive Information Package (AIP) generators
https://nasa-pds.github.io/deep-archive/

Flag ".." in <file_name> as error condition #146

Closed nutjob4life closed 1 year ago

nutjob4life commented 1 year ago

πŸ—’οΈ Summary

Merge this to have pds-deep-archive raise an exception when it encounters <file_name> entries that contain relative path components such as .. (for example, ../catalog/dataset.cat), which according to this comment are invalid in a PDS label.
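A minimal sketch of the kind of check described above, not the code merged in this pull request. The function name `check_file_name` and its parameters are hypothetical, and the message wording only approximates the one shown in the "After" output further down.

```python
from pathlib import PurePosixPath


def check_file_name(label_path: str, file_name: str) -> None:
    """Raise ValueError if ``file_name`` has a ``..`` path component.

    Hypothetical helper for illustration; not the deep-archive implementation.
    """
    # Compare whole path components rather than substrings, so a name such
    # as "data..v2.tab" is not flagged by accident.
    if ".." in PurePosixPath(file_name).parts:
        raise ValueError(
            f"Label {label_path} contains a <file_name> ``{file_name}`` "
            "which contains a relative path ``..``, which is invalid"
        )
```

Checking path components instead of searching for the substring `..` is a design choice of this sketch; whether the real check is that strict is not stated in this thread.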

βš™οΈ Test Data and/or Report

Before:

$ .venv/bin/pds-deep-archive --debug --site PDS_ATM --bundle-base-url https://pds-atmospheres.nmsu.edu/PDS/data/ /Users/kelly/Downloads/vo/vo_3002/bundle_voirtm.xml
INFO 👟 PDS Deep Archive, version 1.1.2
…
INFO 🎉 Success! From /Users/kelly/Downloads/vo/vo_3002/bundle_voirtm.xml, generated these output files:
INFO 📄 SIP Manifest: vo_irtm_v1.0_20230404_sip_v1.0.tab
INFO 📄 XML label for the SIP: vo_irtm_v1.0_20230404_sip_v1.0.xml
INFO 👋 That's it for now. Bye.

After:

$ .venv/bin/pds-deep-archive --debug --site PDS_ATM --bundle-base-url https://pds-atmospheres.nmsu.edu/PDS/data/ /Users/kelly/Downloads/vo/vo_3002/bundle_voirtm.xml
INFO 👟 PDS Deep Archive, version 1.1.3
…
CRITICAL 🛑 Cannot proceed as a critical problem has occurred; re-run with --debug for more info.
DEBUG 🖥 Here is the exception: ValueError('Bundle /Users/kelly/Downloads/vo/vo_3002/document/dataset.xml contains a <file_name> ``../catalog/dataset.cat`` which contains a relative path ``..``, which is invalid')
…

♻️ Related Issues

nutjob4life commented 1 year ago

Thanks @alexdunnjpl @c-suh! 🎉

c-suh commented 1 year ago

@nutjob4life thank you for the quick turnaround! Much appreciated.

jordanpadams commented 1 year ago

@nutjob4life question: does deep-archive provide any support for directory_path_name? According to the spec, this is how someone could specify a path to a file that is not in the current directory.

alexdunnjpl commented 1 year ago

@jordanpadams that attribute doesn't explicitly allow for relative paths either (as it also requires a root/cwd to be meaningful, which isn't mentioned)... was it definitely intended to support relpaths?

nutjob4life commented 1 year ago

@jordanpadams there is support for directory_path_name; see:

https://github.com/NASA-PDS/deep-archive/blob/b398f8dfd9c44bf87837acda2ccab833ca738716/src/pds2/aipgen/utils.py#L200-L204

jordanpadams commented 1 year ago

@alexdunnjpl I was thinking the same thing. You can start getting really weird with things like ../../../../../path/to/file.dat. This may require an update or clarification to the standard.
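To illustrate the concern about long chains of `..`, here is a rough sketch of how a combined directory_path_name and file_name could be tested for climbing above its starting point. It assumes paths are interpreted relative to the label's own directory, which, as noted above, the standard does not spell out. The function name `escapes_root` is hypothetical and is not part of deep-archive.

```python
import posixpath


def escapes_root(directory_path_name: str, file_name: str) -> bool:
    """Return True if the joined path leaves its starting directory.

    Hypothetical helper; assumes "/"-separated paths that are relative
    to the directory containing the label.
    """
    # normpath collapses "a/../b" to "b" purely lexically, without
    # touching the file system, so leftover leading ".." parts mean
    # the path climbed above where it started.
    joined = posixpath.normpath(posixpath.join(directory_path_name, file_name))
    return joined == ".." or joined.startswith("../") or posixpath.isabs(joined)
```

Under that assumption, `escapes_root("", "../../../../../path/to/file.dat")` is true, while `escapes_root("data/", "../catalog/dataset.cat")` is false because the `..` only cancels the `data/` prefix.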