SETI / rms-pds4indextools

PDS4 index tools Python module
Apache License 2.0

--limit-xpaths-file isn't really in glob format #23

Closed rfrenchseti closed 1 month ago

rfrenchseti commented 1 month ago

In pds4_create_xml_index, the current code uses Python's fnmatch to match XPaths against the wildcard patterns listed in the --limit-xpaths-file file. However, fnmatch doesn't implement the ** concept provided by glob, even though our documentation claims it does. Instead, fnmatch treats each string as a single filename, so * reaches across "directories". As a result, the XPath /a/b/c is matched by *, when it should only be matched by */*/* or ** or **/*. To fix this, we need to implement our own version of glob that breaks both the patterns and the XPaths into sections at / boundaries and calls fnmatch on each section recursively. The code to do this can be stolen from the glob source code and modified appropriately.

https://github.com/python/cpython/blob/3.12/Lib/glob.py
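
A rough sketch of the section-by-section matching described above, to make the intended behavior concrete. This is not the code in pds4_create_xml_index and is not taken from glob.py; the names xpath_glob_match and _match_parts are made up for illustration. It assumes leading and trailing slashes can be ignored on both the XPath and the pattern, and that a ** section stands for zero or more whole sections.

```python
from fnmatch import fnmatchcase


def xpath_glob_match(xpath: str, pattern: str) -> bool:
    """Return True if xpath matches pattern, treating '/' as a hard boundary.

    Hypothetical helper. Assumption: leading/trailing '/' is ignored on both
    arguments, so '/a/b/c' is handled as the sections ['a', 'b', 'c'].
    """
    return _match_parts(xpath.strip("/").split("/"),
                        pattern.strip("/").split("/"))


def _match_parts(parts: list[str], pats: list[str]) -> bool:
    # No pattern sections left: match only if the XPath is also used up.
    if not pats:
        return not parts
    head, rest = pats[0], pats[1:]
    if head == "**":
        # '**' may absorb zero or more whole sections; try every split point.
        return any(_match_parts(parts[i:], rest)
                   for i in range(len(parts) + 1))
    # Any other pattern section must consume exactly one XPath section.
    if not parts:
        return False
    return fnmatchcase(parts[0], head) and _match_parts(parts[1:], rest)


# Plain fnmatch lets '*' run across '/' boundaries:
print(fnmatchcase("/a/b/c", "*"))           # True
# Section-wise matching does not:
print(xpath_glob_match("/a/b/c", "*"))      # False
print(xpath_glob_match("/a/b/c", "*/*/*"))  # True
print(xpath_glob_match("/a/b/c", "**"))     # True
print(xpath_glob_match("/a/b/c", "**/*"))   # True
```

Each non-** section is still handed to fnmatch, so ?, [seq] and * keep their usual meaning within a section. One thing a real fix would probably need to decide is how [ and ] inside XPath sections should be treated, since fnmatch reads them as character classes.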