SETI / rms-pdsfile

pdsfile Python module
Apache License 2.0

Are CORSS VERSIONS rules correct? #8

Open rfrenchseti opened 9 months ago

rfrenchseti commented 9 months ago

From rms-webtools created by rfrenchseti: SETI/rms-webtools#56

The VERSIONS part of rules/CORSS_8xxx.py has the following code:

    (r'volumes/CORSS_8xxx(|_v[0-9\.]+)/(CORSS_8...)/(\w+)(|/.*)', 0,
            [r'volumes/CORSS_8xxx*/\2/#LOWER#\3\4',
             r'volumes/CORSS_8xxx*/\2/#LOWER#\3#MIXED#\4',
             r'volumes/CORSS_8xxx_v1/\2/#UPPER#\3\4',
             r'volumes/CORSS_8xxx_v1/\2/#UPPER#\3#MIXED#\4',
            ]),

The last two lines duplicate the results from the first two, except they also capitalize the REV prefix. When enumerating version files, this results in things like:

'/volumes/pdsdata-admin/holdings/volumes/CORSS_8xxx_v1/CORSS_8001/EASYDATA/REV07E_RSS_2005_123_X43_E/RSS_2005_123_X43_E_CAL.TAB'
'/volumes/pdsdata-admin/holdings/volumes/CORSS_8xxx_v1/CORSS_8001/EASYDATA/Rev07E_RSS_2005_123_X43_E/RSS_2005_123_X43_E_CAL.TAB'
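
To see which parts of a path the rule's pattern captures, the regex can be run on its own against one of the paths above. This is only a sketch. It assumes the pattern is applied to the holdings-relative path and matched in full with `re.fullmatch`, which may differ from how pdsfile actually applies it. It does not attempt to reproduce the `#LOWER#`, `#UPPER#`, or `#MIXED#` substitutions.

```python
import re

# Pattern copied verbatim from the VERSIONS rule in rules/CORSS_8xxx.py.
VERSIONS_PATTERN = re.compile(
    r'volumes/CORSS_8xxx(|_v[0-9\.]+)/(CORSS_8...)/(\w+)(|/.*)')

# Holdings-relative form of one of the paths listed above
# (assumption: this is the form the rule is matched against).
logical_path = ('volumes/CORSS_8xxx_v1/CORSS_8001/EASYDATA/'
                'Rev07E_RSS_2005_123_X43_E/RSS_2005_123_X43_E_CAL.TAB')

match = VERSIONS_PATTERN.fullmatch(logical_path)
groups = match.groups()

# Show what each backreference (\1 .. \4) would hold for this path.
for index, value in enumerate(groups, start=1):
    print(f'\\{index} = {value!r}')
```

Under these assumptions, `\3` is `EASYDATA` and the directory whose case differs between the two paths above falls inside `\4`.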

There is code to de-duplicate lists like this using the Python set() constructor, but that de-duplication is case-sensitive, so both spellings of the file end up in the list (see, e.g., PdsFile.all_versions()). Usually this is caught in a later phase of PdsFile, but it causes a warning to be logged (which we don't usually see because we don't have PdsFile logging turned on).
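
A small, self-contained illustration of the case-sensitivity point, using the two paths above. The `dedup_ignoring_case` helper is hypothetical and not part of pdsfile; it is just one way such a list could be collapsed, keeping the first spelling seen.

```python
paths = [
    '/volumes/pdsdata-admin/holdings/volumes/CORSS_8xxx_v1/CORSS_8001/EASYDATA/'
    'REV07E_RSS_2005_123_X43_E/RSS_2005_123_X43_E_CAL.TAB',
    '/volumes/pdsdata-admin/holdings/volumes/CORSS_8xxx_v1/CORSS_8001/EASYDATA/'
    'Rev07E_RSS_2005_123_X43_E/RSS_2005_123_X43_E_CAL.TAB',
]

# set() compares strings exactly, so both spellings survive.
exact_unique = set(paths)


def dedup_ignoring_case(items):
    """Hypothetical helper: keep the first occurrence of each case-folded string."""
    seen = set()
    result = []
    for item in items:
        key = item.casefold()
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


folded_unique = dedup_ignoring_case(paths)
print(len(exact_unique), len(folded_unique))  # prints: 2 1
```

Folding case like this would only be appropriate where the underlying filesystem is itself case-insensitive; on a case-sensitive filesystem the two spellings could be distinct files.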

I found this because it changes the code coverage of the PdsFile tests depending on whether they are run against a Linux filesystem (case-sensitive) or a Mac filesystem (case-insensitive by default).

There is no other case where we have this problem, leading me to believe the VERSIONS for CORSS are incorrect in this instance.