spacetelescope / stdatamodels

https://stdatamodels.readthedocs.io

schema editor output depends on order of platform dependent `os.listdir` output #225

Closed braingram closed 9 months ago

braingram commented 11 months ago

One of the jwst regression tests in `test_schema_editor` currently fails when run on a Mac: https://github.com/spacetelescope/jwst/blob/749f224c39e8f4692e812d4431200e6edc288ca7/jwst/regtest/test_schema_editor.py#L102

The produced error is:

>       assert text_diff(rt.output, rt.truth)
E       AssertionError: 
E       --- /result
E       +++ /truth
E       @@ -1221,9 +1221,9 @@
E                    type: number
E                    fits_keyword: DITH_DEC
E                  subpixel_number:
E       -            title: Subpixel pattern number
E       -            type: integer
E       -            default: 1
E       +            title: Subpixel sampling pattern number
E       +            type: integer
E       +            default: 0
E                    fits_keyword: SUBPXNUM
E              ephemeris:
E                title: JWST ephemeris information

The metadata entry `meta.dither.subpixel_number` is defined in several keyword files.

The titles and default values are inconsistent across these files. To generate the above schema (the one failing the comparison against the truth file), the schema_editor works as follows.

It combines all top-level files in the order they are returned from `os.listdir`, which is platform dependent: https://github.com/spacetelescope/stdatamodels/blob/15e5ae69d75658138960ecafac59115aad376eb6/src/stdatamodels/jwst/datamodels/schema_editor.py#L231-L232

This produces a 'schema' with an `allOf` combiner for `meta.dither` with references to the above (and other keyword files). This can be inspected by setting a breakpoint after line 985 below: https://github.com/spacetelescope/stdatamodels/blob/15e5ae69d75658138960ecafac59115aad376eb6/src/stdatamodels/jwst/datamodels/schema_editor.py#L985-L987

However, the call to `create_dict` on line 986 appears to have a bug wherein `merge_schemas` modifies the schemas provided as arguments (see https://github.com/spacetelescope/stdatamodels/issues/226).

When schema_editor generates the output schema, it selects values only from the first matching item in an `allOf`, as seen in this portion of `get_keyword_value`: https://github.com/spacetelescope/stdatamodels/blob/15e5ae69d75658138960ecafac59115aad376eb6/src/stdatamodels/jwst/datamodels/schema_editor.py#L1204-L1209

As a result, the output schema depends on the ordering of items in the `allOf`, which is determined by the order of files from `os.listdir`, which is arbitrary.
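The order dependence can be reproduced in isolation. The following is a minimal sketch, not the actual schema_editor code: `first_matching_value`, `build_all_of`, and the two file names are hypothetical stand-ins, and the two definitions only mirror the conflicting titles and defaults from the diff above.

```python
def first_matching_value(all_of, keyword, field):
    """Return `field` of `keyword` from the first allOf entry that defines it.

    Hypothetical helper that mimics a first-match lookup; it is not the
    real get_keyword_value.
    """
    for subschema in all_of:
        properties = subschema.get("properties", {})
        if keyword in properties and field in properties[keyword]:
            return properties[keyword][field]
    return None


# Two hypothetical keyword files that disagree about the same keyword.
keyword_files = {
    "b_keywords.yaml": {
        "properties": {
            "subpixel_number": {"title": "Subpixel pattern number", "default": 1}
        }
    },
    "a_keywords.yaml": {
        "properties": {
            "subpixel_number": {
                "title": "Subpixel sampling pattern number",
                "default": 0,
            }
        }
    },
}


def build_all_of(file_names):
    """Build the allOf list in the order the file names are given."""
    return [keyword_files[name] for name in file_names]


# Two listing orders that different platforms could plausibly return.
order_one = ["b_keywords.yaml", "a_keywords.yaml"]
order_two = ["a_keywords.yaml", "b_keywords.yaml"]

title_one = first_matching_value(build_all_of(order_one), "subpixel_number", "title")
title_two = first_matching_value(build_all_of(order_two), "subpixel_number", "title")
print(title_one)
print(title_two)

# Sorting the names before combining removes the dependence on listing order.
sorted_one = first_matching_value(
    build_all_of(sorted(order_one)), "subpixel_number", "title"
)
sorted_two = first_matching_value(
    build_all_of(sorted(order_two)), "subpixel_number", "title"
)
print(sorted_one == sorted_two)
```

In this sketch the two unsorted orders yield different titles, while sorting the names first (for example `sorted(os.listdir(path))`) gives the same result for both. Sorting only makes the output reproducible; it does not resolve the underlying inconsistency between the keyword files.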