asdf-format / asdf

ASDF (Advanced Scientific Data Format) is a next generation interchange format for scientific data
http://asdf.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Make asdf standard 1.6.0 stable (default) #1744

Open braingram opened 5 months ago

braingram commented 5 months ago

Description

I milestoned this PR for asdf 4.0 because it's a rather major change.

Making 1.6.0 stable will require:

Phase 0 (at any time):

Phase 1:

Phase 2:

Phase 3:

Unfortunately at this point most of the schema repo CIs will be broken (as will asdf-astropy CI) due to the intertwined nature of these packages.

Phase 4:

Phase 5:

Much of the above is due to updates to the ndarray schema:

All schemas that $ref ndarray (like quantity) then need a version bump and so on...
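The cascade described above can be sketched as a walk over a $ref dependency graph. This is only an illustration: the `REFS` map is a hypothetical, heavily simplified stand-in for the real schema files (only the quantity to ndarray and quantity to unit links come from this description; `example/uses-quantity` is invented), and `schemas_needing_bump` is not part of asdf.

```python
from collections import deque

# Hypothetical, simplified map: schema -> schemas it $refs.
# The real dependency graph lives in the asdf-standard schema files.
REFS = {
    "core/ndarray": [],
    "core/software": [],
    "unit/unit": [],
    "unit/quantity": ["core/ndarray", "unit/unit"],
    "example/uses-quantity": ["unit/quantity"],  # invented for illustration
}


def schemas_needing_bump(changed, refs):
    """Return every schema that directly or transitively $refs `changed`."""
    # Invert the map so we can walk from a schema to its dependents.
    dependents = {name: set() for name in refs}
    for name, targets in refs.items():
        for target in targets:
            dependents[target].add(name)

    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# A new ndarray version forces a bump of quantity, and of anything that
# in turn $refs quantity.
print(sorted(schemas_needing_bump("core/ndarray", REFS)))
```

In this toy graph, changing `core/ndarray` touches both `unit/quantity` and the schema that references it, while an unreferenced schema such as `core/software` triggers no further bumps.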

This is further complicated by quantity-1.1.0 currently existing in 2 released packages:

(because of an incomplete effort to split unit, fits, table and time out of the core). The approach taken here is to decommission asdf-unit-schemas. asdf-standard will continue to provide updates to the unit (and other non-core) schemas. This seems sensible as these schemas are highly interdependent. More details can be found in: https://github.com/asdf-format/asdf-standard/pull/422

One question that occupied a lot of my thought was "should we change some of the $refs to tags?" On one hand, this could make migrations like this easier: if every ndarray $ref were instead a wildcard tag (ndarray-1.*), most of these schemas would not need to be updated. However, this links the tag to the schema, which has a few downsides:

For example, unit/quantity contains a $ref link to unit/unit. This means that asdf-astropy can use a differently tagged unit (astropy/unit for non-vo units) and still produce a valid unit/quantity (see the wfi schema in rad as an example). If unit/quantity instead used a tag link to unit/unit, this same "duck typing" would not work with a differently tagged unit. asdf-astropy would instead have to:

At the moment I am of the mind that keeping the schemas as separate from the tags as possible is the better option (so $ref instead of tag). This allows the schemas to function even if they are treated as normal "jsonschema"s. Additionally, the tag validator behavior seems loosely defined in the standard, which states "Implementation of this validator is optional and depends on details of the YAML parser." For similar reasons, the above asdf-transform-schemas PR did not rely on the feature in asdf to use multiple schemas in a tag definition (which would allow the many transforms that $ref the transform schema to instead include it in the manifest).
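The trade-off between the two link styles can be shown with a toy model. This is a sketch under stated assumptions, not how asdf validates: nodes are plain dicts, the tag strings are shortened hypothetical stand-ins for the real tag URIs, and both checker functions are invented for illustration.

```python
from fnmatch import fnmatchcase

# Toy tagged nodes; the tag strings are shortened, hypothetical stand-ins.
core_unit = {"tag": "unit/unit-1.0.0", "value": "m"}
other_unit = {"tag": "astropy/unit-1.0.0", "value": "mag(ct/s)"}


def check_by_ref(node):
    """$ref-style link: only the content is checked, the tag is ignored."""
    return isinstance(node["value"], str)


def check_by_tag(node, pattern):
    """tag-style link: the node's tag must match the (wildcard) pattern."""
    return fnmatchcase(node["tag"], pattern)


# A wildcard tag survives a version bump without touching the referrer...
assert check_by_tag({"tag": "core/ndarray-1.0.0"}, "core/ndarray-1.*")
assert check_by_tag({"tag": "core/ndarray-1.1.0"}, "core/ndarray-1.*")

# ...but it rejects a differently tagged node with the same structure,
# whereas the $ref-style check accepts both ("duck typing").
assert check_by_ref(core_unit) and check_by_ref(other_unit)
assert check_by_tag(core_unit, "unit/unit-1.*")
assert not check_by_tag(other_unit, "unit/unit-1.*")
```

The wildcard avoids the version-bump cascade, but at the cost of rejecting any unit that is structurally valid yet tagged differently.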

Both stdatamodels (datamodels) and rad use metaschemas based off of asdf-schema-1.0.0 (which is updated to 1.1.0 in https://github.com/asdf-format/asdf-standard/pull/422). As neither of these packages versions its schemas, updating the metaschema version would force them to use the 1.6.0 standard exclusively (which will almost certainly cause issues if old versions of asdf/asdf-standard are used). Instead, I suggest we not update the metaschemas (and keep them using the old asdf-schema-1.0.0 metaschema). The only downside is the lack of float16 support in the datatype keyword validator. This seems like an acceptable limitation for the time being. Once asdf standard 1.6.0 is stable, and the asdf version that makes it the default matches the minimum asdf version required by each of those packages, the metaschemas can be updated.
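The condition for eventually updating the metaschemas can be written as a simple version comparison. This is a rough sketch: `can_update_metaschema` is a hypothetical helper, and the default of "4.0.0" for the first asdf release that makes 1.6.0 the default is an assumption taken from the milestone mentioned above, not a confirmed release number. The minimum versions passed in below are made-up examples.

```python
def parse_version(version):
    """Turn a plain 'X.Y.Z' string into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def can_update_metaschema(min_required_asdf, first_default_asdf="4.0.0"):
    """True once a package's minimum asdf already defaults to standard 1.6.0.

    `first_default_asdf` is an assumed value (see the milestone note).
    """
    return parse_version(min_required_asdf) >= parse_version(first_default_asdf)


# Made-up minimum versions for a downstream package:
print(can_update_metaschema("3.3.0"))  # old asdf still supported: keep 1.0.0
print(can_update_metaschema("4.0.0"))  # every supported asdf defaults to 1.6.0
```

Until the check passes for a package, keeping the asdf-schema-1.0.0 metaschema avoids forcing its users onto the 1.6.0 standard.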

Checklist: