common-workflow-language / cwl-v1.3

[Proposal] Support `format` per respective `type: File` #52

Open fmigneault opened 2 months ago

fmigneault commented 2 months ago

At the moment, it is possible to do the following:

inputs:
  features:
    type:
      - "File"
      - type: array
        items: "File"
    format:
      - "ogc:FeatureCollection"
      - "iana:application/geo+json"
$namespaces:
  iana: "https://www.iana.org/assignments/media-types/"
  ogc: "http://www.opengis.net/def/glossary/term/"

However, it is NOT possible to do something like the following:

inputs:
  features:
    type:
      - type: "File"
        format: "ogc:FeatureCollection"
      - type: array
        items:
          type: "File"
          format: "iana:application/geo+json"
$namespaces:
  iana: "https://www.iana.org/assignments/media-types/"
  ogc: "http://www.opengis.net/def/glossary/term/"

The reasoning behind the above is that the application receiving this `features` input can accept either a single document containing a "FeatureCollection" (a specific type of GeoJSON: https://geojson.org/schema/FeatureCollection.json) or an array of "single feature" GeoJSON documents (e.g., https://geojson.org/schema/Feature.json).

Because `format` is currently permitted only directly under the input, some of the combinations it admits are inconsistent with the desired use. In the first example, a single "Feature" GeoJSON could be submitted, or an array of "FeatureCollection" documents (resulting in a 2D array of features) could be submitted, both of which are invalid for an application that expects a "list of features".
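
To make the mismatch concrete, here is a minimal sketch that enumerates what the first example admits versus what the application wants. It assumes that a top-level `format` list may pair with any of the listed types; the `shapes`, `formats`, `accepted`, `intended`, and `unwanted` names are illustrative only and are not part of CWL.

```javascript
// Illustrative model only: these names are hypothetical, not CWL APIs.
// Types listed under `type` in the first example.
const shapes = ["File", "File[]"];
// Formats listed under the top-level `format` in the first example.
const formats = ["ogc:FeatureCollection", "iana:application/geo+json"];

// Assumption: any listed format may pair with any listed type,
// so the definition admits the full cross product.
const accepted = shapes.flatMap((s) => formats.map((f) => `${s} as ${f}`));

// Combinations the application actually supports, per the description above.
const intended = new Set([
  "File as ogc:FeatureCollection",
  "File[] as iana:application/geo+json",
]);

// Combinations that pass the definition but are invalid for the application.
const unwanted = accepted.filter((c) => !intended.has(c));
console.log(unwanted);
```

Under that assumption, the two leftover entries are the single "Feature" file and the array of "FeatureCollection" files described above.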

Allowing `format` to be nested under each specific type would narrow the structure to exactly the valid combinations.

fmigneault commented 2 months ago

The following seems to be an appropriate workaround, but it is much harder to interpret than `format` fields placed at their respective locations. It also becomes more error-prone as the number of combinations increases. Finally, there might be situations where the `Array.isArray` check used below is not sufficient to distinguish between the cases, making the code harder to maintain.

inputs:
  features:
    type:
      - "File"
      - type: array
        items: "File"
    format: |
      ${ 
        if (Array.isArray(inputs.features)) {
          return "iana:application/geo+json";
        }
        return "ogc:FeatureCollection";
      }
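
Since the branching lives inside an expression string, one way to keep it maintainable is to lift the body into a plain function and exercise it outside a CWL runner. This is only a sketch: it assumes `inputs.features` is either a single File object or an array of them, and the `selectFormat` name, the sample objects, and their paths are hypothetical.

```javascript
// Body of the `format` expression above, lifted into a plain function.
// `inputs` stands in for the object exposed to the expression.
function selectFormat(inputs) {
  if (Array.isArray(inputs.features)) {
    // Array of files: each one is expected to be a single-feature GeoJSON.
    return "iana:application/geo+json";
  }
  // Single file: expected to be a FeatureCollection.
  return "ogc:FeatureCollection";
}

// Hypothetical inputs; the paths are placeholders.
const single = { features: { class: "File", path: "collection.geojson" } };
const many = {
  features: [
    { class: "File", path: "a.geojson" },
    { class: "File", path: "b.geojson" },
  ],
};

console.log(selectFormat(single));
console.log(selectFormat(many));
```

Each additional type/format pairing would add another branch here, which is where the maintenance cost mentioned above comes from.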