common-workflow-language / schema_salad

Semantic Annotations for Linked Avro Data
https://www.commonwl.org/v1.2/SchemaSalad.html
Apache License 2.0

Allowing type duplications is not documented in the spec #716

Open tom-tan opened 1 year ago

tom-tan commented 1 year ago

The spec of Apache Avro says:

Unions may not contain more than one schema with the same type, except for the named types record, fixed and enum.

As described in the spec of SALAD, SALAD is based on Apache Avro.

Salad builds on JSON-LD and the Apache Avro data serialization system

Also, the SALAD spec says very little about union types.

Therefore, we expect union types to behave the same way as in Apache Avro. That is, SALAD should not allow duplicate types in a union. However, schema-salad-tool accepts them.

For example:

$graph:
- name: Foo
  type: record
  fields:
    - name: field1
      type: [string, string]

$ schema-salad-tool schema.yml
/home/vscode/.local/bin/schema-salad-tool Current version: 8.4.20230606143604
Schema 'schema.yml' is valid

Is this intended behavior? If so, it would be nice to clarify it in the SALAD spec.

tetron commented 1 year ago

Allowing duplicated types is not specifically intended behavior, but nothing in the current code specifically checks for uniqueness, so it isn't disallowed either. It has a minimal impact on correctness. If we want to explicitly disallow it, we can.
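
For illustration only, here is a minimal sketch of what an Avro-style uniqueness check on a union could look like. It is not taken from the schema-salad code base: the names `NAMED_KINDS`, `branch_key`, and `duplicated_union_types` are hypothetical, and the sketch assumes a union is a plain list whose branches are either type-name strings or dicts with a `type` key.

```python
from collections import Counter
from typing import Any, Hashable, List

# Avro exempts these named types from the "one schema per type" rule.
NAMED_KINDS = {"record", "fixed", "enum"}


def branch_key(branch: Any) -> Hashable:
    """Return a key that identifies the type of one union branch (hypothetical helper)."""
    if isinstance(branch, str):
        # A primitive type name or a reference to a named type.
        return branch
    if isinstance(branch, dict):
        kind = branch.get("type")
        if kind in NAMED_KINDS:
            # Named types are distinguished by their name, not only by their kind.
            return (kind, branch.get("name"))
        # Other complex types (for example "array") collide on kind alone.
        return str(kind)
    raise TypeError(f"unsupported union branch: {branch!r}")


def duplicated_union_types(union: List[Any]) -> List[Hashable]:
    """List the keys that occur more than once in a union, in first-seen order."""
    counts = Counter(branch_key(branch) for branch in union)
    return [key for key, n in counts.items() if n > 1]


if __name__ == "__main__":
    # The union from the example schema in this issue.
    print(duplicated_union_types(["string", "string"]))
```

Under these assumptions, the `[string, string]` union from the example schema would be reported as containing a duplicate, while a union of two records with different names would not.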

mr-c commented 1 year ago

I'm happy to accept a PR to formally allow type duplicates in union types.

tom-tan commented 1 year ago

OK, I will send a request to clarify it.