common-workflow-language / schema_salad

Semantic Annotations for Linked Avro Data
https://www.commonwl.org/v1.2/SchemaSalad.html
Apache License 2.0
73 stars 62 forks source link

Backported schemas to `codegen` branch may break portability between platforms due to handling `typeDSL` #863

Open tom-tan opened 2 months ago

tom-tan commented 2 months ago

I'm not sure here is an appropriate place for this issue but I report it here because it is related to saladVersion.

Historically, in my understanding, the saladVersion field was introduced not to break the compatibility when implementing nested typeDSL. That is:

At that time, we also discussed to introduce another field to control it but finally it was rejected.

By the way, saladVersion v1.3 also introduces map and union schemas to represent input objects and they are backported to the schemas in the codegen branch of CWL v1.0-v1.2.

That is, the code-gerenated parsers with backported schemas in saladVersion v1.3 accidentally support nested typeDSL. Not to break the portability between platforms, it would be nice if we can fix this issue.

There are several ways to fix it.

  1. Introduce a new field to control nested typeDSL
    • It was discussed when implementing nested typeDSL but finally we decided to introduce saladVersion instead
    • IMO this option is worth discussed again
  2. Give up to backport map and union schemas to CWL v1.0-v1.2
    • It may become complex to implement and maintain the platforms for CWL v1.0-v1.2
  3. Support nested typeDSL in CWL v1.0-v1.2
    • IMO we cannot use this option because it is a breaking change to the spec
  4. Others?

What do you think?

tom-tan commented 1 month ago

This is a blocker of #861. I prefer the option 1, that is, to introduce a new field (e.g., mustExpand?) to control nested typeDSL syntax.

If it is OK to take this approach, I will create new issues for each code generators.

tom-tan commented 4 weeks ago

How about extending the typeDSL field to accept an object with the mustExpandRecursively field, for example? Here is possible values of the typeDSL field:

value expand recursive note
null no - same as typeDSL: false
false no -
true yes platform dependent
{ mustExpandRecursively: false } yes platform dependent same as typeDSL: true
{ mustExpandRecursively: true } yes yes

It is consistent with the existing spec of typeDSL (the case of typeDSL is null, false, or true) but covers recursive expansions.

Notes:

mr-c commented 2 weeks ago

I agree that this is important and needs fixing, I also think that option 1 ("Introduce a new field to control nested typeDSL") is the correct method.

mr-c commented 2 weeks ago

Since we have total control of the codegen branches, as they are unofficial, I suggest removing the platform-dependent behavior and adding an expandRecursively field (or similar name) that can be missing, true, or false.

tom-tan commented 2 weeks ago

Here is an intended behaviors after introducing expandRecursively:

value expand recursive note
null no - same as typeDSL: false
false no -
true yes no
{ expandRecursively: false } yes no same as typeDSL: true
{ expandRecursively: true } yes yes

{ expandRecursively: true } will be used since CWL v1.3.