Oblongs commented 2 months ago

Background

The CDM Asset Refactoring Taskforce is working to enhance the modelling of financial assets in the CDM.
This includes adding data types to the existing product model, which will introduce further levels into the hierarchy and increase the use of the “one of” syntax to select between multiple subsidiary product types.
Example

An example of a new data type introduced in the refactoring is Asset (simplified):

  type Asset:
    basket Basket (0..1)
    loan Loan (0..1)
    security Security (0..1)
    condition: one-of

The sub-types in the Asset definition will contain common attributes, for example an identifier.

Rationale

The proposed enhancements to the DSL will:
- Increase understanding of modularity in models built using Rune.
- Improve readability of Rune DSL code.
- Differentiate one-of selections from other data types.
- Reduce long path traversals in complex constructs.
- Enable ease-of-use enhancements in DSL tools.

Requirements

1. Simplify the one-of construct by enabling a special kind of data type which:
    - Is composed of two or more constituents (which are themselves defined as data types).
    - Allows exactly one of the constituents to be present in any instance.
    - Restricts the cardinality of all constituents to one.
    - Requires the name of each constituent (i.e. the attribute) to be the same as that of its data type.
2. Where multiple child data types of a parent data type have common attributes, it should be possible to access the common attributes directly.
    - For example, where identifier is common across all Asset sub-types, the current syntax requires IF…THEN logic to identify which sub-type is present, e.g.:

  if basket exists
  then asset -> basket -> identifier
  else if loan exists
    then asset -> loan -> identifier
    else asset -> security -> identifier

A simpler access path is required, for example:

asset -> identifier

Summary Deck

The attachment documents these requirements.

RUNE DSL Enhancement for One Of.pptx

SimonCockx commented 1 month ago

Here is a comparison of three different solutions. I evaluate each of them based on the following five questions.

1. How is the type represented in the model? How are common attributes represented? Is there duplication?
2. How can one access common attributes in an expression?
3. How is the type represented in serial form (JSON)?
4. How can one discriminate between different types?
5. Is it possible to guarantee that all cases are covered? (e.g., by red-underlining the Rune code when a modeller forgets to cover a case)

All comparisons are based on a toy version of the Asset (Basket/Loan/Security) model.

Enhance the current solution (`one-of`)

Summary:

1. Common attributes are duplicated for each one-of case.
2. Accessing common attributes is verbose. A proposed enhancement (->>) can eliminate this verbosity.
3. Serialisation is nested. The upside is that the field name allows us to discriminate between different types during deserialisation.
4. Discriminating types and accessing their attributes is verbose. A proposed enhancement (switch) can eliminate this problem.
5. No validation to guarantee all cases are covered. A proposed enhancement (switch) can eliminate this problem.

Model

type Asset:
  basket Basket (0..1)
  loan Loan (0..1)
  security Security (0..1)

  condition Choice:
    one-of

type Basket:
  identifier string (1..1)
  basketAttribute int (0..1)

type Loan:
  identifier string (1..1)
  loanAttribute number (0..*)

type Security:
  identifier string (1..1)
  securityAttribute date (2..2)
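
To make the one-of semantics of this model concrete, here is a minimal Python sketch (not Rune, and not how the DSL implements the condition). The class names mirror the toy model; `validate_one_of` is a hypothetical helper name, and the sub-types are trimmed to the attributes needed for the illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Basket:
    identifier: str
    basket_attribute: Optional[int] = None


@dataclass
class Loan:
    identifier: str


@dataclass
class Security:
    identifier: str


@dataclass
class Asset:
    # Mirrors the three (0..1) attributes of the one-of type.
    basket: Optional[Basket] = None
    loan: Optional[Loan] = None
    security: Optional[Security] = None


def validate_one_of(obj) -> bool:
    """Return True when exactly one attribute of obj is populated."""
    populated = [f.name for f in fields(obj) if getattr(obj, f.name) is not None]
    return len(populated) == 1
```

An `Asset` holding only a `Basket` passes the check, while an empty `Asset` or one holding both a `Basket` and a `Loan` fails it.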

Access common attributes

if asset -> basket exists
then asset -> basket -> identifier
else if asset -> loan exists
then asset -> loan -> identifier
else if asset -> security exists
then asset -> security -> identifier

Access common attributes (proposed enhancement: automatically detect common attributes and make them directly available)

// Option 1: implicit
asset -> identifier
// Option 2: explicit
asset ->> identifier

Serialisation (of a basket)

{
  "basket": {
    "identifier": "abc123",
    "basketAttribute": 42
  }
}
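
As a rough illustration of why the nested form is easy to deserialise, the Python sketch below reads the single populated field name and uses it as the discriminator. The function name `discriminate_nested` is hypothetical and only the standard `json` module is used.

```python
import json


def discriminate_nested(payload: str):
    """Return (case_name, attributes) for a nested one-of JSON object."""
    data = json.loads(payload)
    if len(data) != 1:
        # The one-of condition means exactly one field may be populated.
        raise ValueError("a one-of object must contain exactly one populated field")
    case_name, attributes = next(iter(data.items()))
    return case_name, attributes
```

For the basket above, the returned case name is `basket`, which tells a deserialiser which type to build without any extra metadata.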

Discriminate types

if asset -> basket exists
then // do something with asset -> basket -> basketAttribute
else if asset -> loan exists
then // do something with asset -> loan -> loanAttribute
else if asset -> security exists
then // do something with asset -> security -> securityAttribute

Discriminate types (proposed enhancement: switch over one-of types to guarantee coverage and reduce verbosity)

switch asset
  case basket then // do something with basket -> basketAttribute
  case loan then // do something with loan -> loanAttribute
  case security then // do something with security -> securityAttribute

Using `extends`

Summary:

1. Common attributes are not repeated. Good!
2. Accessing common attributes is concise. Good!
3. Cannot determine the actual type when deserialising. A proposed enhancement (@type) can eliminate this problem.
4. Need a way to discriminate between different types, either via an of-type operation, or via "dispatching".
5. Validation to guarantee all cases are covered is impossible since subtyping is open for extension.

Model

type Asset:
  identifier string (1..1)

type Basket extends Asset:
  basketAttribute int (0..1)

type Loan extends Asset:
  loanAttribute number (0..*)

type Security extends Asset:
  securityAttribute date (2..2)

Access common attributes

asset -> identifier

Serialisation (of a basket) (with proposed solution to determine the actual type)

{
  "@type": "Basket",
  "identifier": "abc123",
  "basketAttribute": 42
}
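
A deserialiser could use the `@type` field roughly as follows. This is a Python sketch under assumptions: `TYPE_REGISTRY` and `deserialise` are hypothetical names, and the dataclass fields are spelled like the JSON keys purely to keep the example short.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Basket:
    identifier: str
    basketAttribute: Optional[int] = None  # spelled like the JSON key


@dataclass
class Loan:
    identifier: str


# Hypothetical lookup from the "@type" value to the class to instantiate.
TYPE_REGISTRY = {"Basket": Basket, "Loan": Loan}


def deserialise(payload: str):
    """Build the concrete subtype named by the "@type" field."""
    data = json.loads(payload)
    type_name = data.pop("@type")
    cls = TYPE_REGISTRY[type_name]
    return cls(**data)
```

Without the `@type` field, the flat object would carry nothing that distinguishes a `Basket` from any other `Asset` subtype with overlapping attributes.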

Discriminate types (new feature!)

// Option 1: à la `instanceof` (with type narrowing?)
if asset of-type Basket
then // do something with asset -> basketAttribute
else if asset of-type Loan
then // do something with asset -> loanAttribute
else if asset of-type Security
then // do something with asset -> securityAttribute
// else ... (potentially add a default case here)

// Option 2: dispatching
dispatch func DoTheThing(asset Asset):
  output: result Foo (1..1)

  // set result: ... potentially add a default case here

implement func DoTheThing(basket Basket):
  set result: // do something with basket -> basketAttribute

implement func DoTheThing(loan Loan):
  set result: // do something with loan -> loanAttribute

implement func DoTheThing(security Security):
  set result: // do something with security -> securityAttribute
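
For readers who have not met dispatching before, Python's standard `functools.singledispatch` behaves in a comparable way, so it may serve as an analogy (it is not the proposed Rune semantics). The class and function names below mirror the toy model; the return strings are made up for the illustration.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class Asset:
    identifier: str


@dataclass
class Basket(Asset):
    basket_attribute: int = 0


@dataclass
class Loan(Asset):
    loan_attribute: float = 0.0


@dataclass
class Security(Asset):
    pass


@singledispatch
def do_the_thing(asset) -> str:
    # Default case: used for any subtype without its own implementation.
    return f"generic asset {asset.identifier}"


@do_the_thing.register
def _(basket: Basket) -> str:
    return f"basket with attribute {basket.basket_attribute}"


@do_the_thing.register
def _(loan: Loan) -> str:
    return f"loan with attribute {loan.loan_attribute}"
```

Here `Security` has no implementation of its own and silently falls through to the default, which illustrates why coverage cannot be guaranteed when subtyping is open for extension.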

Introducing a new `union` type (alternative: `choice` type)

Summary:

1. Common attributes are duplicated for each union case.
2. Accessing common attributes is concise. Good!
3. Need to serialise the type name as well.
4. Discriminating types is concise. Good!
5. Validation to guarantee coverage of all cases is possible. Good!

Model

union Asset:
  Basket
  Loan
  Security

type Basket:
  identifier string (1..1)
  basketAttribute int (0..1)

type Loan:
  identifier string (1..1)
  loanAttribute number (0..*)

type Security:
  identifier string (1..1)
  securityAttribute date (2..2)

Access common attributes

asset -> identifier

Serialisation (of a basket)

{
  "@type": "Basket",
  "identifier": "abc123",
  "basketAttribute": 42
}

Discriminate types

switch asset
  case Basket then // do something with asset -> basketAttribute
  case Loan then // do something with asset -> loanAttribute
  case Security then // do something with asset -> securityAttribute
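
Because a union is closed, a tool can compare the declared cases with the cases handled in a switch. The Python sketch below shows the idea only; `missing_cases` is a hypothetical name and says nothing about how the Rune validator would be implemented.

```python
from typing import Iterable, List


def missing_cases(union_cases: Iterable[str], handled_cases: Iterable[str]) -> List[str]:
    """Return the union cases that a switch does not handle, in declaration order."""
    handled = set(handled_cases)
    return [case for case in union_cases if case not in handled]
```

A non-empty result is what an editor could underline in red: for example, a switch over `Asset` that handles only `Basket` and `Loan` is missing `Security`.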

SimonCockx commented 1 month ago

Based on this analysis, I would propose the following.

The end goal is to support union types as described in the last section.
As a migration strategy, we first support the two enhancements (-> for nested attributes and switch) for one-of types.

Oblongs commented 1 month ago

Just a clarification: the model, as we are currently writing it, will look like this, with the common elements in AssetBase:

union Asset:
  Basket
  Loan
  Security

type AssetBase:
  identifier string (1..1)

type Basket extends AssetBase:
  basketAttribute int (0..1)

type Loan extends AssetBase:
  loanAttribute number (0..*)

type Security extends AssetBase:
  securityAttribute date (2..2)

Does this change your analysis at all?

In fact, we have also made this slightly worse, as follows

type AssetBase:
  identifier AssetIdentifier (1..1)

type AssetIdentifier extends Identifier:
  identifierType assetIdTypeEnum (1..1)

type Identifier:
  identifier string (1..1)

Which means, as it currently stands, when we need to reference an identifier, we need to do this:

basket -> identifier -> identifier

So there is an even stronger case for

asset ->> identifier

Oblongs commented 1 month ago

On the proposed migration strategy, can we leverage Minesh’s “pre-processing” concept to implement union on the front end that is actually implemented as one-of in the DSL? That is:

View in Rosetta:

union Foo:     
  Bar1
  Bar2
  Bar3

Implementation

type Foo:
  bar1 Bar1 (0..1)
  bar2 Bar2 (0..1)
  bar3 Bar3 (0..1)
    condition: one-of
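
The pre-processing idea can be pictured as a small text transformation. The Python sketch below is an illustration only: `desugar_union` is a hypothetical name, the attribute naming rule (lower-casing the first letter of the type name) is an assumption taken from the example above, and the condition is written in the `condition Choice:` form used earlier in this thread.

```python
from typing import List


def desugar_union(name: str, cases: List[str]) -> str:
    """Render a union declaration as the equivalent one-of type."""
    lines = [f"type {name}:"]
    for case in cases:
        # Assumed naming rule: attribute name is the type name with a lower-case first letter.
        attribute = case[0].lower() + case[1:]
        lines.append(f"  {attribute} {case} (0..1)")
    lines.append("")
    lines.append("  condition Choice:")
    lines.append("    one-of")
    return "\n".join(lines)
```

Calling `desugar_union("Foo", ["Bar1", "Bar2", "Bar3"])` yields a `type Foo:` declaration with one optional attribute per case followed by the one-of condition.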

SimonCockx commented 1 month ago

Just a clarification on the model, as we are currently writing it, will look like this with the common elements in AssetBase:
union Asset:
  Basket
  Loan
  Security

type AssetBase:
  identifier string (1..1)

type Basket extends AssetBase:
  basketAttribute int (0..1)

type Loan extends AssetBase:
  loanAttribute number (0..*)

type Security extends AssetBase:
  securityAttribute date (2..2)
Does this change your analysis at all?

This should work fine!

In fact, we have also made this slightly worse, as follows
type AssetBase:
  identifier AssetIdentifier (1..1)

type AssetIdentifier extends Identifier:
  identifierType assetIdTypeEnum (1..1)

type Identifier:
  identifier string (1..1)
Which means, as it currently stands, when we need to reference an identifier, we need to do this:
basket -> identifier -> identifier
So there is an even stronger case for
asset ->> identifier

Hm, the current proposal adds ->> support for one-of and union types only. Since the type Identifier is neither, it wouldn't be possible to do that. What you describe here seems like a different use case than the ones in the original issue. Is this another requirement? Are there alternatives? E.g., typeAlias Identifier: string.

On the proposed migration strategy, can we leverage Minesh’s “pre-processing” concept to implement union on the front end that is actually implemented as one-of in the DSL? That is:

View in Rosetta:
union Foo:     
  Bar1
  Bar2
  Bar3 
Implementation
type Foo:
  bar1 Bar1 (0..1)
  bar2 Bar2 (0..1)
  bar3 Bar3 (0..1)
    condition: one-of

Currently investigating this. I took a quick look together with Minesh, and we came to the conclusion that it's easier said than done. There is a path I haven't explored yet - more to follow.

lolabeis commented 1 month ago

Proposal looks good. A few comments and questions:

1. In the 3rd proposal (union), when you use switch, does it assume that the path starts at (in the Basket case) asset -> basket, so you would directly start typing basketAttribute?
2. Can some of a union's underlying types be unions as well? In this case, can you specify how the switch statement, which likely needs nesting, would work? Please use the following example:
```
union Instrument:
  Security
  Loan

union Asset:
  Basket
  Instrument
```
3. Would you allow the following expression: `asset -> basket -> basketAttribute` (which means the DSL must associate some default name to each attribute), or would you only allow `switch` statements or calling the common attributes on a `union` type?
4. Finally, I think the `choice` naming alternative is indeed more appropriate than `union`.

lolabeis commented 1 month ago

@Oblongs With regards to your point about:

basket -> identifier -> identifier

I don't think the proposal would allow simply to shorten as:

asset ->> identifier

Instead, the approach we discussed to eliminate the extra level on this one is to define:

type AssetBase extends Identifier:
  identifierType assetIdTypeEnum (1..1)

But it's separate from the issue being discussed here.

lolabeis commented 1 month ago

@SimonCockx There is another requirement that we'd like you to consider. Although it's another "killer-feature", it's independent from the above and not on the critical path of migration.

When defining a union, it should be possible to declare an associated enum:

union Asset:
  Basket
  Loan
  Security
  as-enum AssetTypeEnum

And then it would be possible to use AssetTypeEnum as if it was explicitly declared, e.g.:

type Collateral:
  assetType AssetTypeEnum (1..1)

Also how would that work in the "nested" union case?

SimonCockx commented 1 month ago

@lolabeis Great points, some of which I have been consciously "forgetting", given they were not listed as requirements yet.

Proposal looks good. A few comments and questions:

In the 3rd proposal (union) when you use switch, does it assume that the path starts at (in the Basket case) asset -> basket, so you would directly start typing basketAttribute?

I see two options here. Just to recapitulate, the question is: how do I access basketAttribute in the following location? (xxx)

switch asset
  case Basket then xxx
  ...

Either:

1. In each branch, we use item to indicate asset with its type narrowed down to the specific case, e.g., Basket. This would mean you could directly type basketAttribute, which would be syntactic sugar for item -> basketAttribute:

switch asset
  case Basket then basketAttribute
  ...

In case you are switching on a more complex expression, this improves conciseness even more, e.g.,

switch this -> is -> some -> long -> path -> asset
  case Basket then basketAttribute        // instead of having to write this -> is -> some -> long -> path -> asset -> basketAttribute
  ...

One potential downside is that it redefines item, which can be confusing when combined with other operations that define item (such as extract). Hypothetical example:

reportableEvents
  extract switch reportableInformation -> asset // suppose `reportableInformation` has an asset attached to it.
    case Basket then Process(basketAttribute, reportableInformation) // This won't work: reportableInformation suddenly becomes "unavailable". A modeller would have to name their `reportableEvent` explicitly. 
    ...

2. In each branch, the type of asset will change to the actual narrower type, e.g., Basket. This would mean a modeller would have to refer to asset -> basketAttribute to access the attribute:
```
switch asset
  case Basket then asset -> basketAttribute
  ...
```
When switching over long expressions, this could be cumbersome, although rewriting the expression can be avoided using extract, e.g.,
```
this -> is -> some -> long -> path
  extract
    switch asset
      case Basket then asset -> basketAttribute
      ...
```

Currently I'm leaning towards the first option.

Can some of a union underlying types be a union as well?

Yes, each of the union cases can be of any type, including data types, enumerations, basic types and other union types. In the long term I see additional benefits such as being able to conform to regulations that require us to either output a number or a string, e.g.,

union NumberOrString:
  number
  string

type Foo:
  bar NumberOrString (1..1)

which can then be serialised into

{
  "bar": 42
}

or

{
  "bar": "42 ounces"
}
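
On the consuming side, a reader of such a field would branch on the JSON value's own type, since basic types carry no `@type` marker. A minimal Python sketch, assuming a hypothetical `read_bar` helper and the standard `json` module:

```python
import json
from typing import Union

# Python counterpart of the NumberOrString union above.
NumberOrString = Union[int, float, str]


def read_bar(payload: str) -> NumberOrString:
    """Read the "bar" field and check that it is a number or a string."""
    value = json.loads(payload)["bar"]
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float, str)):
        raise TypeError("bar must be a number or a string")
    return value
```

Both serialised forms shown above are accepted, while a list or an object in the `bar` position is rejected.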

This is something which is currently impossible to model with Rune, and which clients have asked us to support in the past.

In this case, can you specify how the switch statement, which likely needs nesting, would work? Please use the following example:
union Instrument:
  Security
  Loan

union Asset:
  Basket
  Instrument

Great question! I think supporting a "flat" switch, even for nested unions, will be the easiest to read and write, so you would be able to do something like

switch asset
  case Basket then ...
  case Security then ...
  case Loan then ...
// or, if only the common attributes of Security and Loan are relevant:
switch asset
  case Basket then ...
  case Instrument then ...

Potentially, they could also be "mixed" to provide default cases for nested unions. Suppose that Instrument had a third option called AnotherInstrument, then one could write something like this:

switch asset
  case Basket then ...
  case Security then ... // this catches the first case of `Instrument`
  case Instrument then ... // this catches all other `Instrument` cases, i.e., `Loan` and `AnotherInstrument`

Note that the order of cases then starts to matter. Writing case Instrument and then case Security should be forbidden by the DSL, since the latter case will never be reached.
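
The ordering rule can be checked mechanically by flattening each case into the concrete types it covers. The Python sketch below is only an illustration of that check: the `UNIONS` table, `leaf_cases` and `unreachable_cases` are hypothetical names, not part of the DSL.

```python
from typing import Dict, List, Set

# Hypothetical in-memory description of the nested unions from the example.
UNIONS: Dict[str, List[str]] = {
    "Instrument": ["Security", "Loan"],
    "Asset": ["Basket", "Instrument"],
}


def leaf_cases(name: str) -> Set[str]:
    """Flatten a (possibly nested) union into the set of concrete types it covers."""
    if name not in UNIONS:
        return {name}
    leaves: Set[str] = set()
    for case in UNIONS[name]:
        leaves |= leaf_cases(case)
    return leaves


def unreachable_cases(cases: List[str]) -> List[str]:
    """Return the cases that can never match because earlier cases cover them."""
    covered: Set[str] = set()
    unreachable: List[str] = []
    for case in cases:
        leaves = leaf_cases(case)
        if leaves <= covered:
            unreachable.append(case)
        covered |= leaves
    return unreachable
```

With this table, the order Basket, Security, Instrument has no unreachable case, whereas Basket, Instrument, Security reports Security as unreachable.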

Would you allow the following expression: asset -> basket -> basketAttribute (which means the DSL must associate some default name to each attribute), or would you only allow switch statements or calling the common attributes on a union type?

This would be part of the migration strategy, but in the end I would disallow this kind of direct access of attributes that are not common, unless a compelling use case arises.

Finally I think the choice naming alternative is more appropriate than union indeed.

To give it a try, I will use choice in my following responses. :)

SimonCockx commented 1 month ago

On the proposed migration strategy, can we leverage Minesh’s “pre-processing” concept to implement union on the front end that is actually implemented as one-of in the DSL?

Update on this one: this is starting to look promising. We will probably follow this strategy as a quick win, and then incrementally start improving it.

SimonCockx commented 1 month ago

@SimonCockx There is another requirement that we'd like you to consider. Although it's another "killer-feature", it's independent from the above and not on the critical path of migration:

When defining a union, it should be possible to declare an associated enum:
union Asset:
  Basket
  Loan
  Security
  as-enum AssetTypeEnum
And then it would be possible to use AssetTypeEnum as if it was explicitly declared, e.g.:
type Collateral
  assetType AssetTypeEnum
Also how would that work in the "nested" union case?

Interesting. I think this wouldn't be too hard to add. Like you mention, the trickiness lies in how to handle nested choice types. To take your example from before:

choice Instrument as-enum InstrumentEnum: // another syntax suggestion
  Security
  Loan

choice Asset as-enum AssetEnum:
  Basket
  Instrument

I think the most useful interpretation is to flatten again. I assume the use case of representing a choice type as an enum is to indicate the actual type of an instance. Since an actual Asset will always be either a Basket, a Security or a Loan, and never an Instrument, I think that should be the case for the enum as well. I.e., AssetEnum would be equivalent to the following.

enum AssetEnum:
  Basket
  Security
  Loan

It depends on the use case of course. My interpretation could be wrong.

But perhaps it's best to continue this discussion in a separate issue.

SimonCockx commented 1 month ago

I would like to add an alternative switch syntax to the discussion that @lolabeis proposed, which I also quite like:

asset switch
  Basket then <expr>,
  Security then <expr>,
  Loan then <expr>,
  <default expr> // this is optional

This would be better aligned with other operators such as extract.

SimonCockx commented 1 month ago

One "use case" I didn't add to the comparison, but which would have been useful, is how to go from a specific type to a choice type, e.g., given a function that accepts an Asset as input, and given a variable of type Basket, how do I call this function?

Current solution (`one-of`)

Need to wrap it in an Asset constructor.

ProcessAsset(Asset { basket: basket, ... })

This is cumbersome!

Using `extends`

Works out of the box:

ProcessAsset(basket)

Using the proposed `choice` types

Works out of the box. No need to wrap!

ProcessAsset(basket)

SimonCockx commented 1 month ago

Summary of the implementation plan

Below are four steps to get us from the current state to full support for choice types.

1. Choice types are supported as syntactic sugar for `one-of` types.

This is a quick win. E.g.,

choice Asset:
  Basket
  Loan
  Security

is syntactic sugar for

type Asset:
  Basket Basket (0..1)
  Loan Loan (0..1)
  Security Security (0..1)

  condition Choice:
    one-of

2. Killer-feature: common nested attributes of `one-of` (and `choice` types) can be accessed via a new `->>` operator.

E.g.,

asset ->> identifier

This should also work for attributes that are nested with multiple levels of one-of types.
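
One way to picture the multi-level behaviour is a loop that keeps descending while the current value is a one-of type. The Python sketch below is an analogy under assumptions: the `OneOf` marker class and the `deep_get` helper are hypothetical, and the nested `Instrument` example is borrowed from earlier in this thread.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


class OneOf:
    """Marker base class for types carrying a one-of condition (assumption of this sketch)."""


@dataclass
class Basket:
    identifier: str


@dataclass
class Loan:
    identifier: str


@dataclass
class Security:
    identifier: str


@dataclass
class Instrument(OneOf):
    security: Optional[Security] = None
    loan: Optional[Loan] = None


@dataclass
class Asset(OneOf):
    basket: Optional[Basket] = None
    instrument: Optional[Instrument] = None


def deep_get(value: Any, attribute: str) -> Any:
    """Emulate `value ->> attribute`: descend through one-of levels, then read the attribute."""
    while isinstance(value, OneOf):
        populated = [
            getattr(value, f.name)
            for f in fields(value)
            if getattr(value, f.name) is not None
        ]
        if len(populated) != 1:
            raise ValueError("one-of condition violated")
        value = populated[0]
    return getattr(value, attribute)
```

For an `Asset` wrapping an `Instrument` wrapping a `Loan`, `deep_get(asset, "identifier")` returns the loan's identifier without the caller naming either intermediate level.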

3. Support dedicated choice types (remove syntactic sugar), add support for `switch` expressions, support accessing common attributes with `->`, and support (de)serialisation with `@type` (except for basic types).

A couple of things change at this point.

- A basket can now directly be passed to a function expecting an asset, instead of wrapping it in Asset { basket: basket, ... }. The Asset {...} constructor syntax disappears.
- Accessing common attributes in choice types changes from ->> to ->. For one-of types, the ->> syntax stays the same.
- Discriminating a choice type now must happen by using a switch. All checks of the form if asset -> basket exists then will need to be refactored. E.g.,
```
asset switch
  Basket then <item is now of type Basket>,
  Loan then <item is now of type Loan>,
  Security then <item is now of type Security>,
  default then <optional default case>
```
- Nested choice types are flattened out.
- Serialisation becomes less nested. E.g., what used to be

{
  "Basket": {
    "identifier": "abc123",
    "basketAttribute": 42
  }
}

now becomes

{
  "@type": "Basket",
  "identifier": "abc123",
  "basketAttribute": 42
}

Note that this can only be done after the migration to Translate 2.0.
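
For existing data, the change in serialised shape amounts to lifting the single nested field name into an `@type` entry. A minimal Python sketch of that conversion follows; `nested_to_flat` is a hypothetical helper name and this is not a statement about how the migration will actually be tooled.

```python
import json


def nested_to_flat(payload: str) -> str:
    """Convert the nested one-of form into the flat "@type" form."""
    data = json.loads(payload)
    if len(data) != 1:
        raise ValueError("expected exactly one populated one-of field")
    type_name, attributes = next(iter(data.items()))
    # Put "@type" first, followed by the attributes of the populated case.
    flat = {"@type": type_name, **attributes}
    return json.dumps(flat)
```

Applied to the nested basket above, the result is the flat object with `@type` set to `Basket` and the two attributes unchanged.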

4. Killer-feature: add support for using choice types as enum.

E.g.,

choice Asset as-enum AssetTypeEnum:
  ...

type Underlier:
  assetType AssetTypeEnum (1..1)

In an expression:

if underlier -> assetType = AssetTypeEnum -> Basket
then ...

lolabeis commented 1 month ago

I would like to add an alternative switch syntax to the discussion that @lolabeis proposed, which I also quite like:
asset switch
  Basket then <expr>,
  Security then <expr>,
  Loan then <expr>,
  <default expr> // this is optional
This would be better aligned with other operators such as extract.

Fully support this, and I was about to suggest it 😄.

I think this would also allow you to do nested choices more elegantly, and I suggest using square brackets [] to be consistent with the nesting of list operators. Re-using the same example as above:

asset switch
  Basket then <expr>,
  Instrument then switch [
    Security then <expr>,
    Loan then <expr>
    ],
  ...

lolabeis commented 1 month ago

Currently I'm leaning towards the first option.

Agree with this.

With regards to the issue of redefining item, I think it's consistent with the nesting of list operators: to access a previously defined item, it must be named.

lolabeis commented 1 month ago

Also, with the switch syntax now redefined to align with the list operator syntax, all of the below should be allowed.

Direct attribute access (in line with simpler rule syntax):

asset switch
  Basket then basketAttribute -> ... ,
  Security then securityAttribute -> ... ,
  Loan then loanAttribute -> ...

Using default item:

asset switch
  Basket then item -> basketAttribute -> ... ,
  Security then item -> securityAttribute -> ... ,
  Loan then item -> loanAttribute -> ...

Using named item:

asset switch a [
  Basket then a -> basketAttribute -> ... ,
  Security then a -> securityAttribute -> ... ,
  Loan then a -> loanAttribute -> ...
  ]

lolabeis commented 1 month ago

But perhaps it's best to continue this discussion in a separate issue.

Agree, let's start a separate issue.

lolabeis commented 1 month ago

One "use case" I didn't add to the comparison, but which would have been useful, is how to go from a specific type to a choice type, e.g., given a function that accepts an Asset as input, and given a variable of type Basket, how do I call this function?

Current solution (one-of)

Need to wrap it in an Asset constructor.
ProcessAsset(Asset { basket: basket, ... })
This is cumbersome!

Using extends

Works out of the box:
ProcessAsset(basket)
Using the proposed choice types

Works out of the box. No need to wrap!
ProcessAsset(basket)

So you could pass a variable of type Basket to a function that takes Asset as input - this is cool!

lolabeis commented 1 month ago

I think supporting a "flat" switch, even for nested unions, will be the most easy to read and write

With the way you redefined the switch syntax, it allows you to do nesting more easily - See above ☝️.

Your flat switch suggestion works and is quite concise, but at the expense of introducing an ordering concern, as you point out. It involves a little magic, whereas the explicit switch nesting is more transparent.

lolabeis commented 1 month ago

Below are four steps to get us from the current state to full support for choice types.

Your implementation plan looks sensible. There is potentially a step 5, where we may be able to get rid of the one-of syntax (and consequently of ->>) altogether, if we manage to replace all occurrences using choice - TBD.

Oblongs commented 1 month ago

The inverse scenario, where an enum is also made available as a choice, exists as well.

We already have this enum (simplified):

enum currencyEnum:
   EUR
   GBP
   USD

It might be interesting to be able to say

enum currencyEnum as-choice Cash:
   EUR
   GBP
   USD

Of course, it would be possible to refactor currencyEnum into a choice data type with as-enum, but its primary use will be as an enumeration, with use as a choice data type only an edge case.

SimonCockx commented 1 month ago

Thanks for all of the feedback. Since there is a consensus for the initial plan (steps 1 and 2 of https://github.com/finos/rune-dsl/issues/747#issuecomment-2105050958), I will start development for those. Once we get to a stage where we can start the rest of the proposal, we can summarise and continue these threads in a separate issue.

finos / rune-dsl

Enhancements to one-of construct #747
