w3c / market-data-odrl-profile

Rights Automation for Market Data W3C Community Group

Distinguishing between Assets, Resources, & Sources #3

Closed benedictws closed 4 years ago

benedictws commented 4 years ago

Three questions to consider

  1. ODRL defines Assets. It does not define Resources, or Sources. Do we need these terms?
  2. Does a Resource maintain its identity under simple transformations?
  3. If so, what is a simple transformation?

Defining an asset in the context of ODRL is easy.

An asset is "a resource or a collection of resources that are the subject of a rule" (a rule being a permission, a prohibition, or a duty).

So an asset is a resource, resources, or the part of a resource controlled by a rule.

An asset can be complex. It may be real-time, or delayed; it may offer a full order book or a shallow one; it may just be French. In the language of ODRL, an asset can have constraints which describe exactly which "version" or "part" of the resource is controlled by a rule.

So what is a resource?

A resource is some commodity (e.g. API calls, bandwidth, data) access to which (or to some part of which) may be controlled.

Example: In the context of market data, a resource could be all the pricing and trading data generated by the trade in one or more financial instruments on an exchange, or indices derived from that data.

But resources can be complex too. In the context of a supply chain, your asset is my resource, to slice and dice into a new asset as I see fit and profitable.

Scope note: Like Assets, Resources are sensitive to their means of delivery and informational completeness. A real-time version of that data may be a different Resource than the delayed. Book depth, for example, can distinguish between Resources.

So the only distinction between an asset and a resource is that the former sits at the top of the constraint hierarchy.

We can take a step towards this idea by saying an Asset is a sub-class of Resource, i.e. a particular type of Resource.

This means that the definitions of both the asset and the resource can change along the supply chain. So will their names and identifiers. Of course, new resources must retain their link to the resources they constrain. But how can we anchor this chain on something invariant and universal?

Doing so helps us answer some key questions:

  1. How do I check which rules apply to any specific item of data?
  2. How can I connect the supplier rules I exercise to the customer rules I offer?
  3. How can I generate a catalogue of rules and resources that can be compared with other catalogues? One that makes sense.

To do so, we need invariants which stay the same through each stage of distribution (and commercialisation). In a sense, these are close at hand. We can talk of "Xetra", of the "Eurodollar Futures Contract", or of the "CAC 40". But what are these?

Well, we've got a venue, a financial instrument, and an index. But I think they are also our anchor resources. Whether it's realtime or delayed, it's still Xetra. Whether it's the index value or the constituent weightings, it's still the CAC 40. Perhaps we should give them a special name: Sources.

So what is a Source?

A Source is a root Resource: unconstrained and informationally complete, independent of the aspects of commercialisation and/or delivery including timeliness, method of delivery, or book depth.

If we have our invariants, all we need now are universal identifiers for them. Only one actor in the data supply chain can sensibly provide these: the data originators.

(It would be great if these could be (dereferenceable) URIs rather than strings or venue-dependent codes!)

There is one other aspect of resources that we should also clarify: does a resource remain the same resource (and therefore controlled by the same rules) if it undergoes a simple transformation, say changing the pricing data from dollars to euros?

If so, what are the criteria for being a simple transformation? Reversible and substitutive?

A resource can be transformed through some calculation but still remain the same resource. It only becomes a new resource if the transformation is irreversible (i.e. the original data cannot be recovered) and non-substitutive (i.e. the altered data cannot be used in place of the original).
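The criterion above can be sketched as a small predicate. This is only an illustration: the `Transformation` class, its field names, and the classification of an index calculation as irreversible and non-substitutive are assumptions, not part of ODRL or the profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformation:
    """Hypothetical description of a transformation applied to a resource."""
    name: str
    reversible: bool    # can the original data be recovered?
    substitutive: bool  # can the output be used in place of the original?


def creates_new_resource(t: Transformation) -> bool:
    """Apply the rule as worded above: a new resource arises only when the
    transformation is both irreversible and non-substitutive."""
    return (not t.reversible) and (not t.substitutive)


# Currency conversion is the example given in the text.
currency_conversion = Transformation("USD to EUR", reversible=True, substitutive=True)
# Assumed example: an index value does not let you recover or replace the inputs.
index_calculation = Transformation("index calculation", reversible=False, substitutive=False)

print(creates_new_resource(currency_conversion))
print(creates_new_resource(index_calculation))
```

Note that, as worded, a transformation that fails only one of the two tests still leaves the resource unchanged; whether that is the intended reading is one of the open questions here.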

joshcornejo commented 4 years ago

IMHO:

There is one other aspect of resources that we should also clarify: does a resource remain the same resource (and therefore controlled by the same rules) if it undergoes a simple transformation, say changing the pricing data from dollars to euros?

If so, what are the criteria for being a simple transformation? Reversible and substitutive?

A resource can be transformed through some calculation but still remain the same resource. It only becomes a new resource if the transformation is irreversible (i.e. the original data cannot be recovered) and non-substitutive (i.e. the altered data cannot be used in place of the original).

I would think that an irreversible transformation is one that alters the value of the resource (whether that value is notional or absolute), e.g. real-time versus delayed data for decision making. We could define which properties of a resource give it value and what proportion of the value comes from each property, for example timeliness. That proportion could be absolute (holding across any transformation) or relative (from the resource's current ‘shape’ onwards). We could then compare resources to work out their differences in value. The value grouping itself could also carry a property, acting as a constraint, that determines when the value is allowed to change or lapse.

markabird commented 4 years ago

the only distinction between an asset and a resource is that the former sits at the top of the constraint hierarchy.

This use of the terms Resource and Asset seems to be at odds with their descriptions in the New Terms wiki: https://w3c.github.io/market-data-odrl-profile/NewTerms.html. There, we read:

A resource is independent of the aspects of commercialisation and/or delivery including timeliness, method of delivery, or book depth. These properties belong to the Asset. Whether it’s realtime or delayed data from a venue, it’s the same Resource but a different Asset.

As defined here, an Asset is a refinement of a Resource, and therefore can't be at the top of the hierarchy. I agree, though, that having two terms for the same item depending on where and how we're referring to it in the supply chain is important, but I think possibly we need two sets of terms.

markabird commented 4 years ago

Reading further, I see you've addressed my above comment by defining Source. Source is the "informationally complete" Resource at the top of the hierarchy before it's been refined.

markabird commented 4 years ago

Along with transformation, another aspect we need to consider is "quality loss." Some examples:

None of these actions qualify as transformation, I don't think. But they may result in licensing consequences.

One helpful way to think about it is that there may be one license needed to receive an asset, but a separate license to actually do something with it, like redistribute. This means that I can receive one Asset, and redistribute a different Asset, even without transformation being involved.

markabird commented 4 years ago

What are the criteria for being a simple transformation?

Reversible and substitutive is the standard throughout the industry, and works pretty well for something so nebulous. It unfortunately has a bit of an "I know it when I see it" quality, which means it will be hard to codify. Maybe trying to define all the things that can happen to an Asset without breaking its connection to its base Resource (currency conversion, normalization, etc.) wouldn't be impossible.

Then, anything that doesn't fall into one of those defined actions would default to being considered complex transformation.
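The "defined list, otherwise complex" idea could look something like the sketch below. The set name, the function name, and the string labels are hypothetical; the list is seeded only with the two actions named above.

```python
# Hypothetical allow-list, seeded with the two examples named in the thread.
SIMPLE_TRANSFORMATIONS = {"currency conversion", "normalization"}


def classify(action: str) -> str:
    """Return "simple" for listed actions; everything else defaults to "complex"."""
    if action.strip().lower() in SIMPLE_TRANSFORMATIONS:
        return "simple"
    return "complex"


print(classify("Currency conversion"))
print(classify("index calculation"))
```

The point of the default is that the list only ever has to enumerate the safe cases; anything new is treated as having licensing consequences until someone adds it.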

primell commented 4 years ago

I think Resource and Asset are quite different. Or at least I feel we need something that describes the tangible thing, independent of its timeliness of delivery for example to another party, and if it’s not Asset then it should be Resource.

The Asset (described by elements of the rule around the Resource) then is a refinement of the Resource. Sometimes this refinement may be used to transform the Resource into a new Resource with those changed qualities/characteristics/values baked in somehow and/or described in its own meta-data.

For example delay times specified in a rule and relating to consumption or distribution rights may apply to market data received without delay. The same specified delay times and rights may apply to market data already delayed by the time of receipt. The Asset (a measure of the value to supplier and customer) around the Resource is the same in both cases and the Resource is the same.

The tangible Resource actually received may look the same in both cases above, or it might be different in some way (in format, in name or in other meta-data). The Asset is again the same in both cases but the Resource is now different. These are both real-world use cases today.

benedictws commented 4 years ago

I think we have the kernel of agreement here on the definition of Assets, Resources, and Sources.

We start with the Source which is informationally complete and offers the full and original value of the data. Let's call it S1.

The Originator of this data sells a version of this data to a Consumer by creating an Asset with a Constraint (C1 - say delayed by 15 minutes).

So: S1 (the resource) -> (is licensed as the asset) S1.C1.

Our Consumer then acts as a Provider and decides to add an additional constraint on the data C2 - say a book depth of just 1.

S1.C1 (the resource) -> (is licensed as the asset) S1.C1.C2

I'm suggesting that what is to the left of the arrow is a Resource. What is to the right is the Asset (controlled by a Rule).

This models the idea that resources and assets are relational: your asset is my resource. To the Originator, S1.C1 is the Asset and S1 is the Resource.

Likewise, to the Consumer S1.C1 is the Asset. Consumers don't create resources.

To the Provider S1.C1.C2 is the Asset. S1.C1 is the Resource.

To all, S1 remains the Source. It is invariant.
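As a rough sketch of the S1, S1.C1, S1.C1.C2 chain, assuming a hypothetical `Node` class in which each asset simply remembers the resource it was licensed from:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """A Source (no parent) or a constrained version of a parent node."""
    label: str
    parent: Optional["Node"] = None

    def constrain(self, constraint: str) -> "Node":
        # Licensing a resource as an asset adds one constraint to it.
        return Node(f"{self.label}.{constraint}", parent=self)

    @property
    def resource(self) -> Optional["Node"]:
        # The resource this asset was licensed from (left of the arrow).
        return self.parent

    @property
    def source(self) -> "Node":
        # Walk up the chain to the invariant root.
        node = self
        while node.parent is not None:
            node = node.parent
        return node


s1 = Node("S1")
originator_asset = s1.constrain("C1")              # S1 -> S1.C1
provider_asset = originator_asset.constrain("C2")  # S1.C1 -> S1.C1.C2

print(provider_asset.label)
print(provider_asset.resource.label)
print(provider_asset.source.label)
```

Here "asset" and "resource" are not separate types: the same node is the Originator's asset and the Provider's resource, while `source` returns S1 from any point in the chain.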

Does this summarise the discussion? @primell - does this work for your use cases?

benedictws commented 4 years ago

On the question of transformations: let's start with the class of transformations that do calculations on the data - Derivations.

(I'm putting simple operations like currency conversions and the normalisation of data in scope.)

It's clear that many derivations have an impact on licensing. But I'm interested in the sub-set that doesn't. Why? Because defining that subset may tell us when:

@markabird confirms that Reversible and Substitutive are standard criteria, but also points out that there is an "I know it when I see it" quality to the decision. I think he wisely suggests that we collect examples. So beyond currency conversions and normalisation, what should we list?

primell commented 4 years ago

Does this summarise the discussion? @primell - does this work for your use cases?

Yes, I think it can work for those use cases.

There is one question I have … the role of a Provider being dependent on receipt of the Resource or Asset being always modelled as delivered to a Consumer first? I think it should be just an Originator to Provider relationship? The same parties can also have a different Originator to Consumer relationship with different terms. Did I understand you correctly there?

nvar commented 4 years ago

Usually, a Resource is described as the thing that is uniquely identified. All Entities are Resources, so in that sense, e.g., the Crude Oil Future Dec2020 is a Resource. Are we saying that this will be considered a Source and not a Resource? Also, for transformations, maybe we can use the "is derived from" relationship from Resource to Resource or from Asset to Asset to describe the provenance/lineage?

markabird commented 4 years ago

beyond currency conversions and normalisation, what should we list?

"Co-mingling" might be on the list. Combining two or more assets in their entirety into a single distribution channel would leave the underlying rights of the original assets intact even as I've created a new product.

markabird commented 4 years ago

beyond currency conversions and normalisation, what should we list?

Symbology probably fits in here. If I just change how an instrument is identified but nothing else, the original rights to the asset are maintained even as I've added value.

nvar commented 4 years ago

@benedictws some examples of what can be defined as a Resource or an Asset. I think this matches what you describe above, but just to cross-check:

Resource: anything that can be uniquely identified. E.g. any Instrument, any Commodity, any Index, any Benchmark will be the same Resource throughout its lifecycle. For example, a Future Crude Oil Contract Dec 2020 will always be the same Resource R1.

Asset: a Resource with or without additional information or constraints. E.g. four Assets on the same Resource: A1 = R1; A2 = R1 with real-time market price; A3 = R1 with delayed market price; A4 = R1 with order book depth 10.

The fact that the Resource is the same throughout the lifecycle provides consistency and integration; the fact that an Asset can be anything that is processed through a rule gives us flexibility.

If a Consumer wants to add restrictions or more information to an Asset it consumes, it can create a new Asset on the same Resource. So:

Asset --> Consumer --> Asset
Asset --> Provider --> Asset

And depending on per case basis, based on transformation/ derivation rules of the information of the Resource, a new Resource can be created and connected with isDerivedFrom for provenance/lineage if needed.
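A minimal sketch of how an isDerivedFrom link could be followed for lineage. The identifiers R1 to R3 and the dictionary representation are made up for illustration; the thread only proposes the relationship itself.

```python
# Hypothetical lineage map: each key "is derived from" its value.
IS_DERIVED_FROM = {
    "R3": "R2",
    "R2": "R1",
}


def lineage(resource_id: str) -> list[str]:
    """Follow isDerivedFrom links back to the resource with no recorded parent."""
    chain = [resource_id]
    seen = {resource_id}
    while chain[-1] in IS_DERIVED_FROM:
        parent = IS_DERIVED_FROM[chain[-1]]
        if parent in seen:
            # Guard against malformed data that loops back on itself.
            raise ValueError(f"cycle in lineage at {parent}")
        chain.append(parent)
        seen.add(parent)
    return chain


print(lineage("R3"))
```

The last element of the returned list is the resource with no recorded parent, which is where a Source would sit if the chain is complete.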

benedictws commented 4 years ago

There is one question I have … the role of a Provider being dependent on receipt of the Resource or Asset being always modelled as delivered to a Consumer first? I think it should be just an Originator to Provider relationship? The same parties can also have a different Originator to Consumer relationship with different terms. Did I understand you correctly there? - @primell

I'm trying to capture what role an organisation plays rather than describe what they actually are.

The trouble with the latter approach is that organisations are often Originators, Providers, and Consumers. Which one just depends on the use case.

Focusing on roles however presents two possibly counter-intuitive results:

  1. A vendor plays the role of Consumer when receiving content
  2. A bank plays the role of Provider when passing on content internally

But what's counter-intuitive one day might appear insightful the next. Compliance becomes quite natural to define in these two cases:

  1. A vendor must ensure that the policies it receives as a Consumer are compliant with those it offers as a Provider
  2. Banks are free to provide content internally under new policies so long as those policies are compliant with those they receive as Consumers.

Somewhere here there is a paradigm shift. But I think it's a useful one.

benedictws commented 4 years ago

Summarising the discussion:

An Asset is a Resource or Source, a collection of Resources and/or Sources, or the part of a Resource/Source controlled by a Rule. (A Rule being a Permission, a Prohibition, or a Duty).

A Resource is some commodity (e.g. API calls, bandwidth, data) access to which (or to some part of which) may be controlled by a Rule.

Example: In the context of market data, a Resource could be all the pricing and trading data generated by the trade in one or more financial instruments on an exchange, or indices derived from that data.

Editorial Note: Resources are sensitive to their means of delivery and informational completeness. A real-time version of that data may be a different Resource than the delayed. Book depth too can distinguish between Resources.

Editorial Note: A Resource can be transformed through some calculation but still remain the same Resource. It only becomes a new Resource if the transformation is irreversible (i.e. the original data cannot be recovered) and non-substitutive (i.e. the altered data cannot be used in place of the original).

Example: The operation of currency conversion does not change the identity of a Resource

Example: Augmenting a Resource with additional symbology does not change the identity of that Resource

Example: Co-mingling Resources does not change the identities of the combined Resources

Example: Normalising a Resource does not change the identity of that Resource

A Source is a root Resource: unconstrained and informationally complete, independent of the aspects of commercialisation and/or delivery including timeliness, method of delivery, or book depth.

joshcornejo commented 4 years ago

@benedictws maybe 'transformation' on asset is what makes a resource?

Example:

So the relationships can be:

A transformation has a property with two possible values:

A sample of isomorphic transformations: S ⇔ T1.S = R1 ⇔ T2.1 = R2 .... etc

benedictws commented 4 years ago

Examples of Sources:

Xetra

:S1     a                   md:Source ;
        rdfs:label          "Xetra" ;
        dc:description      "Market data for German and international instruments traded on the Xetra and Frankfurt Stock Exchange" ;
        md:originator       <https://permid.org/1-4298007872> ; # Identifier for DBAG
        md:complexID        [  rdf:type         md:ComplexID ;
                               md:venue         [   rdf:type        md:Venue ;
                                                    rdfs:label      "Xetra" ;
                                                    md:operatingMic "XETR" ;
                                                    md:mic          "XETR"
                                                ] ;
                               dc:identifier   "Xetra"
                            ] ;
        md:contentNature    md:Dynamic .

The Eurodollar Futures Contract

:S1     a                   md:Source , dcat:Dataset ;
        rdfs:label          "Eurodollar Futures Contract" ;
        md:originator       <https://permid.org/1-4295899615> ; # Identifier for CME
        md:complexID        [   a               md:ComplexID ; 
                                md:context      [   a                       md:Venue ;
                                                    rdfs:label              "Globex" ;
                                                    md:operatingMic         "XCME"^^xsd:string ;        
                                                    md:mic                  "GLBX"^^xsd:string ; 
                                                ] ;
                                dc:identifier   "GE"^^xsd:string
                            ] ;
        md:assetClass           md:Derivatives ;
        md:contentNature        md:Dynamic .
benedictws commented 4 years ago

Examples of Resources and Assets:

End-of-day Eurodollar Futures Contract

:R1     a                       md:Resource , dcat:Dataset ;
        rdfs:label              "End-of-day Eurodollar Futures Contract" ;
        md:provider             <https://permid.org/1-4295899615> ; 
        md:resource             :S1 ;  
        md:assetClass           md:Derivatives ;
        md:contentNature        md:StaticEOD ;
        md:timelinessOfDelivery [   a                   time:ProperInterval , md:Embargoed ;
                                    time:after          [   a                   time:Instant, md:MarketClose ;
                                                            time:inDateTime     [   a               time:DateTimeDescription ; # Monday to Friday?
                                                                                    time:hour       "16"^^xsd:int ;
                                                                                    time:timeZone    <https://www.wikidata.org/wiki/Q2086913>
                                                                                ]
                                                        ]
                                ] .

Xetra Ultra

:A1     rdf:type                   odrl:Asset ;
        rdfs:label                 "Xetra Ultra" ;
        md:resource                :S1 ;
        md:timelinessOfDelivery    [  rdf:type             time:ProperInterval , md:Realtime ;
                                      time:intervalBefore [  rdf:type    time:ProperInterval ;
                                                             md:timeReference  time:Instant , md:TimeOfIssue ;
                                                             time:hasXSDDuration "PT15M"^^xsd:duration
                                                          ]
                                   ] ;
        md:depthOfMarket           [  rdf:type             md:Level2 ;
                                      md:positionFrom      1 ;
                                      md:positionTo        10 ;
                                   ] .
benedictws commented 4 years ago

@joshcornejo - I shall check with a friendly mathematician if the concept of isomorphism helps us here! But yes, I think it's what we're after.

joshcornejo commented 4 years ago

Just in case, my proposition is based on isomorphism of manifolds:

As a graph (at its simplest: the mapping between R1 ⇔ R2) can be considered a discrete approximation to a manifold, we can use the property of "isomorphism of manifolds" for our benefit.

benedictws commented 4 years ago

Now captured in the profile: https://w3c.github.io/market-data-odrl-profile/md-odrl-profile.html#Resources