w3c-ccg / traceability-vocab

A traceability vocabulary for describing relevant Verifiable Credentials and their contents.
https://w3id.org/traceability
Other
35 stars 35 forks source link

use inheritance not aggregation #277

Closed VladimirAlexiev closed 10 months ago

VladimirAlexiev commented 2 years ago

Speaking from an ontological or object-oriented point of view, AgProduct.yml should be a subclass of Product.yml, rather than pointing to it (using prop product).

Currently you have fields at two different levels, eg AgProduct.gtin but AgProduct.product.sku: can you provide a rationale for that?


This is just an example. I'm sure that a closer examination will show many other cases of inheritance (specialization) that are improperly represented as aggregation.

VladimirAlexiev commented 2 years ago

@OR13 @nissimsan

Say you have a class AquacultureProduct that wants to reuse props from 2 others:

That's multiple inheritance, which is perfectly ok in RDF. You should be able to use the props directly: AquacultureProduct.plotID, AquacultureProduct.vesselID

With the aggregation approach, what would you do? AquacultureProduct.product1.plotID, AquacultureProduct.product2.vesselID ? Makes no sense.

Even if you use more meaningful props for the aggregation; AquacultureProduct.agProduct.plotID, AquacultureProduct.marineProduct.vesselID it still makes little sense.

OR13 commented 2 years ago

we build the context from JSON Schema. but we don't yet have the ability to handle allOf and anyOf... You can solve this problem by creating a new type for AquacultureProduct and explicitly setting its JSON-LD properties.

Great example btw.

VladimirAlexiev commented 2 years ago

For a schema of that size you need inheritance. I think you need to build it in your tools.

Rather than AgProduct ... include Product.yml you need some sort of declaration AgProduct inherits Product, which would mix all Product fields into AgProduct. Does JSON schema have such capability?

OR13 commented 2 years ago

Does JSON schema have such capability?

it does, the problem is that the "context compiler" is not yet aware of how to build nested contexts that respect inheritance.

Which is a major reason why you see so many type definitions that are very close, with repeated definitions or properties.

VladimirAlexiev commented 2 years ago

@OR13 gimme a link please.

I'm still learning JSON Schema, eg just learned it's evolving and there are 4 drafts in use (thought it's like FOAF, stuck in 0.1 forever)

OR13 commented 2 years ago

https://github.com/transmute-industries/verifiable-data/blob/main/packages/jsonld-schema/src/schemasToContext.ts

OR13 commented 2 years ago

Before we keep hacking at this, it think we should pause and take a look at what LinkML will give us out of the box.

mkhraisha commented 2 years ago

This is something that I found quite difficult to reconcile and work with in general, as Orie noted though theres just a huge number of issues right now, but it would absolutely by ideal if we can do what you're describing

VladimirAlexiev commented 2 years ago

@OR13 Sorry, I meant where the JSON Schema spec defines inheritance?

OR13 commented 2 years ago

Open API Spec v3: https://swagger.io/specification/

JSON Schema: https://json-schema.org/understanding-json-schema/reference/combining.html

(allOf / anyOf) relate to interitance)

VladimirAlexiev commented 2 years ago

So what should one do to implement a multi-parent hierarchy C1<C3, C2<C3, C3<C4?

C1: allOf [C1_own_fields]
C2: allOf [C2_own_fields]
C3: allOf [C1, C2, C3_own_fields]
C4: allOf [C3, C4_own_fields]

Note: GraphQL also doesn't have inheritance but has Interfaces. So to implement inheritance, you need to collect all fields from all interfaces implemented by a class, and copy them to the class definition.

OR13 commented 2 years ago

yes, typescript also has interface merging: https://www.typescriptlang.org/docs/handbook/declaration-merging.html

I am not sure if you are asking this, but I don't think JSON Schema supports multiple inheritance... very few languages do.

I think generally for data modeling, you have to balance design principles, sometimes DRY is less important than readability, or audit-ability.... In the blockchain / smart contract space.... inheritance has been the source of many extremely painful bug bounties, since it fragments security concerns across many separate files.

That being said, I think there are cases where inheritance can help or is the most natural approach.

The current solution we have is bound by JSON Schema... which may be limiting in ways that feel painful at first... I am not sure if that pain is actually a helpful reduction of complexity, or a harmful reduction of applicability.

I think this is why we need to keep our minds open to things like LinkML, etc... but we also have to be careful not to take a dependency that will prevent effective contribution / collaboration.

inheritance can create such scenarios, especially with more junior contributors.

VladimirAlexiev commented 2 years ago

allOf is an array, so any number of parents should work. I'm talking data not method inheritance , so I'm not sure what security problems that could bring. How is AgProduct.product.weight more secure than AgProduct.weight ?

OR13 commented 2 years ago

-1 to complex inheritance... we should strive for simple JSON Schema types as RDF Classes, complexity harms usability.

BenjaminMoe commented 2 years ago

Agree with @OR13. Our focus should be on simple JSON schema types.

VladimirAlexiev commented 1 year ago

@OR13 @BenjaminMoe I think the overriding concern is that it should be simple for data producers and consumers, i.e. the data instances should be as simple as possible. Following my example, it's clear which one is simpler:

Aggregation forces the user to have extra sub-objects (sub-nodes), inheritance does not.

@OR13 you talk of "DRY", claim that inheritance makes things more complex, and claim that inheritance causes "painful bug bounties". Please give some examples to substantiate those claims.

If JSON Schema implements inheritance through allOf and that's an array, then multiple inheritance is not any more complex than single inheritance.

brownoxford commented 1 year ago

@VladimirAlexiev we would like to move this towards an actionable conclusion - can you please put forth a proposal?

OR13 commented 1 year ago

We're not currently planning on changing the schema processing logic to make it aware of inheritance.

I suggest we reformulate the ask in the shape of a feature request of the jsonld-schema package, which we are using internally here.

TallTed commented 1 year ago

I think that what's being called "inheritance" here is what (in the RDF/JSON-LD world) is handled as {sub,super}-classes and/or {sub,super}-properties. Requires RDF inference/reasoning somewhere in the tool chain. That probably makes more sense as a path forward...

nissimsan commented 1 year ago

We're missing the skills to progress this. Marking pending close.