oslc-op / oslc-specs

OSLC OP specifications and notes
https://open-services.net/specifications/

Should we deprecate the "inverse" properties defined in RM 2.1? #82

Closed jamsden closed 5 years ago

jamsden commented 5 years ago

OSLC RM 2.0 defines a number of potentially redundant, inverse properties (e.g., satisfiedBy, satisfies). This is not recommended practice and creates a lot of potential confusion when configuration management and property ownership have to be considered. Should we deprecate these potentially redundant inverse properties in the OSLC RM 2.1 spec?

If so, which ones?

berezovskyi commented 5 years ago

Inverse properties are a recommended way to compute backlinks with OWL. I am against their removal.

Having said that, I suggest:

I think satisfies would be the primary link here and satisfiedBy the secondary (satisfiedBy owl:inverseOf satisfies, and its oslc:Property has oslc:readOnly true).
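To make the primary/secondary idea concrete, here is a minimal Python sketch. It is not taken from any OSLC spec or tool. Triples are plain tuples, and the `INVERSE_OF` table stands in for an inverse declaration between the two properties. The table, the function name `derive_backlinks`, and the `req:` identifiers are assumptions made up for illustration.

```python
# Hypothetical illustration: triples are (subject, predicate, object) tuples.
# INVERSE_OF stands in for a declared inverse between two properties.
INVERSE_OF = {"oslc_rm:satisfies": "oslc_rm:satisfiedBy"}


def derive_backlinks(stored):
    """Return the computed inverse triples for the stored primary links.

    The result is meant to be read-only: clients may see these triples,
    but only the primary link is ever written.
    """
    derived = set()
    for s, p, o in stored:
        inverse = INVERSE_OF.get(p)
        if inverse is not None:
            derived.add((o, inverse, s))
    return derived


stored = {
    ("req:2", "oslc_rm:satisfies", "req:1"),
    ("req:2", "dcterms:title", "Child requirement"),
}
backlinks = derive_backlinks(stored)
```

Only the satisfies triple is stored; the satisfiedBy triple is computed on read, which is roughly what a read-only secondary property would mean in practice.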

jamsden commented 5 years ago

I believe the OSLC link guidance has evolved to recommend no inverse properties and no inferencing, so owl:inverseOf would not be recommended. See file:///Users/jamsden/Developer/OSLC/oslc-op/oslc-specs/notes/link-guidance.html#Link.

The problem is that once configuration management is introduced, backlinks (i.e., inverses) cannot be specified and managed consistently across baselines. I thought we discussed a property shape that could define the recommended inverse label for a property, but I don't see that in the ResourceShapes spec.

berezovskyi commented 5 years ago

Yes, I remember the argument @ndjc made. That said, I think bidirectional links are crucial for traceability, and owl:inverseOf is the recommended way to produce inverse links for Semantic Web and Linked Data applications (I support that and disagree with the link guidance).

Maybe for CfgM we should consider alternative approaches.

@axelreichwein @jadelkhoury your thoughts?

jamsden commented 5 years ago

SPARQL makes querying a link type (property) in either direction trivial: just move the variable to the other end of the triple pattern. There's never a need to store the link twice. Tools that need to navigate in both directions should query for incoming links and display them as they wish. This is why we discussed having a suggested "inverse" name in the property shape, so tools would know what to display.
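As a rough illustration of querying one stored link in both directions, here is a small Python sketch over an in-memory list of triples. It assumes all triples live in one dataset. The `match` helper and the `req:` identifiers are made up for this example; in SPARQL the two calls would correspond to the patterns `<req:2> oslc_rm:satisfies ?x` and `?x oslc_rm:satisfies <req:1>`.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a variable."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Each link is stored exactly once, in the "satisfies" direction.
triples = [
    ("req:2", "oslc_rm:satisfies", "req:1"),
    ("req:3", "oslc_rm:satisfies", "req:1"),
]

# Outgoing: what does req:2 satisfy?
outgoing = match(triples, s="req:2", p="oslc_rm:satisfies")

# Incoming: which requirements satisfy req:1? Same predicate, variable moved.
incoming = match(triples, p="oslc_rm:satisfies", o="req:1")
```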

However, I think it's a mistake for tools to display incoming links as if the properties actually existed. This is confusing because when you do a GET on the resource, you don't see these made-up links. Also, traceability impact depends on the direction of the actual manifest/stored link, and this is lost if the tools make it appear that the backlinks actually exist.

IBM tools have a lot of experience with backlinks, none of it good. We should really not go there.

berezovskyi commented 5 years ago

> SPARQL makes querying a link type (property) in either direction trivial: just move the variable to the other end of the triple pattern. There's never a need to store the link twice.

I think we are running in circles here. We discussed this before and agreed that this is only true if all your data is in a single dataset (ignoring flaky support for federated queries; Linked Data Fragments might very well change that). If your data is split across tens or hundreds of OSLC servers, you need to be able to follow links.

> However, I think it's a mistake for tools to display incoming links as if the properties actually existed. This is confusing because when you do a GET on the resource, you don't see these made-up links.

I am actually proposing to return them in a GET response, see the point above about following the links as opposed to querying anything.

> IBM tools have a lot of experience with backlinks, none of it good. We should really not go there.

Would love to hear about it, because TRSing all data into a single triplestore is definitely not the answer for today's cloud architectures.

img commented 5 years ago

The assertion "s p o" need not be in the same named graph as other assertions about s. For example, the representation of o may include the triple "s p o". (And so could some other resource representation.)

Some deemed this shredding of the link assertions too complicated and so an approach was followed whereby for each p that represented a trace link, there was a dp (dual to p) so that one could assert "o dp s" rather than "s p o".

At no point was "p owl:inverseOf dp" formally declared, although that was the tacit assumption.

img commented 5 years ago

> > However, I think it's a mistake for tools to display incoming links as if the properties actually existed. This is confusing because when you do a GET on the resource, you don't see these made-up links.

> I am actually proposing to return them in a GET response, see the point above about following the links as opposed to querying anything.

> > IBM tools have a lot of experience with backlinks, none of it good. We should really not go there.

> Would love to hear about it, because TRSing all data into a single triplestore is definitely not the answer for today's cloud architectures.

Where is this proposal, @berezovskyi? If a GET on a resource MUST return a representation that includes all incident trace links, generating the representation of a resource is a distributed affair. That has implications for completeness and performance. There was a proposal in the RM workgroup (now lost and mostly forgotten) that offered a "link discovery" resource / Prefer header to determine if incident links were wanted.

jamsden commented 5 years ago

Unfortunately there's no free lunch here. If you want distributed data, but also configuration management (which has issues with redundant data), performance, ease of use, solid data integrity: you need efficient distributed query. This lets you store the links in one place, and broadcast a wide query to get (available) incoming links.

Distributed query is of course very hard to implement, cannot be reliable and would introduce significant performance issues.
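To show what "broadcast a wide query to get (available) incoming links" could look like, and why the answer may be incomplete, here is a minimal Python sketch. Everything in it is hypothetical: the servers are plain callables, the `incoming_links` function is invented for this example, and no real OSLC query API is used.

```python
def incoming_links(servers, predicate, target):
    """Ask every server for triples of the form (?, predicate, target).

    `servers` maps a server name to a callable that returns that server's
    triples, or raises ConnectionError when the server cannot be reached.
    Returns (links, unreachable) so the caller can tell the answer is partial.
    """
    links, unreachable = [], []
    for name, fetch in sorted(servers.items()):
        try:
            triples = fetch()
        except ConnectionError:
            unreachable.append(name)
            continue
        links.extend(t for t in triples if t[1] == predicate and t[2] == target)
    return links, unreachable


def qm_server():
    # The link is stored once, on the server that owns the link source.
    return [("test:7", "oslc_qm:validatesRequirement", "req:1")]


def cm_server():
    raise ConnectionError("cm server offline")


links, unreachable = incoming_links(
    {"qm": qm_server, "cm": cm_server},
    "oslc_qm:validatesRequirement",
    "req:1",
)
```

The `unreachable` list is the point of the sketch: any incoming link stored on a server that did not answer is silently missing from `links`.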

An integrated data lake fed by TRS providers is another solution that eliminates the distributed query, but it has other data integrity problems because of skipped resources.

If Andrew has a better solution in mind, I'm all ears.

jamsden commented 5 years ago

But the pressing issue here is there is an inconsistency between the RM 2.0 specification and the RM 2.0 published vocabulary on open-services.net. The RM 2.0 spec has a number of these "dual properties", but the RM 2.0 RDF vocabulary doesn't. The question is: what do we do with these dual properties in RM 2.1? Remove them? Add them to the vocabulary but mark them as deprecated? Just add them to the vocabulary?

Note that just adding them to the vocabulary does not follow OSLC recommended link guidance which recommends against having redundant backlinks.

jamsden commented 5 years ago

From Ian: The issue is that neither of these terms is a “backlink” – they are dual or inverse to each other.

IBM apps use implementedBy, validatedBy etc as backlinks, so those are the ones that IBM already considers deprecated – we don’t use them.

So the ones to be deprecated, based on my experience anyway, are:

- implementedBy: dual is in CM
- validatedBy: dual is in QM
- satisfiedBy: dual is in RM
- decomposedBy: dual is in RM
- elaboratedBy: dual is in RM
- specifiedBy: dual is in RM
- trackedBy: dual is in CM, I think
- affectedBy: dual is in CM, I think

We’d want to avoid deprecating a term that doesn’t have a dual/inverse defined in one of the OSLC domains.

berezovskyi commented 5 years ago

> IBM apps use implementedBy, validatedBy etc as backlinks, so those are the ones that IBM already considers deprecated

Let's start by deprecating those, then. Could you make a PR, please? @ndjc, do we have a wiki page somewhere on how to deprecate properties?

jamsden commented 5 years ago

I would like to think about and discuss this some more. The fact that we define these "inverse" properties doesn't mean any server has to support them. Servers could choose to implement either property, but following the link guidance, not both.

Consider three different vendors providing RM and QM tools. Vendor A uses traditional waterfall development, starting with requirements and then developing the test cases. They might choose to use requirement validatedBy test case links, and their implementations could store the link in the RM server (the server managing the link source resource).

Vendor B could use a more test-driven development approach, starting with test cases, and then summarizing the requirements they validate. They might prefer testCase validatesRequirement requirement and store the link in the QM server.

Vendor C could support both, using a mediation pattern to store whatever link the user provided in a separate link manager.

So maybe having both of these properties isn't really inconsistent with link guidance.
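A small Python sketch of how a client might cope with the vendor A and vendor B choices above: it rewrites either stored form into one canonical direction before comparing. The `DUAL_OF` table reflects the tacit assumption that validatedBy and validatesRequirement are duals. The table, the `canonical` function, and the `req:`/`test:` identifiers are all hypothetical.

```python
# Hypothetical table: a "By" property mapped to the property treated as its dual.
DUAL_OF = {"oslc_rm:validatedBy": "oslc_qm:validatesRequirement"}


def canonical(triple):
    """Rewrite a dual assertion (o dp s) into the primary form (s p o)."""
    s, p, o = triple
    if p in DUAL_OF:
        return (o, DUAL_OF[p], s)
    return triple


# Vendor A stores the link in the RM server, on the requirement.
vendor_a = ("req:1", "oslc_rm:validatedBy", "test:7")

# Vendor B stores the link in the QM server, on the test case.
vendor_b = ("test:7", "oslc_qm:validatesRequirement", "req:1")

same_link = canonical(vendor_a) == canonical(vendor_b)
```

Each vendor still stores only one triple per link, so neither is keeping a redundant backlink; the two properties only meet on the client side.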

berezovskyi commented 5 years ago

No deprecation will take place because 0..* cardinality is non-binding as Jim says.

Actual work is happening in #119