edmcouncil / fibo

The Financial Industry Business Ontology (FIBO) defines the sets of things that are of interest in financial business applications and the ways that those things can relate to one another. In this way, FIBO can give meaning to any data (e.g., spreadsheets, relational databases, XML documents) that describe the business of finance.
https://spec.edmcouncil.org/fibo/
MIT License

Remote rdfs:subClassOf triples #874

Closed mereolog closed 3 years ago

mereolog commented 4 years ago

FIBO contains a number of remote rdfs:subClassOf triples, i.e., cases where all three of the following statements are present:

class1 rdfs:subClassOf class2.
class2 rdfs:subClassOf class3.
class1 rdfs:subClassOf class3.

For example:

NationalSecurityIdentificationScheme rdfs:subClassOf SecurityIdentificationScheme
SecurityIdentificationScheme rdfs:subClassOf RegistrationScheme
NationalSecurityIdentificationScheme rdfs:subClassOf RegistrationScheme

In such cases the last statement can be inferred from the others and as such is not needed.

Although there is nothing formally wrong with such statements, they clutter the ontology unnecessarily and make it more expensive to maintain. They also violate the DRY principle. Finally, they affect visualisation, as the viewer displays more parents than needed from the formal point of view.

Here is the list of all such cases in the current master: remote_subclassofs_20200212.xlsx

They were collected by running the following SPARQL query over the aggregate of the RDF graphs from all RDF files in the fibo repo:

SELECT ?sub ?super
WHERE {
  ?sub rdfs:subClassOf ?super .
  ?sub rdfs:subClassOf/rdfs:subClassOf+ ?super .
}
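For readers who want to see the logic of the query spelled out, here is a minimal Python sketch of the same check over an in-memory set of (sub, super) pairs. It is not how the list above was produced; the function name `remote_subclass_triples` and the tuple representation are assumptions made for illustration.

```python
def remote_subclass_triples(triples):
    """Return asserted (sub, super) pairs that are also reachable from sub
    through a chain of two or more asserted subClassOf steps."""
    parents = {}
    for sub, sup in triples:
        parents.setdefault(sub, set()).add(sup)

    def ancestors(cls):
        # Every class reachable from cls in one or more subClassOf steps.
        seen = set()
        stack = list(parents.get(cls, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(parents.get(node, ()))
        return seen

    redundant = set()
    for sub, sup in triples:
        # One step to an intermediate class, then one or more steps to sup.
        if any(sup in ancestors(mid) for mid in parents.get(sub, ())):
            redundant.add((sub, sup))
    return redundant


example = {
    ("NationalSecurityIdentificationScheme", "SecurityIdentificationScheme"),
    ("SecurityIdentificationScheme", "RegistrationScheme"),
    ("NationalSecurityIdentificationScheme", "RegistrationScheme"),
}
print(remote_subclass_triples(example))
```

On the example from this issue, only the NationalSecurityIdentificationScheme / RegistrationScheme pair is reported.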

Perhaps it would make sense to add this to the hygiene tests - not as a check that throws errors but maybe as a warning check. After all, there might be an informal reason for keeping a remote triple.
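As a rough illustration of the "warning, not error" idea, the sketch below reports each remote triple with a WARN prefix and lets only ERROR messages fail the run. The message prefixes and both function names are hypothetical and are not taken from the actual FIBO hygiene test setup.

```python
def remote_subclass_warnings(redundant_pairs):
    """Turn (sub, super) pairs into warning messages, in a stable order."""
    return [
        f"WARN: {sub} rdfs:subClassOf {sup} can be inferred from other subclass statements"
        for sub, sup in sorted(redundant_pairs)
    ]


def hygiene_exit_code(messages):
    """Fail (1) only when some message is an error; warnings alone pass (0)."""
    return 1 if any(m.startswith("ERROR:") for m in messages) else 0


messages = remote_subclass_warnings({("A", "C")})
for message in messages:
    print(message)
print("exit code:", hygiene_exit_code(messages))
```

With this split, a remote triple that is kept for an informal reason shows up in the report without blocking a build.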

dallemang commented 4 years ago

We have a number of other DRY principles we enforce in FIBO using hygiene tests (e.g., don't refer to owl:Thing as a domain or range). We do intentionally violate DRY from time to time, but only if there is some expository reason to do so (I can't actually remember such a case; I just remember that there has been one).

This makes sense to me, and like you say, it is easy to fix.

Do keep in mind that there could be modularity issues; e.g., A subClassOf B could be in one ontology, B subClassOf C in another, but we cannot necessarily expect that second ontology to be imported, so we assert the apparently redundant A subClassOf C to reduce the dependency of the first ontology on the second. I don't know if this really happens, but it could.
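One way to take this modularity concern into account is to judge each asserted triple only against the statements visible from the ontology that asserts it, that is, the ontology itself plus everything it imports transitively. The following is a sketch under that assumption; the data layout (`asserted` as a dict from ontology name to pairs, `imports` as a dict from ontology name to imported names) and the function names are hypothetical.

```python
def visible_ontologies(onto, imports):
    """The ontology itself plus everything it imports, directly or transitively."""
    seen = {onto}
    stack = [onto]
    while stack:
        for nxt in imports.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def redundant_within_imports(asserted, imports):
    """Return (ontology, sub, super) for triples inferable from visible statements."""
    result = set()
    for onto, triples in asserted.items():
        parents = {}
        for other in visible_ontologies(onto, imports):
            for sub, sup in asserted.get(other, ()):
                parents.setdefault(sub, set()).add(sup)
        for sub, sup in triples:
            # Start two steps away from sub, then keep walking upwards.
            stack = [g for mid in parents.get(sub, ()) for g in parents.get(mid, ())]
            seen = set()
            while stack:
                node = stack.pop()
                if node == sup:
                    result.add((onto, sub, sup))
                    break
                if node not in seen:
                    seen.add(node)
                    stack.extend(parents.get(node, ()))
    return result


asserted = {"onto1": {("A", "B"), ("A", "C")}, "onto2": {("B", "C")}}
print(redundant_within_imports(asserted, {}))                     # onto2 not imported
print(redundant_within_imports(asserted, {"onto1": ["onto2"]}))   # onto2 imported
```

In the scenario described above, A subClassOf C is flagged only when the first ontology actually imports the second.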

rivettp commented 4 years ago

This has exposed a worse problem - a cycle - with https://spec.edmcouncil.org/fibo/ontology/DER/ExchangeTradedDerivatives/ExchangeTradedOptions/TradedOptionPrincipal which is:

Where the latter is:

rivettp commented 4 years ago

Some of these reveal deeper structural flaws that require issues in their own right, e.g., Organization is a subclass of IndependentAgent (which is rather unhelpfully defined as "any person or organization") and the latter is a subclass of AutonomousAgent, whose definition seems to exclude Organizations: "An agent is an autonomous individual that can adapt to and interact with its environment."

ElisaKendall commented 4 years ago

A handful of these are not really subclasses per se - there are unions involved, and most of the IND issues will go away once we republish the FpML rates. Most of the rest are in provisional ontologies that we have not yet worked our way through.

The exception is the one that Pete points out, something that we had left to the FND group to fix, but I'd like to tackle that one sooner rather than later.

ElisaKendall commented 4 years ago

Note too that we named the class "IndependentParty" due to banks that are working with the IBM model, which had Party and PartyRole, but that's again a historical issue that we could rectify now. I'm also concerned about AutonomousAgent, which we could rename to Agent, and simplify the definition a bit.

There are others that Dean points out are due to which ontology they occur in, and whether or not they would violate our modularity requirements. Not many, though.

ElisaKendall commented 4 years ago

The resolution to #877, just addressed, fixes both the cycle and redundant inheritance issues with respect to TradedOptionPrincipal and NovatedOptionContractPrincipal, which were the most problematic redundancies of those identified by this issue.

ElisaKendall commented 4 years ago

The resolution to #860, just addressed, whereby the FpML rates were changed from classes to individuals, eliminates most of the IND issues. Some of those remaining are due to changes that have been made over time that impacted the hierarchy or simplified it in some cases, at least for those in released ontologies.

rivettp commented 4 years ago

Good work, we're now down to 22. I added an extra column in the attached Issue-874.xlsx for the (most) intermediate superclass.

rivettp commented 4 years ago

It just occurred to me that this is akin to ontology imports. If A imports B and B imports C then A implicitly imports C. In this case it would be technically redundant to explicitly import C but that's precisely what our policies say you should do - to make the dependency clear and explicit. So we're not DRY for imports, and for good reasons IMO. Bottom line, we should not be too hasty in removing the redundant subclassOf triples and consider whether they might serve a purpose. I don't have a firm answer to that right now but it's worthy of discussion. To take the first example in the new Excel file I could see an argument for explicitly stating that RegisteredAddress is a PhysicalAddress as well as a RegistrationAddress.
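To make the analogy with imports concrete, here is a small sketch that computes what an ontology imports implicitly and which of its explicit imports are technically redundant in that sense. The function names and the dict-of-lists representation of owl:imports are assumptions for illustration only; the sketch says nothing about whether such imports should be removed.

```python
def transitive_imports(onto, imports):
    """Everything reachable from onto through one or more import steps."""
    seen = set()
    stack = list(imports.get(onto, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(imports.get(node, ()))
    return seen


def technically_redundant_imports(onto, imports):
    """Direct imports of onto that are already implied by another direct import."""
    direct = set(imports.get(onto, ()))
    return {
        target
        for target in direct
        if any(target in transitive_imports(other, imports)
               for other in direct if other != target)
    }


imports = {"A": ["B", "C"], "B": ["C"]}
print(transitive_imports("A", imports))
print(technically_redundant_imports("A", imports))
```

Here A's explicit import of C is the only one reported, which mirrors the redundant subClassOf case structurally even though the policy for imports deliberately keeps it.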

ElisaKendall commented 4 years ago

As of the Q1 2020 release, we have 12 remaining in the FIBO release, and an additional 4 in LCC, for a total of 16. Most of the FIBO-specific ones can be resolved in Q2, with a few possible exceptions that may be due to the challenge Pete cites above, i.e., a subclass relationship added in a dependent ontology to further refine some concept in an ontology it imports.

rivettp commented 4 years ago

I'd appreciate some input on my previous comment - that this is consistent with our policy on owl:imports (where we encourage redundancy) and might not necessarily be a bad thing.

VladimirAlexiev commented 4 years ago

@rivettp I think redundant subClassOf is a bad thing, and should be considered separately from redundant owl:import.

"A subClassOf B could be in one ontology, B subClassOf C in another, but we cannot necessarily expect that second ontology to be imported, so we assert the apparently redundant A subClassOf C to reduce the dependency of the first ontology on the second."

But doesn't this expose rather than solve a modularity issue? It seems to me that in this case ontology1 depends on ontology2 (because of the first subclass statement) and ontology2 depends on ontology1 (because of the second subclass statement).

Have owl:import dependencies been analyzed for cycles?
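For what such an analysis could look like, here is a minimal depth-first search sketch that returns one cycle in an import graph if any exists. The function name and the dict-of-lists input are hypothetical; the same approach would apply to any directed graph, including rdfs:subClassOf edges. It uses recursion, so a very deep graph may need an iterative variant.

```python
def find_import_cycle(imports):
    """Return one cycle as a list of nodes (first node repeated at the end), or None."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, fully explored
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in imports.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path, so we closed a loop.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for start in list(imports):
        if color.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


print(find_import_cycle({"A": ["B"], "B": ["C"], "C": []}))
print(find_import_cycle({"A": ["B"], "B": ["C"], "C": ["A"]}))
```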

mereolog commented 3 years ago

@VladimirAlexiev Yes, they were - see: https://github.com/edmcouncil/fibo/issues/1152.

rivettp commented 3 years ago

@VladimirAlexiev I don't think that C would be in the same ontology as A so it would not be the case that "ontology2 depends on ontology1 (because of the second subclass statement)."

VladimirAlexiev commented 3 years ago

Guess I got confused. Maybe because you don't state in which ontologies A, B and C are declared.

If A subClassOf B, then shouldn't ontology1 import the ontology where B is declared?

Presumably B is declared in the same ontology that has B subClassOf C...