ogcscotts / TC-Meeting-topics

place to discuss topics raised by Working Groups

Ontologies are presently central to representing domain knowledge. Should UML still be the authoritative conceptual model language? #53

Open ogcscotts opened 6 years ago

thomashkolbe commented 6 years ago

Most conceptual models and application schemas, such as CityGML, IndoorGML, and the INSPIRE data themes, are ontologies. The conceptual data models and the definitions of the semantics of the different feature types, attribute types, and relationship types are a formalization of real-world concepts and entities. According to the pertinent definitions of the term 'ontology' in computer science (cf. Gruber 1995), CityGML, for example, defines an ontology for the representation of virtual 3D city and landscape models. The formal model consists of the UML class diagram and the definitions and explanations of the feature types, their attributes, relationships, and constraints given in the CityGML specification document.

Ontologies can be formally expressed not only in the Web Ontology Language (OWL) issued by the World Wide Web Consortium (W3C). You can also use the Unified Modeling Language (UML) issued by the Object Management Group (OMG) or other frameworks. Concepts, properties, and relations can be formally specified by UML class and instance diagrams, if necessary in conjunction with the Object Constraint Language (OCL). I recommend reading the paper by Atkinson et al. (2006, link given below) for a detailed discussion (also of closed world versus open world assumptions). While some OWL concepts are indeed missing in UML, we don't need them for the specification of the CityGML ontology for virtual 3D city and landscape models. Using UML class diagrams according to the ISO 19100 UML profile also has the advantage that many software tools exist which can directly map such a conceptual data model onto the data models or relational schemas used in geoinformation systems or spatial database management systems.

Since we typically only require a subset of OWL's representational capabilities, transforming a conceptual model specified as a UML class diagram, or the XML schema derived from it, to OWL is straightforward. In fact, this has already been demonstrated for CityGML by Gilles Falquet from the University of Geneva in the context of the EU COST Action TU 0801. See http://cui.unige.ch/isi/icle-wiki/ontologies
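To illustrate the kind of mapping meant here, the following is a minimal sketch and not the transformation used in the work cited above. It assumes a deliberately tiny model representation (`UmlClass`, `UmlAttribute`), a hypothetical namespace `http://example.org/model#`, and made-up class names. Each class becomes an `owl:Class`, generalization becomes `rdfs:subClassOf`, and each attribute becomes an `owl:DatatypeProperty`. Note that `rdfs:domain` and `rdfs:range` carry open world semantics, so this is an approximation of the UML meaning rather than an equivalent.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UmlAttribute:
    name: str
    datatype: str  # local name of an XSD datatype, e.g. "double"


@dataclass
class UmlClass:
    name: str
    parent: Optional[str] = None  # name of the generalization target, if any
    attributes: List[UmlAttribute] = field(default_factory=list)


# Hypothetical namespace "ex:"; the owl/rdfs/xsd namespaces are the W3C ones.
PREFIXES = "\n".join([
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
    "@prefix ex: <http://example.org/model#> .",
])


def uml_to_turtle(classes: List[UmlClass]) -> str:
    """Serialize a list of UML-like classes as OWL statements in Turtle."""
    lines = [PREFIXES, ""]
    for cls in classes:
        lines.append(f"ex:{cls.name} a owl:Class .")
        if cls.parent:
            # UML generalization -> rdfs:subClassOf
            lines.append(f"ex:{cls.name} rdfs:subClassOf ex:{cls.parent} .")
        for attr in cls.attributes:
            # Prefix the property with its class name to avoid name clashes.
            prop = f"ex:{cls.name}_{attr.name}"
            lines.append(f"{prop} a owl:DatatypeProperty .")
            lines.append(f"{prop} rdfs:domain ex:{cls.name} .")
            lines.append(f"{prop} rdfs:range xsd:{attr.datatype} .")
    return "\n".join(lines)


# Illustrative model only; not the actual CityGML class hierarchy.
model = [
    UmlClass("CityObject"),
    UmlClass("Building", parent="CityObject",
             attributes=[UmlAttribute("measuredHeight", "double")]),
]
turtle = uml_to_turtle(model)
print(turtle)
```

Associations, multiplicities, and OCL constraints are left out of this sketch; those are the parts where a real mapping has to make design decisions.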

I strongly suggest not switching from data models to ontologies (which are in fact just two technology spaces in the field of knowledge representation), because UML class diagrams are rather easy to understand and more compact in their visual appearance than ontologies specified in OWL (and their visualizations).

Atkinson, C., Gutheil, M., Kiko, K., 2006: On the Relationship of Ontologies and Models. WoMM, 96, 47-60. http://cs.emis.de/LNI/Proceedings/Proceedings96/GI-Proceedings-96-3.pdf

geomancer2012 commented 6 years ago

UML models are strict object models, usually non-dynamic. They are fine for application schemata, but fall short for some of the more flexible data languages (e.g. JSON). Some of these languages are essentially key-value pairs, and hence only require a taxonomy. Adding constraints to reflect reality would be supported by an ontology; OWL would work for that, while a taxonomy could get along with RDF or even simpler languages.

Other, even simpler languages would work and would add flexibility to applications. See OGC 17-087, which is being readied for its public comment period. It suggests a taxonomy-ontology-schema spectrum of approaches that supports the full range of data approaches between "big data" and "relational and object" database schemata.

Simplicity can be nice, but it limits implementation flexibility. What Einstein actually said gets to the point: simplicity is limited by the need for the result to work, "without having to surrender the adequate representation of a single datum of experience".

Full Quote: It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience. ― A. Einstein, "On the Method of Theoretical Physics" address delivered at Oxford, 10 June 1933

pebau commented 6 years ago

Definitely UML. It remains the industry standard: tons of UML data are available, there are manifold tools, and students learn it. While OWL is conceptually nice in the minimality of its core concepts, it is not at the right granularity for humans to maintain an overview. Additionally, switching to OWL would constitute a major effort, since we would need to live with two representations, one for legacy and one for new models. And until everybody is up to speed, spec work would get seriously delayed. The resource of voluntary hours of all the OGC idealists should be handled with extreme care. PS: on the Einstein quote, you surely will notice that he was talking to theoretical physicists, not to industry nor society at large :)

cportele commented 6 years ago

I do not understand the implicit "either UML or OWL" in the question. Both are capable of capturing knowledge. Why not allow the use of UML and/or OWL (and/or SKOS or similar), whichever works best in the given context? I think this is the current practice in OGC and I do not see why this has to change.

PeterParslow commented 5 years ago

Perhaps the real question is what is the best modelling / knowledge representation language to use in the 'open world' paradigm e.g. semi-structured data stores (where every 'feature instance' can have its own schema). As John says, these require a taxonomy. It seems to me that SKOS & OWL sit more at the "XSD" level: machine readable artefacts good for processing / checking data against a taxonomy (to the extent that the open world paradigm allows!) - but not as good as UML class diagrams for conveying that to humans. (And not that many humans like UML class diagrams either).
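As a rough illustration of "checking data against a taxonomy (to the extent that the open world paradigm allows)", here is a minimal sketch. The taxonomy, the property names, and the `check_feature` helper are all hypothetical, and the `closed_world` flag is only a loose analogy to the closed world versus open world assumptions: in the closed reading, a property the taxonomy does not declare is an error; in the open reading, it is simply something the taxonomy says nothing about.

```python
from typing import Dict, List, Set

# Hypothetical taxonomy: feature type -> property names it is known to carry.
TAXONOMY: Dict[str, Set[str]] = {
    "Building": {"height", "function"},
    "Road": {"surface"},
}


def check_feature(feature: dict, closed_world: bool) -> List[str]:
    """Return a list of problems found in one semi-structured feature."""
    ftype = feature.get("type")
    if ftype not in TAXONOMY:
        # Both readings need the type itself to be known to the taxonomy.
        return [f"unknown feature type: {ftype!r}"]
    # Properties present on the instance but not declared for its type.
    extra = set(feature) - TAXONOMY[ftype] - {"type"}
    if closed_world:
        # Closed reading: anything undeclared is rejected.
        return [f"undeclared property: {key}" for key in sorted(extra)]
    # Open reading: undeclared properties are tolerated.
    return []


# A feature instance that effectively brings its own schema.
feature = {"type": "Building", "height": 12.5, "roofColour": "red"}

closed_result = check_feature(feature, closed_world=True)
open_result = check_feature(feature, closed_world=False)
print(closed_result)
print(open_result)
```

The same instance passes or fails depending only on the assumption chosen, which is the part a class diagram alone does not communicate.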

This is the challenge I see in Features & Geometries Part 1, and I think it needs wider discussion than "just" the Simple Features SWG.

But like Clemens & Thomas, I don't see a reason for a general change of approach.

I would pose a different question: "what is OGC's chosen conceptual modelling approach for non-relational data structures?" or some variant on that.