OMG Issue 18518: Using enumerations instead of using code systems

cendle commented 11 years ago

Description:

The PIM uses enumerations rather than a code system of its own. As a result, new codes cannot be added easily (this requires another enumeration), separate documentation is needed to define what the enumeration literals mean, and translations are not possible. Two examples are the enumeration Resource Type, with 7 possible values (CODE_SYSTEM, CODE_SYSTEM_VERSION, CONCEPT_DOMAIN, MAP, MAP_VERSION, VALUE_SET, VALUE_SET_DEFINITION), and the enumeration CHANGE TYPE, with the values CREATE, UPDATE, METADATA, DELETE, CLONE, IMPORT. It would be good terminology practice for an international terminology server specification to use an internal code system rather than enumerations.
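A rough sketch of the contrast described above, in Python. The `Concept` and `CodeSystem` classes, the placeholder definitions, the German label, and the `ARCHIVE` code are all hypothetical illustrations and are not part of CTS2; only the CHANGE TYPE literals come from the issue text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class ChangeType(Enum):
    """Closed enumeration with the literals listed in the issue text."""
    CREATE = "CREATE"
    UPDATE = "UPDATE"
    METADATA = "METADATA"
    DELETE = "DELETE"
    CLONE = "CLONE"
    IMPORT = "IMPORT"


@dataclass
class Concept:
    """Hypothetical code-system entry: code, definition, per-language labels."""
    code: str
    definition: str
    designations: Dict[str, str] = field(default_factory=dict)


@dataclass
class CodeSystem:
    """Hypothetical minimal in-memory code system (not a CTS2 API)."""
    name: str
    concepts: Dict[str, Concept] = field(default_factory=dict)

    def add(self, concept: Concept) -> None:
        self.concepts[concept.code] = concept

    def label(self, code: str, language: str = "en") -> str:
        # Fall back to the bare code when no label exists for the language.
        return self.concepts[code].designations.get(language, code)


change_types = CodeSystem("ChangeType")
for literal in ChangeType:
    change_types.add(
        Concept(literal.value, "placeholder definition", {"en": literal.value})
    )

# Adding a translation or a new code is a data change, not a model change.
change_types.concepts["CREATE"].designations["de"] = "Erstellen"  # illustrative label
change_types.add(Concept("ARCHIVE", "hypothetical new code", {"en": "ARCHIVE"}))

print(change_types.label("CREATE", "de"))
print("ARCHIVE" in ChangeType.__members__)
```

The enumeration stays fixed at six literals, while the code-system variant carries definitions and translations as data and accepts new codes at run time.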

http://www.omg.org/issues/cts2-rtf#Issue18518

hsolbrig commented 11 years ago

This crosses the boundary between a UML model and the CTS2 API. CTS2 makes four different uses of value sets:

1) Concept domains with application-specific value sets. The "status" property of CodeSystemVersionCatalogEntry is an example of this, where the specific status values depend on the particular workflow model for the contained terminology.

2) Concept domains with recommended but not mandated value sets. The CodeSystemCatalogEntry calls for ontologyDomain, ontologyType, designedForOntologyTask, etc. While not mandated in the standard, each of these domains has a recommended value set, in this case a set drawn from the Ontology Meta Vocabulary (OMV).

3) Concept domains with mandated value sets. While the specification isn't as obvious about this as it should be, the intent is that the language and format fields in the OpaqueData element should be drawn from value sets constructed from the IETF/ISO language and MIME codes.

4) Concept domains that are coupled with the CTS2 model itself. Examples of this include the "describedResourceType" and "entryState" properties of ResourceDescription, where "describedResourceType" identifies the particular type of resource (CodeSystemVersionCatalogEntry, ConceptDomain, MapCatalogEntry, etc.).

The first three cases could and, ideally, should be represented by ConceptDomain-ConceptDomainBinding-ValueSet entries in a CTS2 service. As an example, the CTS2 concept domain "ontologyDomain" could be bound to an OntologyDomain value set derived from the OMV. Doing the same for the fourth case, however, would not add much to the specification, for two reasons. The first reason is that not all of the model elements even appear in a given PIM. The "describedResourceType" element, in particular, is not actually represented in XML Schema, as the XML element name itself determines which type of resource is represented. The second reason is that a CTS2 service or client would have no idea what to do with an "entryState" or "describedResourceType" that wasn't already part of the spec. The semantics of ACTIVE and INACTIVE are clearly defined, but what would a client do if it received a status of "PROPOSED" as an entry state? Similarly, adding a new "describedResourceType" would be the equivalent of adding a new model element that is not described in the specification. The behavior of a client (or server) in this case would be undefined.
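To make the distinction concrete, here is a minimal Python sketch, not CTS2 code. The `bindings` dictionary stands in for a ConceptDomainBinding lookup, and the `example:` codes are placeholders rather than actual OMV entries; `is_permitted` and `parse_entry_state` are hypothetical helper names.

```python
from enum import Enum
from typing import Dict, Set


class EntryState(Enum):
    """Model-coupled enumeration; ACTIVE and INACTIVE are the literals named above."""
    ACTIVE = "ACTIVE"
    INACTIVE = "INACTIVE"


# Hypothetical bindings: concept domain name -> set of permitted codes.
# The codes are placeholders, not real OMV content.
bindings: Dict[str, Set[str]] = {
    "ontologyDomain": {"example:domain-a", "example:domain-b"},
}


def is_permitted(concept_domain: str, code: str) -> bool:
    """Check a code against whatever value set is currently bound to the domain."""
    return code in bindings.get(concept_domain, set())


def parse_entry_state(raw: str) -> EntryState:
    """Raise ValueError for any state the model does not define."""
    return EntryState(raw)


# Extending a bound value set is a data change; the checking code is untouched.
bindings["ontologyDomain"].add("example:domain-c")
print(is_permitted("ontologyDomain", "example:domain-c"))

# A model-coupled state outside the enumeration has no defined handling.
try:
    parse_entry_state("PROPOSED")
except ValueError as err:
    print("unknown entry state:", err)
```

In this sketch the bound value set can grow without any change to the client, whereas an entry state outside the enumeration can only be rejected, because nothing in the model says what it means.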

This does not prevent the specification from being extended using UML, however. One could import the CTS2 Core specification and extend or replace the existing enumerations to represent new or additional model types. We had actually considered separating the change set model from the rest of the specification because there is nothing in it that is terminology-specific. We decided against it, however, as it fell outside the scope of the CTS2 RFP.
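As a loose analogy only (Python enums are not UML enumerations), an extension could reuse the core literals and add its own. The `ExtendedResourceType` name and the `EXAMPLE_EXTENSION` literal are hypothetical; the seven core literals are the ones quoted in the issue description.

```python
from enum import Enum


class ResourceType(Enum):
    """Core literals as quoted in the issue description."""
    CODE_SYSTEM = "CODE_SYSTEM"
    CODE_SYSTEM_VERSION = "CODE_SYSTEM_VERSION"
    CONCEPT_DOMAIN = "CONCEPT_DOMAIN"
    MAP = "MAP"
    MAP_VERSION = "MAP_VERSION"
    VALUE_SET = "VALUE_SET"
    VALUE_SET_DEFINITION = "VALUE_SET_DEFINITION"


# Python does not allow subclassing an Enum that already has members, so the
# extension is built as a new enumeration that copies the core literals.
extended_members = {member.name: member.value for member in ResourceType}
extended_members["EXAMPLE_EXTENSION"] = "EXAMPLE_EXTENSION"  # hypothetical literal
ExtendedResourceType = Enum("ExtendedResourceType", extended_members)

print([member.name for member in ExtendedResourceType])
```

Code written against the core `ResourceType` still sees only the seven original literals; only code that opts into the extended model knows about the additional one.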

In summary, while we definitely need to deploy a CTS2 service that serves the concept domains and value sets used by the CTS2 specification itself, we see little to be gained by restructuring the existing CTS2 specification to represent UML model enumerations as CTS2 value sets. Note, however, that there is work underway to define a CTS2 / UML profile that will be able to bridge some of these issues by formally defining the relationship between UML Property Type, Enumeration, and Enumeration Literal on the one hand and CTS2 Concept Domain, Value Set, Resolved Value Set, and URIAndEntityName on the other.

craigstancl commented 11 years ago

Reason For Change

The PIM uses enumerations rather than a code system of its own. As a result, new codes cannot be added easily (this requires another enumeration), separate documentation is needed to define what the enumeration literals mean, and translations are not possible. Two examples are the enumeration Resource Type, with 7 possible values (CODE_SYSTEM, CODE_SYSTEM_VERSION, CONCEPT_DOMAIN, MAP, MAP_VERSION, VALUE_SET, VALUE_SET_DEFINITION), and the enumeration CHANGE TYPE, with the values CREATE, UPDATE, METADATA, DELETE, CLONE, IMPORT. It would be good terminology practice for an international terminology server specification to use an internal code system rather than enumerations.

Severity: Minor

Proposed Resolution (PIM)

NA

Proposed Resolution (PSM)

NA

Discussion

This crosses the boundary between a UML model and the CTS2 API. CTS2 makes four different uses of value sets:

1) Concept domains with application-specific value sets. The "status" property of CodeSystemVersionCatalogEntry is an example of this, where the specific status values depend on the particular workflow model for the contained terminology.

2) Concept domains with recommended but not mandated value sets. The CodeSystemCatalogEntry calls for ontologyDomain, ontologyType, designedForOntologyTask, etc. While not mandated in the standard, each of these domains has a recommended value set, in this case a set drawn from the Ontology Meta Vocabulary (OMV).

3) Concept domains with mandated value sets. While the specification isn't as obvious about this as it should be, the intent is that the language and format fields in the OpaqueData element should be drawn from value sets constructed from the IETF/ISO language and MIME codes.

4) Concept domains that are coupled with the CTS2 model itself. Examples of this include the "describedResourceType" and "entryState" properties of ResourceDescription, where "describedResourceType" identifies the particular type of resource (CodeSystemVersionCatalogEntry, ConceptDomain, MapCatalogEntry, etc.).

The first three cases could and, ideally, should be represented by ConceptDomain-ConceptDomainBinding-ValueSet entries in a CTS2 service. As an example, the CTS2 concept domain "ontologyDomain" could be bound to an OntologyDomain value set derived from the OMV. Doing the same for the fourth case, however, would not add much to the specification, for two reasons. The first reason is that not all of the model elements even appear in a given PIM. The "describedResourceType" element, in particular, is not actually represented in XML Schema, as the XML element name itself determines which type of resource is represented. The second reason is that a CTS2 service or client would have no idea what to do with an "entryState" or "describedResourceType" that wasn't already part of the spec. The semantics of ACTIVE and INACTIVE are clearly defined, but what would a client do if it received a status of "PROPOSED" as an entry state? Similarly, adding a new "describedResourceType" would be the equivalent of adding a new model element that is not described in the specification. The behavior of a client (or server) in this case would be undefined.

This does not prevent the specification from being extended using UML, however. One could import the CTS2 Core specification and extend or replace the existing enumerations to represent new or additional model types. We had actually considered separating the change set model from the rest of the specification because there is nothing in it that is terminology-specific. We decided against it, however, as it fell outside the scope of the CTS2 RFP.

In summary, while we definitely need to deploy a CTS2 service that serves the concept domains and value sets used by the CTS2 specification itself, we see little to be gained by restructuring the existing CTS2 specification to represent UML model enumerations as CTS2 value sets. Note, however, that there is work underway to define a CTS2 / UML profile that will be able to bridge some of these issues by formally defining the relationship between UML Property Type, Enumeration, and Enumeration Literal on the one hand and CTS2 Concept Domain, Value Set, Resolved Value Set, and URIAndEntityName on the other.

It is proposed we address this issue again in a future revision task force.
