ISO-TC211 / 19115-2Revision

Repository for sharing documents, models, and schemas for the revision of ISO 19115-2 Metadata extension for Imagery and gridded data

Removal of MI prefix? #19

Open DaveDanko opened 9 years ago

DaveDanko commented 9 years ago

I'm starting to update the organization of the document and using the MI prefix seems to not make sense anymore - and the HMMG recommends not using them on new work. Is it ok if I drop the MI/ME prefixes on the class names? (they are still used in 19115-1)

cdobrien commented 9 years ago

The HMMG says that the prefixes are not necessary because every object is unique in its package. The path of packages within packages describes the name space.

I am afraid that this will create objects with the same name that can only be distinguished by tracing back the path. I think it is much clearer to have prefixes on objects. They establish a global uniqueness of names.

I think this needs to be raised again in the HMMG. What is the benefit of dropping the prefixes and potentially having duplicate names that the model knows are different (from the path), but that people mistakenly think are the same?

I think there is a place for global names and the prefix creates global names.

If prefixes are used in 19115-1 I think they should also be used in 19115-2.

Doug

jetgeo commented 9 years ago

One benefit of dropping the prefixes is that we can have names that are closer to real life. The reason for introducing the prefixes in the first place, was that the old UML editors couldn't handle identical names in different packages.

In the revision of ISO 19107, the prefixes (“GM” for geometry and “TP” for topology) are removed. Quote from John Herring: "...In the original, the name of the concept was distinct from the name of the implementation class of that concept. The concept Curve was implemented by GM_Curve, and might have an implementation called (for example in SQL/MM) STCurve. Under the new regime, the three logically identical things can be called “Curve” and be distinguished only by their “namespace.” The effect of this change is that the semantic gap between the descriptive names of the text and the names in the Model is eliminated."

I can agree with Doug's comment to a certain point. As long as we have the prefixes in 19115-1, it would be strange to not have them in 19115-2. Instead, they can all be removed in a future revision that hopefully will merge the two parts into one, and also include Coverage Result in Data quality. But for this revision, we should have prefixes in 19115-2 as well as in 19115-1.

From what I can see, there are four prefixes in the existing 19115-2: MI, LE, MX and QE. I guess "MI" means "metadata for imagery", "LE" means "Lineage extensions", and "QE" means "Quality extensions". But what does "MX" mean? And do we really need this many prefixes for just a few classes? Can we use the same prefixes as in 19115-1 and 19157 (MD, LI and DQ)?

cdobrien commented 9 years ago

I agree with Knut to leave the prefixes in 19115-2 for consistency with 19115-1.

I only partially agree with John Herring that the prefixes can be eliminated because the context is established by the namespace. In the example John gave, GM_Curve in 19107 and ST_Curve in SQL/MM are identical, so the concept of Curve is global and Curve can be considered a global name. But this is not always the case. If a second standard or product specification specializes a class, it creates a new object. Someone may call this SpecialCurve in their standard. Somebody else in a different standard will create a different specialization but might also use the name SpecialCurve. These two special curves are distinct because their namespaces are distinct, but this may be very difficult for others to follow. Someone in a fourth application schema may want to inherit the specialized curve from both of the standards in different ways.
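The SpecialCurve scenario can be sketched in code. This is only an illustration under stated assumptions, not part of any TC 211 model: Python namespaces stand in for UML packages, and the package names `standard_a` and `standard_b`, the class `CombinedCurve`, and the helpers `make_special_curve` and `qualified_name` are hypothetical.

```python
from types import SimpleNamespace


class Curve:
    """Stands in for the base concept (GM_Curve in 19107)."""


def make_special_curve(package: str) -> type:
    """Create a class named SpecialCurve owned by a (hypothetical) package."""
    cls = type("SpecialCurve", (Curve,), {})
    cls.__module__ = package  # record the owning namespace on the class
    return cls


# Two unrelated standards each specialize Curve under the same short name.
standard_a = SimpleNamespace(SpecialCurve=make_special_curve("standard_a"))
standard_b = SimpleNamespace(SpecialCurve=make_special_curve("standard_b"))


def qualified_name(cls: type) -> str:
    """Return the namespace-qualified name of a class."""
    return f"{cls.__module__}.{cls.__name__}"


# A further application schema inherits from both specializations.
class CombinedCurve(standard_a.SpecialCurve, standard_b.SpecialCurve):
    pass


short_names = [base.__name__ for base in CombinedCurve.__bases__]
full_names = [qualified_name(base) for base in CombinedCurve.__bases__]

print(short_names)  # the two parents share one short name
print(full_names)   # only the qualified names tell them apart
```

The model (here, the interpreter) has no trouble keeping the two parents apart, but a reader who sees only the short names cannot, which is the concern raised above.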

Some of the older UML software just made all names global, so we needed to use the prefixes to manage class names; however, package namespaces do not eliminate the responsibility for namespace management. By eliminating all the prefixes we could create a lot of confusion.

For some of the basic standards like 19107 it makes sense, but it also makes sense to have distinct names that are not misleading. If standard 191xx were to specialize Curve it would make sense to have the name of the new class as XX_Curve.

Also there are now thousands of classes in the model. The prefixes let one have an idea of where the class came from.

The HMMG will need to establish some rules for name management that do not lead to confusion.

hjelmager commented 9 years ago

I agree with Knut and Doug regarding leaving the prefixes for consistency.

I think that Doug has a very valuable point that the prefixes let one have an idea where the classes came from. This information is of great value when reading the diagram manually. Without this information there will be too many diagrams to go through in order to find the source diagram (and standard). So, as Doug mentions, there is a need for some rules for name management.

jetgeo commented 9 years ago

Let's keep this as an agenda item at the HMMG meeting in Sydney. I have also referenced this issue from a related issue in the UML Best practices work: https://github.com/ISO-TC211/UML-Best-Practices/issues/6

DaveDanko commented 9 years ago

We can keep them to be consistent with 19115-1. But you are going to get some kickback: EA and other modeling software keep track of and identify where the class/package comes from, e.g. Data quality:DQ_DataQuality, so you don't even need to know what DQ stands for. Since ISO TC 211 standards are showing up without them, I am removing (or have already removed) the table in the document that identifies/defines all the prefixes used. Should we still use MI (metadata for imagery) even though our scope has changed? Presently MI, as in MI_Metadata, is used to show that it is an extended MD_Metadata class. I could use MD_MetadataExtended; LI_ProcessingStepExtended vs LE_ProcessingStep; or MD_OriginalClassEx? Which brings me to the point that I hate having to duplicate/extend all these classes. If I wasn't so stupid (or senile) I would have thought to add these associations and/or attributes to the original classes in the revision of 19115, and 19115-2 would just have the new classes/concepts without having to do extensions (like we have done with data quality and feature catalogue). We tried to incorporate the whole 19115-2 into 19115-1 and were turned down; that would have been the least confusing and cumbersome, and the most legitimate, thing to do.

jetgeo commented 9 years ago

I agree with everything Dave has written above. Too bad we didn't merge part 2 into part 1, and we do get some duplicate information by using prefixes. But let's keep them in this version. I would like to skip the MI, LE and QE prefixes, and use the same prefixes as in the fundamental standards: MD, LI and DQ. This approach will keep the two parts closer together, and prepare for merging them in the future. If the class in part 2 extends an existing class with the same name (e.g. LI_ProcessingStep, MD_Metadata), it will be an acceptable solution to add "Extended" as a suffix (LI_ProcessingStepExtended and MD_MetadataExtended).

cdobrien commented 9 years ago

We have a huge problem looming in TC211. Once a model is published we can't ever change it. There may be some product specifications that make use of that model. Some of the application schemas even end up in legal documents. Everything that is published lasts forever and nothing that is published can ever change. Revisions cannot change classes.

The only way to handle this in UML is to create a new package and duplicate all of the classes. This will mean that there will be a 19115:2003 package and a 19115-1:2014 package. Therefore there will be many classes with exactly the same name but different packages.

The problem comes with a third schema that references classes in 19115-1:2014 and also classes in some other standard that has not been revised and still references 19115:2003. We end up with two versions of the same class in the third schema. This can create a real mess. The only way to "solve" this would be to version the entire set of TC211 standards and revise them all at once. We are not doing this.

This is completely independent of using the prefixes, but the prefixes greatly help in determining when one has duplicate conflicting classes.
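A small sketch of how such a duplicate could be detected mechanically. Only the package labels 19115:2003 and 19115-1:2014 come from the discussion; the list of references and the function `find_version_conflicts` are hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Each reference is (package, class name). The packages mirror the versioned
# packages described above; the set of referenced classes is illustrative.
third_schema_references = [
    ("19115-1:2014", "MD_Metadata"),
    ("19115:2003", "MD_Metadata"),  # pulled in via a standard not yet revised
    ("19115-1:2014", "MD_Identification"),
]


def find_version_conflicts(references):
    """Return class names that are referenced from more than one package."""
    packages_by_class = defaultdict(set)
    for package, class_name in references:
        packages_by_class[class_name].add(package)
    return {
        name: sorted(packages)
        for name, packages in packages_by_class.items()
        if len(packages) > 1
    }


conflicts = find_version_conflicts(third_schema_references)
print(conflicts)
```

A check like this only works on names, so it relies on the class keeping the same name (prefix included) across versions; it does not replace a trace relationship that records which class superseded which.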

In a presentation to the HMMG at the last meeting in the UK I presented a paper that indicated that we should use the versioning tags available in EA and also use the "trace" relationship available in EA to show the history.

Because GitHub does not support attaching a file to a GitHub post (only images), all I can do is to provide a pointer to the file on LiveLink. < http://isotc.iso.org/livelink/livelink?func=ll&objId=17295511&objAction=Open >

Here is one picture showing trace between classes in IHO S-100 between version 2 and version 1. In the IHO S-100 (proposed V2) all the classes are duplicated in a new V2 set of packages and a trace relationship is established between the new version of the class and the old version.

[Image: trace relationships between IHO S-100 V2 and V1 classes]

There really need to be policies on how to name classes and identify their parentage in the harmonized model.

Doug

kateroberts commented 9 years ago

Hi Dave:

WRT para. 3: Is there any chance that we could still revise just 19115-1 to have those hooks? As you say, it would be much neater. Seems like good timing, before everyone (especially metadata creators) invests too much in implementing 19115-1 as is. (Software developers (of editing tools, transforms, etc.) could factor in the proposed changes fairly easily, even if the ISO wheels took longer to ratify them.)

Kate


DaveDanko commented 9 years ago

I would love to be able to change 19115-1 but I don't think ISO would let us do an amendment so soon especially if it is for convenience and not fixing a fatal error. I think we'll just start as Knut suggests adding an "Extended" at the end of the class name and use the original 19115:2003 prefixes. Unless/until the HMMG comes up with an alternative.

joanma747 commented 9 years ago

"Extended" is a very generic word that interferes with any other efforts to extend MD_Metadata. I have recently developed an extension for preservation and an extension for user feedback. If in the future I want to extend 19115-2, this will result in an MD_MetadataExtendedExtended. I do not like this.

IMHO prefixes work reasonably well. In my user feedback extension I have a GUF_Metadata and in preservation I have an MP_Metadata. So others can change the prefix and extend either MD_Metadata or MI_Metadata.

One possible way could be to use a three-character prefix such as MD2_Metadata (in reference to the "-2").
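The difference between the two naming schemes can be sketched as follows. The helper functions `extend_with_suffix` and `extend_with_prefix` are hypothetical and exist only for illustration; the class names are the ones used as examples in this thread.

```python
def extend_with_suffix(class_name: str) -> str:
    """Name an extension by appending the generic 'Extended' suffix."""
    return f"{class_name}Extended"


def extend_with_prefix(class_name: str, prefix: str) -> str:
    """Name an extension by swapping the prefix before the first underscore."""
    _, separator, base = class_name.partition("_")
    if not separator:  # no prefix present, keep the whole name as the base
        base = class_name
    return f"{prefix}_{base}"


# Suffix scheme: every further extension stacks another 'Extended'.
part2_name = extend_with_suffix("MD_Metadata")
later_name = extend_with_suffix(part2_name)

# Prefix scheme: each extension picks its own prefix, the base name stays put.
imagery_name = extend_with_prefix("MD_Metadata", "MI")
feedback_name = extend_with_prefix(imagery_name, "GUF")

print(part2_name, later_name)
print(imagery_name, feedback_name)
```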

Joan Masó


smrgeoinfo commented 9 years ago

I think it's important to start thinking of the namespace/element name string as a URI for an information interchange concept. In the metadata realm, we have the concept of 'Metadata', and it is scoped to different contexts by the namespace prefix. In Joan's example, instead of GUF_Metadata and MP_Metadata, maybe we could use MetadataWithFeedback and MetadataForPreservation; wouldn't that be clearer for users?
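One way to picture the namespace-plus-element-name idea is the sketch below. The namespace URIs are placeholders under `example.org`, not registered ISO or OGC namespaces, and the function `concept_uri` is hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical namespace URIs; placeholders only, not registered namespaces.
NAMESPACES = {
    "feedback": "https://example.org/metadata/feedback/",
    "preservation": "https://example.org/metadata/preservation/",
}


def concept_uri(namespace: str, element: str) -> str:
    """Combine a namespace URI and an element name into one identifier."""
    return urljoin(NAMESPACES[namespace], element)


# The element name is the same; the namespace scopes it to a context.
feedback_uri = concept_uri("feedback", "Metadata")
preservation_uri = concept_uri("preservation", "Metadata")

print(feedback_uri)
print(preservation_uri)
```

Under this view a machine never sees a bare "Metadata"; whether the short name shown to people should also carry the context is the open question in this thread.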

jetgeo commented 9 years ago

Extended is a generic word, and we are creating generic standards :). The names of extensions on top of our fundamental standards are not our concern, but I agree with Steve on the suggested names for Joan's examples. Ref ISO 19103: "Names of UML elements should use precise and understandable technical names for classes, attributes, operations and parameters." We must also remember that the classes and their names must be identifiable and locatable by both humans and machines. For machines, it is easy to locate the namespace and thereby uniquely identify the class. But for humans, using an understandable name and sometimes also a prefix is more important. I still think that prefixes are something we will move away from in the future, but for this version of 19115-2, we shall keep them and harmonize with the prefixes in 19115-1 and 19157. Regarding Doug's post: This discussion belongs in the UML Best practices work, see this issue: https://github.com/ISO-TC211/UML-Best-Practices/issues/26

hjelmager commented 9 years ago

I agree with Knut. I would even say that human understanding is crucial, because otherwise we cannot ensure proper use of the standards.