Closed InKyungChoi closed 1 year ago
InKyung -
As far as I can tell a Category does not have a Designation associated with it. I was one of the designers of the Concepts package, and we tried to differentiate Category Set, Code List, and Classification Scheme by what was associated with each Node (or entry) in each kind of set. I don't see any contradictions.
The definitions of Category, Category Set, and Category Item probably need some editing. Yes, a Category Set should be defined as a set of Categories. A Category Item is a kind of Node, gathering all the information needed to describe a Category. The problem is that a Category Set in GSIM is a kind of Node Set, which is where the disjunction between Category and Category Item arises. Maybe a better approach is to define Category Set as a set of Category Items, where a Category Item is a kind of Node that contains a Category and some other information.

A Category is defined in GSIM as "A Concept whose role is to extensionally define and measure a characteristic". Other than violating the rule about starting a definition with a leading article, this is probably good. In this definition, a Variable takes the role of a Characteristic. Take a variable called Marital Status. We could measure it through either of the 2 Category Sets {single, married} or {single, married, divorced, widowed}. These aren't the only possibilities. But note, single needs to be defined differently in these 2 cases. With these careful definitions, each set of categories defines Marital Status extensionally, meaning these are subordinate concepts whose totality (the whole set of categories) defines the superordinate concept, Marital Status.
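To make the Marital Status example concrete, here is a minimal sketch in Python. It assumes the approach suggested above (a Category Set is a set of Category Items, each wrapping one Category). The class and field names, and the wording of each category definition, are illustrative assumptions and not GSIM text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Category:
    """A Concept used to extensionally define a characteristic."""
    name: str
    definition: str  # illustrative wording, not an official definition


@dataclass(frozen=True)
class CategoryItem:
    """A Node holding one Category plus any other descriptive information."""
    category: Category
    note: str = ""


@dataclass(frozen=True)
class CategorySet:
    """A Node Set whose members are Category Items."""
    name: str
    items: Tuple[CategoryItem, ...]

    def categories(self):
        """Names of the Categories that together define the characteristic."""
        return [item.category.name for item in self.items]


# Two ways to measure Marital Status; "single" is a different concept in each.
two_way = CategorySet("Marital Status (2 categories)", (
    CategoryItem(Category("single", "not currently married")),
    CategoryItem(Category("married", "currently married")),
))
four_way = CategorySet("Marital Status (4 categories)", (
    CategoryItem(Category("single", "never married")),
    CategoryItem(Category("married", "currently married")),
    CategoryItem(Category("divorced", "marriage dissolved, not remarried")),
    CategoryItem(Category("widowed", "spouse deceased, not remarried")),
))
```

The two `single` Categories share a name but compare as unequal because their definitions differ, which is the point made above.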
The use of the word "sign" in the definition of Designation in GSIM is confusing, and it needs to be replaced. The word "signifier" should be used instead. There is a general model of designation that says a Sign is the representation of a Signified by a Signifier which denotes it. In the case of concepts and designations, the designation is a Sign, the concept is a Signified, and alphanumeric strings are Signifiers. Other kinds of signifiers can exist, too. A particular string, an example of a Signifier, is called a Token instantiating that Signifier.
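As a rough illustration of that general model, the sketch below treats a Designation as a Sign pairing a Signified (the Concept) with a Signifier. The class names follow the terms used above; the `is_token_of` helper and its plain string comparison are simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """The Signified: the meaning being designated."""
    definition: str


@dataclass(frozen=True)
class Signifier:
    """An abstract form, here an alphanumeric string."""
    form: str


@dataclass(frozen=True)
class Designation:
    """The Sign: a Signified represented by a Signifier that denotes it."""
    signified: Concept
    signifier: Signifier


def is_token_of(text: str, signifier: Signifier) -> bool:
    """A particular string occurrence is a Token instantiating a Signifier."""
    return text == signifier.form


married = Concept("currently married")
code = Designation(married, Signifier("M"))         # a code for the concept
label = Designation(married, Signifier("married"))  # a label for the same concept
```

Both Designations share one Signified, and the literal string `"M"` found in a data file would be a Token of the first Signifier.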
I know, that was a mouthful.
Datum as defined by Farance and Gillman (see DDI-CDI documentation) is a designation of a value, where a value is a concept with a notion of equality defined. That is very technical and not at all transparent unless you really know ISO 704 and ISO/IEC 11404. However, we have simplified this definition in several places. One is the Metadata Glossary. The draft DDI Glossary is another. As long as you think of a datum as the representation of some underlying meaning (i.e., a designation) with a computational model defined (a datatype), you have the idea.
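A small Python sketch of the simplified reading: a datum is a representation plus a datatype, and the datatype supplies the notion of equality. The `Datatype` and `Datum` classes and the `parse` function are hypothetical names for illustration; they are not taken from DDI-CDI or ISO/IEC 11404.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Datatype:
    """A computational model: maps a representation to a comparable value."""
    name: str
    parse: Callable[[str], Any]


@dataclass(frozen=True)
class Datum:
    """A representation (designation) interpreted under a datatype."""
    representation: str
    datatype: Datatype

    def value(self) -> Any:
        return self.datatype.parse(self.representation)

    def same_value_as(self, other: "Datum") -> bool:
        """Equality of the underlying values, as decided by the datatype."""
        return self.datatype == other.datatype and self.value() == other.value()


INTEGER = Datatype("integer", int)
STRING = Datatype("string", str)

# Different representations can designate the same integer value,
# while the same two strings are distinct values under the string datatype.
a, b = Datum("007", INTEGER), Datum("7", INTEGER)
c, d = Datum("007", STRING), Datum("7", STRING)
```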
For defining Level, I like the definition based on the number of arcs from the root. It assumes levels are used and only make sense in hierarchies. I think that's right. And each level contains all the Categories the same distance from the root. There could be exceptions to this, though. Alternately, each level comprises concepts (Categories in this case) that form an extensional definition of the root concept. Industry or Economic Activity is defined by all the Sectors, or all the major industry groups, or all the minor industry groups, etc. Each is a level.
Comments about Category from Ayman:
Ayman,
Your comment is constructive. We are probably better off defining Category as you state. However, the terminological principles from ISO 704:2000 were used all over GSIM, and the definition of “extensional definition” includes the ideas you express in “(in a system for dividing things according to appearance, quality, etc)”. The term the creators of ISO 704 used was dimension. All the subordinate concepts used in an extensional definition come from the same dimension.
I don’t know if that clarifies.
Yours Dan
From: InKyungChoi @.> Sent: Friday, February 17, 2023 4:29 PM To: UNECE/GSIMRevision @.> Cc: Gillman, Daniel - BLS @.>; Comment @.> Subject: Re: [UNECE/GSIMRevision] GSIM Concept Group definition / explanatory text update #1 (Issue #36)
Comments about Category https://docs.google.com/document/d/19ENiGK_y9BYaGFhxfNoauY69XMq0EIxK/edit#heading=h.30j0zll from Ayman:
Discussion on Level through emails:
Dan (April 6th): I like 3 the best, though it is GSIM specific given its reliance on nodes, but maybe that’s OK. It says that a level is defined both by the position of categories in a hierarchy and by a unifying concept. I suggest we add the word “unifying” to 3.2 – “the set is defined by a unifying concept”.
1 is on the right track, but it places the entire understanding on “position”, and that’s undefined. Going back to 3, position = the # of arcs back to the root. Plus, I like the unifying concept idea in 3, which 1 lacks. For example, the concept of industry sector is important for understanding the economy, and it is a level in ISIC and NAICS.
2 is just wrong-headed in my opinion, for the reliance on codes conveys a serious misunderstanding.
Flavio (April 9th): I think Levels are not exclusive to Statistical Classifications: Code Lists could also be nested with Levels. Otherwise, the Level should be associated with Classification Items only instead of Nodes.
I suggest a tweak of the Level definition as follows:
Group of Nodes in a hierarchical Node Set in which 1) each Node in the group is the same number of arcs away from the root Node in the hierarchy, and 2) the group is defined by a unifying concept.
I renamed “set” to “group” just to avoid confusion because of Node “Set”.
The final version (after conforming to the writing rules, etc.): set of Nodes in a hierarchical Node Set in which 1) each Node in the set is the same number of arcs away from the root Node in the hierarchy, and 2) the set is defined by a unifying Concept.
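The final definition can be sketched in Python as grouping Nodes by their number of arcs from the root and attaching a unifying Concept to each group. This is only an illustration: the parent map, the node names, and the `unifying_concepts` labels below are made-up examples, not entries from ISIC or NAICS.

```python
from collections import defaultdict


def depth(node, parent):
    """Number of arcs from `node` back to the root (the root has no parent)."""
    arcs = 0
    while parent.get(node) is not None:
        node = parent[node]
        arcs += 1
    return arcs


def levels(parent, unifying_concepts):
    """Group Nodes by distance from the root, keyed by the unifying Concept.

    Depths without a unifying Concept (here, the root at depth 0) form no Level.
    """
    grouped = defaultdict(set)
    for node in parent:
        grouped[depth(node, parent)].add(node)
    return {
        unifying_concepts[d]: nodes
        for d, nodes in grouped.items()
        if d in unifying_concepts
    }


# Hypothetical hierarchy: child -> parent, with None marking the root.
parent = {
    "Economic Activity": None,
    "Agriculture": "Economic Activity",
    "Manufacturing": "Economic Activity",
    "Crop production": "Agriculture",
    "Food products": "Manufacturing",
}
unifying_concepts = {1: "Sector", 2: "Industry group"}

result = levels(parent, unifying_concepts)
```

Here `result` maps "Sector" to the two Nodes one arc from the root and "Industry group" to the two Nodes two arcs away, so both conditions of the definition are visible in the data structure.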
Discussion on Node - Node Set through emails:
Flavio (April 9th):
Object | Group | Definition | Explanatory Text |
---|---|---|---|
Category | Concepts | A Concept whose role is to extensionally define and measure a characteristic. | Categories for the Concept of sex include: Male, Female. Note: An extensional definition is a description of a Concept by enumerating all of its subordinate Concepts under one criterion or sub-division. For example, the Noble Gases (in the periodic table) are extensionally defined by the set of elements including Helium, Neon, Argon, Krypton, Xenon, Radon. (ISO 1087-1) |
Category Item | Concepts | A type of Node exclusive to a Category Set that contains a single Category | A Category Item contains the meaning of a Category without any associated representation. |
Category Set | Concepts | A type of Node Set for grouping Categories via Category Items | |
Classification Item | Concepts | A type of Node exclusive to a Statistical Classification that combines a Category at a certain Level with a Code that represents it. | A Classification Item defines the content and borders of the associated Category. A Unit can be classified to one and only one item at each Level of a Statistical Classification. Categories are used to create sub-populations and must be mutually exclusive when contained in a Statistical Classification. |
Code | Concepts | Designation for a Category | |
Code Item | Concepts | A type of Node exclusive to a Code List that combines a Category with a Code that represents it. | A Code Item combines the meaning of the included Category with a Code representation. Codes are unique within their Code List. Example: M (Male) F (Female). |
Code List | Concepts | A type of Node Set for grouping pairs of Categories and their Codes via Code Items | Similar Code Lists can be grouped together (via the "relates to" relationship inherited from Node Set). A Code List provides a predefined set of permissible values for an Enumerated Value Domain |
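
One way to read the Code Item and Code List rows is sketched below in Python. The class names mirror the table, but the `add` method and its uniqueness check are assumptions made for illustration; GSIM does not prescribe an implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Category:
    name: str


@dataclass(frozen=True)
class CodeItem:
    """A Node that combines a Category with a Code that represents it."""
    category: Category
    code: str


@dataclass
class CodeList:
    """A Node Set grouping pairs of Categories and Codes via Code Items."""
    name: str
    items: List[CodeItem] = field(default_factory=list)

    def add(self, item: CodeItem) -> None:
        # Codes are unique within their Code List.
        if any(existing.code == item.code for existing in self.items):
            raise ValueError(f"Code {item.code!r} already used in {self.name}")
        self.items.append(item)

    def lookup(self, code: str) -> Category:
        """Return the Category designated by `code`."""
        for item in self.items:
            if item.code == code:
                return item.category
        raise KeyError(code)


sex = CodeList("Sex")
sex.add(CodeItem(Category("Male"), "M"))
sex.add(CodeItem(Category("Female"), "F"))
```

A Category Set would look the same without the `code` field, which matches the statement that a Category Item carries the meaning of a Category without any associated representation.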
Hi @FlavioRizzolo - thanks for the proposals, and sorry for reacting late.
I like how the new definitions bring us back to Node and Node Set, which makes it easier to understand Category and Code in the context of a list or set. I reflected them in the model (see here).
One remaining point is that Code and Category Set are missing explanatory texts. I don't know if that was intentional, but their original explanatory texts still seem to work, so I put them back (as below). Would this be okay?
Object | Group | Definition | Explanatory Text |
---|---|---|---|
Category Set | Concepts | type of Node Set for grouping Categories via Category Items | The Categories in a Category Set typically have no assigned Designations (Codes). For example: Male, Female |
Code | Concepts | Designation for a Category | Codes are unique within their Code List. Example: M (Male) F (Female). |
Code Item | Concepts | type of Node exclusive to a Code List that combines a Category with a Code that represents it. | A Code Item combines the meaning of the included Category with a Code representation. |
Please see this google doc updated based on the feedback from Metadata Glossary team. I would like to draw attention to:
1. Node and Node Set area:
2. Designation: the Metadata Glossary task team proposed the change below. But I wonder whether a "designation" is the "sign" itself rather than an "association" (i.e., Designation is a sign denoting a Concept with which it is associated).
Current: The name given to an object for identification.
Proposed: association of a Concept with a sign that denotes it.
3. Classification Index Entry and Classification Index: it is hard to understand what they mean from the definition and explanatory text, at least for me..... :(
4. Datum: the Metadata Glossary task team asked us to come up with a new definition, since "value" is a synonym. How about "value that was collected or derived" (borrowed from the explanatory text)?
5. Level: the Metadata Glossary task team proposed three candidate definitions: 1) position of a Category or a group of Categories within the hierarchy of a Statistical Classification; 2) identifiable position to which codes in a scheme of codes are related; 3) set of nodes in a statistical classification in which 1) each node in the set is the same number of arcs away from the root node in the hierarchy, and 2) the set is defined by a concept. Which one shall we go with?