ISO-TC211 / ISO19111

Revision of ISO19111 - Spatial referencing by coordinates.

DataType in table 45 #24

Closed marqh closed 7 years ago

marqh commented 7 years ago

The aim of this ticket is to propose a comment to the 19111 editing committee raising this issue for discussion.

19111 CD table 45 (p51): there may be an inconsistency or a mis-interpretation in the Data Type field usage in this table.
The data type for axisUnitID is 'unitOfMeasure'. Are the definitions of:

* temporalCount: Integer
* temporalMeasure: Measure
* temporalString: DateTime

consistent with this? Is the dataType field the data type of the attribute, or the data type of the coordinate values which are being referred to? This is the root of my mis-interpretation or inconsistency.

If this is the datatype of the attribute within the CRS definition, should an equivalent data type be defined for the temporalCount and temporalMeasure, that supports unit-like temporal quantities?
I'd suggest that, rather than providing further code tables and restrictions, this Data Type could be provided as 'String'. Alternatively, a code list of temporalQuantities could be provided, with the allowed values expressed, e.g. microsecond, second, minute, hour, day, week, month, year, century, dateTime.

Associated with this table, I would propose that a note be added, to ensure clarity on the implication for the data type of the coordinate values. This should state that:

* temporalCount values shall be integers
* temporalMeasure values shall be floats
* temporalString values shall be strings
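The proposed note can be sketched in Python as a simple type check. This is a minimal illustration only: the mapping `EXPECTED_VALUE_TYPE` and the helper `check_coordinate_value` are hypothetical names, not anything defined in the 19111 draft, and Python's `int`, `float` and `str` are assumed stand-ins for the integer, float and string types named above.

```python
# Hypothetical mapping from the temporal attributes in table 45 to the
# Python type a coordinate value would be expected to have under the note.
EXPECTED_VALUE_TYPE = {
    "temporalCount": int,
    "temporalMeasure": float,
    "temporalString": str,
}


def check_coordinate_value(attribute: str, value: object) -> bool:
    """Return True if value has the type proposed for the given attribute."""
    expected = EXPECTED_VALUE_TYPE[attribute]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        return False
    return isinstance(value, expected)
```

For example, `check_coordinate_value("temporalCount", 12)` accepts the value, while `check_coordinate_value("temporalCount", 12.5)` rejects it.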

If, however, my interpretation is wrong, and the data type is explicitly the data type of the coordinate values, I suggest that a note be added clarifying this interpretation and explaining how data type should be interpreted for this table. In most tables the data type appears to refer to the data type of the value for that attribute, not to coordinate values.

I have found myself concerned that this detail is not clear enough in the document to enable implementers to work with confidence.

jetgeo commented 7 years ago

See a preliminary figure below. I am not sure how to handle the constraints: which situations require integers, which situations require floats, etc.?

[Figure: coordinate systems]

marqh commented 7 years ago

@jetgeo the situations that require a coordinate data type constraint are OrdinalCS and TemporalCS. In some cases, an explicit constraint is provided on these CS instances.

For example,

I think that an optional attribute on TemporalCS and OrdinalCS, called CoordinateDataType and constrained to known data types provides for all these requirements

marqh commented 7 years ago

Martin suggested that TemporalCS may be a specialisation of OrdinalCS within this model, limiting the general case to 1 dimension and used for time. This may be useful for managing the constraint, or it may introduce too many levels of specialisation; I am not sure.

there is a specialisation already, perhaps only ExtendedCS is required, but this could be renamed to OrdinalCS

marqh commented 7 years ago

@jetgeo in order to support the requirements from the Coverage community, I think an extra entry is required in the pseudoUnit code list, to enable a generic, non-temporal case to be defined.

the known case for this is for index values, which have no sensible unit. I would suggest that None is a valid entry for this code list. I would prefer this to be explicit, i.e.

pseudoUnit = None

but I think it is also acceptable to allow the axisUnit to be missing within the CoordinateSystemAxis instead

marqh commented 7 years ago

@jetgeo on the subject of the pseudounit code list. I am keen to provide the functionality for temporal periods, such as geological eras and other discrete quantities.

This seems like a general case that should be handled simply, rather than overly controlled

I am also conscious that we should not have a code list which just keeps growing to meet bespoke use cases

I think having a pseudoUnit of None is a useful way of explicitly stating that the coordinate values do not have a unit

This is also making the explicit statement that the CoordinateSystemAxis with pseudounit=None is a discrete axis, which is a useful and valuable statement for many scenarios, including geological era and index.

So, I think I am more in favour of an explicit entry of None in the PseudoUnit code table than a nullable unit/pseudounit
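The difference between the two options can be sketched in Python. Everything here is hypothetical and for illustration only: the class names, the `is_discrete` property and the two-entry `PseudoUnit` enumeration are assumptions, not part of the draft model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PseudoUnit(Enum):
    # Hypothetical subset of the code list, for illustration only.
    DAY = "day"
    NONE = "None"  # explicit "these coordinate values have no unit" entry


@dataclass
class ExplicitAxis:
    """Axis whose pseudo-unit is always stated, even when there is no unit."""

    name: str
    pseudo_unit: PseudoUnit

    @property
    def is_discrete(self) -> bool:
        # The explicit entry doubles as a statement that the axis is discrete.
        return self.pseudo_unit is PseudoUnit.NONE


@dataclass
class NullableAxis:
    """Axis whose unit may simply be absent."""

    name: str
    # A missing unit cannot distinguish "discrete axis" from "not filled in".
    axis_unit: Optional[str] = None
```

With the explicit entry, `ExplicitAxis("geological era", PseudoUnit.NONE).is_discrete` is a positive statement; with the nullable form, a reader has to guess why the unit is missing.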

for coordinateDataType, I think this can be limited to string, integer and float; I don't have reasons for more data types than this

this attribute should be able to be omitted, it is not mandatory

do you think this needs a code list, or is data type easily referenced from somewhere else?

many thanks mark

marqh commented 7 years ago

@jetgeo Based on conversation with Martin and Roger, we believe that modifications to this figure as shown below better represent the concepts and constraints required to meet these needs

[Figure: cstemporal]

We do not believe that the ExtendedCS is required at all, this is to be removed.

Additionally, in the CS CRS Associations diagram the EngineeringCS class, which is a union, needs to have OrdinalCS added to it.

marqh commented 7 years ago

@desruisseaux @RogerLott @jetgeo

Following discussion and consideration, I propose an amended PseudoUnit code list content. I have included attribute descriptions, informative notes for each entry

I believe that the code list for PseudoUnit should be mandated to have no overlap with the allowed entries for UnitOfMeasure by this standard; may we do this? This is because individual instances will only state axisUnit = <value> so we must enable the distinction between hour as a unit of measure and hour as a pseudoUnit, only from the key=value pair. Is this a valid constraint? If so, I propose a new distinction, that we use calendarHour as the pseudoUnit value, to distinguish between this and hour as an SI unit defined as an exact number of seconds
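A small Python sketch of why the overlap matters when only `axisUnit = <value>` is available. The name sets below are hypothetical examples, not the code lists of the standard, and `code_lists_containing` is an illustrative helper.

```python
# Hypothetical name sets; the real code lists are defined by the standard.
UNIT_OF_MEASURE = {"second", "minute", "hour", "metre", "degree"}
PSEUDO_UNIT_UNPREFIXED = {"minute", "hour", "day", "month", "year"}
PSEUDO_UNIT_PREFIXED = {
    "calendarMinute",
    "calendarHour",
    "calendarDay",
    "calendarMonth",
    "calendarYear",
}


def code_lists_containing(axis_unit: str, pseudo_units: set) -> list:
    """Return the names of the code lists in which axis_unit appears."""
    matches = []
    if axis_unit in UNIT_OF_MEASURE:
        matches.append("UnitOfMeasure")
    if axis_unit in pseudo_units:
        matches.append("PseudoUnit")
    return matches
```

With unprefixed names, `axisUnit = hour` matches both lists and is ambiguous; with the `calendar` prefix, every value matches at most one list.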

I believe that this is the complete list; I propose this for further discussion

* calendarMinute
  * a minute quantity, defined within a calendar and clock system but not defined as a
    fixed number of SI seconds
* calendarHour
  * an hour quantity, defined within a calendar and clock system but not defined as a
    fixed number of SI seconds
* calendarDay
  * a day quantity, defined within a calendar and clock system but not defined as a
    fixed number of SI seconds
* calendarMonth
  * a month quantity, defined within a calendar and clock system but not defined as a
    fixed number of SI seconds
* calendarYear
  * a year quantity, defined within a calendar and clock system but not defined as a
    fixed number of SI seconds
* no_unit
  * a quantity that is discrete; no interpretation of the magnitude of the difference
    between stated values should be inferred and no interpolation between values
    should be implemented.
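To illustrate "not defined as a fixed number of SI seconds", here is a minimal Python sketch for the calendarMonth case. It assumes the Gregorian calendar as implemented by Python's standard `calendar` module and ignores leap seconds; the helper name `month_length_seconds` is hypothetical.

```python
import calendar

SECONDS_PER_DAY = 86_400  # nominal day length; leap seconds are ignored here


def month_length_seconds(year: int, month: int) -> int:
    """Length of a Gregorian calendar month in seconds, ignoring leap seconds."""
    # monthrange returns (weekday of the first day, number of days in month).
    _, days_in_month = calendar.monthrange(year, month)
    return days_in_month * SECONDS_PER_DAY


# January, February and April of 2016 have 31, 29 and 30 days respectively,
# so the same "one calendarMonth" step covers three different durations.
lengths_2016 = {month: month_length_seconds(2016, month) for month in (1, 2, 4)}
```

No single conversion factor to seconds reproduces all of these lengths, which is the property that separates a pseudo-unit from a unit of measure.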

The PseudoUnit Description text is proposed to be

PseudoUnits are provided to enable the definition of the quantity a coordinate is describing,
where this quantity does not have a valid SI unit of measure.

I also propose some further informative/clarification text, perhaps to be attached to the PseudoUnit as a note.

- the PseudoUnit calendarMinute is defined within a clock and calendar system, where different
  individual minute instances have a different length in seconds, due to
  leap second correction factors.
- the PseudoUnit no_unit may be used to describe a coordinate that is an index value for
  a discontinuous, size varying or otherwise arbitrary grid dimension
- the PseudoUnit no_unit is not recommended to define cell (pixel, voxel) index coordinates
  for an image providing image space coordinates in continuous image space.
  It is recommended to use the unitOfMeasure '1' in this case.

The sentence in C.3.2 Coordinate system axis

The AxisUnit class contains four attributes. One of the temporalCount temporalMeasure or
temporalString quantities is used for a temporal CS axis. AxisUnitID is used for non-temporal
 coordinate system axes.

needs to be amended, based on these changes. Perhaps just removed, as the simplification in this ticket does not need this level of explanation. The last sentence is now wrong, given this proposed approach, as an AxisUnitID, e.g. second, may be used for a temporalCS axis

desruisseaux commented 7 years ago

The overlap between UnitOfMeasure and PseudoUnit is language-specific. It exists in languages where axisUnit = <value> is all the information provided, but does not exist in strongly-typed languages like Java or C++ where the type has the effect of a name space. I think that UML is more like strongly-typed languages.
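The namespace effect can be sketched with Python enumerations, where the enumeration class plays the role that the type plays in Java or C++. The two one-entry enumerations and the `serialise` helper are hypothetical, for illustration only.

```python
from enum import Enum


class UnitOfMeasure(Enum):
    HOUR = "hour"  # hour as an exact number of SI seconds


class PseudoUnit(Enum):
    HOUR = "hour"  # hour as defined by a calendar and clock system


def serialise(axis_unit: Enum) -> str:
    """Write the axis unit as a bare key=value pair, dropping the type."""
    return f"axisUnit = {axis_unit.value}"
```

While the type is kept, `UnitOfMeasure.HOUR` and `PseudoUnit.HOUR` are distinct members and do not compare equal. Once serialised, both become the same text `axisUnit = hour`, which is the case where distinct names such as calendarHour help.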

Nevertheless, names like calendarMonth add clarity, so I have no objection. But I wonder if calendarMinute and calendarSecond should be clockMinute and clockSecond instead?

The rest of the proposal looks fine to me.

RogerLott commented 7 years ago

I agree that an adjectival prefix to the pseudounits adds clarity. I have no preference for calendar or clock. If I were forced to choose between these two options, I guess clock is more obviously associated with time, but its application to months that have no subdivisions may be hard to grasp.

jetgeo commented 7 years ago

About the ExtendedCS class: The figure @marqh presented above will not work. The OrdinalCS class does not have a coordinateDataType attribute in that figure. The reason I added the ExtendedCS class was to give the coordinateDataType attribute to both OrdinalCS and TemporalCS. An alternative would be to have OrdinalCS as a subtype of TemporalCS.

desruisseaux commented 7 years ago

We discussed defining OrdinalCS as a sub-type of TemporalCS exactly for that reason. But the proposal became to keep those two types separated, have the coordinateType attribute only in TemporalCS, and to set a constraint on OrdinalCS saying that the coordinate type must be integer (no other type allowed). Maybe constraint: coordinateType=integer is not the proper way to express that constraint, since OrdinalCS indeed has no coordinateType attribute, but I guess that if we cannot write the constraint that way we could write it in some other way, maybe plain text.

desruisseaux commented 7 years ago

About prefix to pseudoUnit, actually I do not propose to replace calendar by clock everywhere, but to keep calendar for year, month and day and use the clock prefix only for hour and minute.

jetgeo commented 7 years ago

OK, here's a suggested modified model. The requirement that the data type for coordinates in an OrdinalCS be integer can either be handled as shown in the model, or as part of the class definition.

The list of pseudoUnits is much shorter now - should the values microsecond, second, week, century and dateTime have been kept in the list?

[Figure: coordinate systems]

desruisseaux commented 7 years ago

Thanks for the update. Isn't coordinateType: CoordinateDataType = integer meaning that OrdinalCS has a coordinateType whose default value is integer? Maybe the idea that OrdinalCS has no additional attribute (compared to its CoordinateSystem parent class) but has coordinates restricted to integers can only be specified in the text?
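The distinction between a default value and a constraint can be sketched in Python. The classes below are hypothetical illustrations of the two readings, not the UML model itself, and `CoordinateDataType` is shown with only two assumed entries.

```python
from dataclasses import dataclass
from enum import Enum


class CoordinateDataType(Enum):
    # Hypothetical subset of the code list, for illustration only.
    INTEGER = "integer"
    REAL = "real"


@dataclass
class OrdinalCSWithDefault:
    """Reading 1: an attribute with an initial value that callers may override."""

    coordinate_type: CoordinateDataType = CoordinateDataType.INTEGER


class OrdinalCSConstrained:
    """Reading 2: no settable attribute; the type is fixed by the class."""

    @property
    def coordinate_type(self) -> CoordinateDataType:
        # Read-only: assigning to this property raises AttributeError.
        return CoordinateDataType.INTEGER
```

Under the first reading, `OrdinalCSWithDefault(CoordinateDataType.REAL)` is accepted, which is exactly what the integer-only restriction is meant to rule out.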

About the reduced list of PseudoUnit, we proposed to omit second, microsecond, etc. because a clockSecond pseudo-unit would be identical to SI second unit. A clockMinute may be different than SI minute because of leap second, but I'm not aware of leap micro-second that would make clockSecond different than SI second.

dateTime has been removed because it is covered by the CoordinateDataType code list. The previous enumeration was merging units of measurement and data types together; the new UML separates those two concepts.

For calendarWeek and calendarCentury, I do not know. But since PseudoUnit is a code list, users can add their own if they wish.

RogerLott commented 7 years ago
  1. The three attributes in the CoordinateDataType class should be integer, real and dateTime. (no characterString).

  2. @jetgeo please consider the first paragraph in the previous post from @desruisseaux. The intent is that: (a) the axisUnit of the axis of a TemporalCS shall be either a pseudoUnit as an integer or a pseudoUnit as a real number or a dateTime. The model [including the dateTime correction in (1) above] now seems to facilitate this. (b) for every axis of an OrdinalCS the axisUnit shall be integer. This is not to be a default, it is the only option.

  3. @jetgeo asked:

    "The list of pseudoUnits is much shorter now - should the values microsecond, second, week, century and dateTime have been kept in the list?"

    Microsecond and second are not required as there is no problem in them being converted to SI unit. @marqh advises that the meteorological community do not require week or century. It is this community that is driving the shaping of TemporalCS and PseudoUnit. With the correction in (1) above, dateTime is available to TemporalCS through the CoordinateDataType class. So none of these values microsecond, second, week, century and dateTime are required in the PseudoUnit class.

jetgeo commented 7 years ago

OK, I am a bit confused here now. Must admit that I am not quite following you. @desruisseaux and @RogerLott are correct that the model shows an initial value for coordinateDataType, not a constraint. That would have to be added in addition. Or we can just write the constraint in the definition, but it would not be as visible then. But @RogerLott writes that it is the axisUnit that shall be integer. I thought we were talking about the coordinate values here, and I thought that was something other than the AxisUnit? Sorry if I'm mixing things up.

RogerLott commented 7 years ago

Sorry, me not thinking straight. We are indeed trying to constrain the coordinate data type for OrdinalCS to be only integer. To do this it is necessary for an ordinal CS's axis to use a pseudo-unit of no unit. But @marqh said "- the PseudoUnit no_unit is not recommended to define cell (pixel, voxel) index coordinates for an image providing image space coordinates in continuous image space. It is recommended to use the unitOfMeasure '1' in this case." Is this possible - 1 what? Does not Unit of Measure have to have a unit and a conversion to an SI unit of the same type? If it is not appropriate for the axis of an ordinalCS to use axisUnitID then it will have to use a pseudo-unit. The proposed @marqh definition of no_unit "a quantity that is discrete; no interpretation of the magnitude of the difference between stated values should be inferred and no interpolation between values should be implemented" seems fine, although I do not find "no-unit" particularly intuitive. Could this be changed to "discreteQuantity" to fit the definition in a positive way?

So does this mean that the OrdinalCS class does not need the attribute +coordinateType at all, but does need a constraint (CoordinateSystemAxis.AxisUnit.PseudoUnit = [no_unit / discreteQuantity]) ?

marqh commented 7 years ago

@desruisseaux

"About prefix to pseudoUnit, actually I do not propose to replace calendar by clock everywhere, but to keep calendar for year, month and day and use the clock prefix only for hour and minute."

I prefer calendarHour to clockHour as it is the calendar that defines the variable length: "which hour on which day contains a leap second" is defined by the calendar.

@RogerLott I think that a unit of measure of '1' is a recognised and used unit of measure within SI. Examples include kg/kg = 1, and mandating a unit of '1' for the exponent if something is raised to a power provided by a quantity. (I think there are more cases.)

I think that 'discreteQuantity' instead of 'no_unit' may well be more clear and a better description/name.

I don't think that an OrdinalCS must have this pseudoUnit of discreteQuantity though. I think the unit of '1' used by an ordinalCS is a useful way of distinguishing image pixel space from the generic grid concept. I think that we can be useful by not being too restrictive in these cases, which are already edge cases in terms of referencing functionality. I think we could explore this nuance more as we write the advice for referencing by coordinates for coverages over the coming weeks.

Based on this, I think OrdinalCS should be constrained to be used by coordinates of type 'integer', but should not have its unit constrained. I would like to work this through into a few coverage style cases to challenge or validate this view.
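As a starting point for such cases, the view above can be sketched in Python: coordinates must be integers, while the axis unit is left unconstrained. The class `OrdinalAxis`, the function `validate_ordinal_coordinates` and the axis names are hypothetical, for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class OrdinalAxis:
    name: str
    # Deliberately unconstrained here: could be '1', 'discreteQuantity', ...
    axis_unit: str


def validate_ordinal_coordinates(
    axes: Sequence[OrdinalAxis], coordinate: Sequence[object]
) -> None:
    """Raise if the coordinate tuple is not all integers, one per axis."""
    if len(coordinate) != len(axes):
        raise ValueError("expected one coordinate value per axis")
    for axis, value in zip(axes, coordinate):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"axis {axis.name!r} requires an integer, got {value!r}")


# An image-like axis with unit '1' next to a generic grid index axis.
example_axes = [OrdinalAxis("column", "1"), OrdinalAxis("grid index", "discreteQuantity")]
```

Calling `validate_ordinal_coordinates(example_axes, (3, 7))` passes regardless of the units, while `(3.5, 7)` raises `TypeError`.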

RogerLott commented 7 years ago

@jetgeo I think we should do as @marqh suggests, and keep this issue open whilst we work through a few use cases. However, in the meantime two changes are required to the above diagram: i) in the PseudoUnit class, change the attribute +no_unit to +discreteQuantity; ii) do something with the OrdinalCS class to have it constrained to have a coordinate type of integer (the only permissible type).

jetgeo commented 7 years ago

Something like this?

[Figure: coordinate systems]

jetgeo commented 7 years ago

Set all pseudoUnits (except discreteQuantity) to "calendar....", after discussion with @RogerLott. Closing this issue.

[Figure: coordinate systems]