Closed marqh closed 7 years ago
See a preliminary figure below. I am not sure how to handle the constraints: which situations require integers, which require floats, and so on?
@jetgeo the situations that require a coordinate data type constraint are for OrdinalCS and TemporalCS. In some cases, an explicit constraint is provided on these CS instances.
For example,
I think that an optional attribute on TemporalCS and OrdinalCS, called CoordinateDataType and constrained to known data types, provides for all these requirements.
Martin suggested that TemporalCS may be a specialisation of OrdinalCS within this model, limiting the general case to 1 dimension and used for time; this may be useful for managing the constraint, or it may introduce too many levels of specialisation. I am not sure.
There is a specialisation already; perhaps only ExtendedCS is required, but this could be renamed to OrdinalCS.
@jetgeo in order to support the requirements from the Coverage community, I think an extra entry is required in the pseudoUnit code list, to enable a generic, non-temporal case to be defined.
The known case for this is index values, which have no sensible unit. I would suggest that None is a valid entry for this code list. I would prefer this to be explicit, i.e.
pseudoUnit = None
but I think it is acceptable to allow the axisUnit to be missing from the CoordinateSystemAxis instead.
@jetgeo on the subject of the pseudounit code list. I am keen to provide the functionality for temporal periods, such as geological eras and other discrete quantities.
This seems like a general case that should be handled simply, rather than overly controlled
I am also conscious that we should not have a code list which just keeps growing to meet bespoke use cases
I think having a pseudoUnit of None is a useful way of explicitly stating that the coordinate values do not have a unit
This is also making the explicit statement that the CoordinateSystemAxis with pseudounit=None is a discrete axis, which is a useful and valuable statement for many scenarios, including geological era and index.
So, I think I am more in favour of an explicit entry of None in the PseudoUnit code table than a nullable unit/pseudounit.
For coordinateDataType, I think this can be limited to string, integer and float; I don't have reasons for more data types than this.
This attribute should be able to be omitted; it is not mandatory.
Do you think this needs a code list, or is data type easily referenced from somewhere else?
many thanks mark
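The optional attribute described above could be sketched roughly as follows. This is an illustration only: the names CoordinateDataType and TemporalCS come from the discussion, but the Python layout, the field name coordinate_data_type and the enum member names are assumptions, not part of the 19111 model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CoordinateDataType(Enum):
    """Hypothetical code list limited to the three types suggested above."""
    STRING = "string"
    INTEGER = "integer"
    FLOAT = "float"


@dataclass
class TemporalCS:
    """Illustrative stand-in for TemporalCS; not the normative model."""
    name: str
    # Optional: None means no constraint is stated on the coordinate values.
    coordinate_data_type: Optional[CoordinateDataType] = None


# The attribute may be omitted entirely...
unconstrained = TemporalCS("time axis")
# ...or set to one of the known data types.
counted = TemporalCS("model time step", CoordinateDataType.INTEGER)
```

Whether the allowed values live in a dedicated code list, as here, or are referenced from elsewhere is exactly the open question in the comment above.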
@jetgeo Based on conversation with Martin and Roger, we believe that modifications to this figure as shown below better represent the concepts and constraints required to meet these needs
We do not believe that the ExtendedCS is required at all; this is to be removed.
Additionally, in the CS CRS Associations diagram the EngineeringCS class, which is a union, needs to have OrdinalCS added to it.
@desruisseaux @RogerLott @jetgeo
Following discussion and consideration, I propose an amended PseudoUnit code list content. I have included attribute descriptions and informative notes for each entry.
I believe that the code list for PseudoUnit should be mandated to have no overlap with the allowed entries for UnitOfMeasure by this standard; may we do this?
This is because individual instances will only state
axisUnit = <value>
so we must enable the distinction between hour as a unit of measure and hour as a pseudoUnit, only from the key=value pair.
Is this a valid constraint?
If so, I propose a new distinction: that we use calendarHour as the pseudoUnit value, to distinguish it from hour as an SI unit defined as an exact number of seconds.
I believe that this is the complete list; I propose it for further discussion.
* calendarMinute
  * a minute quantity, defined within a calendar and clock system but not defined as a fixed number of SI seconds
* calendarHour
  * an hour quantity, defined within a calendar and clock system but not defined as a fixed number of SI seconds
* calendarDay
  * a day quantity, defined within a calendar and clock system but not defined as a fixed number of SI seconds
* calendarMonth
  * a month quantity, defined within a calendar and clock system but not defined as a fixed number of SI seconds
* calendarYear
  * a year quantity, defined within a calendar and clock system but not defined as a fixed number of SI seconds
* no_unit
  * a quantity that is discrete; no interpretation of the magnitude of the difference between stated values should be inferred and no interpolation between values should be implemented.
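The no-overlap requirement can be illustrated with a small sketch. The pseudo-unit names are copied from the list above; the set of unit-of-measure names is an assumed, deliberately tiny subset, and classify_axis_unit is a hypothetical helper, not something defined by the standard.

```python
# Proposed PseudoUnit entries, copied from the list above.
PSEUDO_UNITS = {
    "calendarMinute",
    "calendarHour",
    "calendarDay",
    "calendarMonth",
    "calendarYear",
    "no_unit",
}

# Assumption: a small illustrative subset of unit-of-measure names.
UNITS_OF_MEASURE = {"second", "minute", "hour", "day", "metre", "1"}


def classify_axis_unit(value: str) -> str:
    """Say which list an axisUnit value belongs to, given only the value."""
    in_pseudo = value in PSEUDO_UNITS
    in_uom = value in UNITS_OF_MEASURE
    if in_pseudo and in_uom:
        # This is the case the no-overlap rule is meant to prevent.
        raise ValueError(f"ambiguous axisUnit: {value!r}")
    if in_pseudo:
        return "PseudoUnit"
    if in_uom:
        return "UnitOfMeasure"
    raise ValueError(f"unknown axisUnit: {value!r}")
```

With the calendar prefix in place the two sets are disjoint, so a bare `axisUnit = hour` can only be read as a unit of measure and `axisUnit = calendarHour` only as a pseudo-unit.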
The PseudoUnit Description text is proposed to be
PseudoUnits are provided to enable the definition of the quantity a coordinate is describing,
where this quantity does not have a valid SI unit of measure.
I also propose some further informative/clarification text, perhaps to be attached to the PseudoUnit as a note.
- the PseudoUnit calendarMinute is defined within a clock and calendar system, where different
individual minute instances have a different length in seconds, due to
leap second correction factors.
- the PseudoUnit no_unit may be used to describe a coordinate that is an index value for
a discontinuous, size varying or otherwise arbitrary grid dimension
- the PseudoUnit no_unit is not recommended to define cell (pixel, voxel) index coordinates
for an image providing image space coordinates in continuous image space.
It is recommended to use the unitOfMeasure '1' in this case.
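As a rough illustration of the first note, the sketch below shows why a calendar minute is not a fixed number of SI seconds. The lookup table is an assumption made for the example: it is hand-maintained and lists only the UTC minutes that ended with the leap seconds inserted in 2015 and 2016. The function name is hypothetical.

```python
from datetime import datetime

# Assumption: a hand-maintained set of UTC minutes that end with an inserted
# leap second. Only the 2015 and 2016 leap seconds are listed here.
LEAP_SECOND_MINUTES = {
    datetime(2015, 6, 30, 23, 59),
    datetime(2016, 12, 31, 23, 59),
}


def calendar_minute_length(moment: datetime) -> int:
    """Length in SI seconds of the calendar minute containing `moment`."""
    # Truncate to the start of the minute before the lookup.
    minute_start = moment.replace(second=0, microsecond=0)
    return 61 if minute_start in LEAP_SECOND_MINUTES else 60
```

A real implementation would take its leap second table from an authoritative source rather than hard-coding it.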
The sentence in
C.3.2 Coordinate system axis
The AxisUnit class contains four attributes. One of the temporalCount temporalMeasure or
temporalString quantities is used for a temporal CS axis. AxisUnitID is used for non-temporal
coordinate system axes.
needs to be amended, based on these changes. Perhaps it could just be removed, as the simplification in this ticket does not need this level of explanation. The last sentence is now wrong under the proposed approach, as an AxisUnitID, e.g. second, may be used for a temporalCS axis.
The overlap between UnitOfMeasure and PseudoUnit is language-specific. It exists in languages where axisUnit = <value> is all the information provided, but does not exist in strongly-typed languages like Java or C++ where the type has the effect of a name space. I think that UML is more like strongly-typed languages.
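The point about the type acting as a name space can be sketched as follows. Both enumerations are illustrative subsets invented for this example, each with a single hour entry; Python is used only for consistency with the other sketches in this thread.

```python
from enum import Enum


class UnitOfMeasure(Enum):
    """Illustrative subset; 'hour' here is an exact number of seconds."""
    hour = "hour"


class PseudoUnit(Enum):
    """Illustrative subset; 'hour' here is a calendar-defined hour."""
    hour = "hour"


# With the type available, the two entries are distinct values.
typed_a = UnitOfMeasure.hour
typed_b = PseudoUnit.hour
distinct_when_typed = typed_a != typed_b

# Written out as a bare key=value pair, the distinction is lost.
same_when_serialised = (
    f"axisUnit = {typed_a.value}" == f"axisUnit = {typed_b.value}"
)
```

Here distinct_when_typed is true while same_when_serialised is also true, which is the situation the no-overlap rule is meant to address for encodings that carry only the value.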
Nevertheless, names like calendarMonth add clarity, so I have no objection. But I wonder if calendarMinute and calendarSecond should be clockMinute and clockSecond instead?
The rest of the proposal looks fine to me.
I agree that an adjectival prefix to the pseudounits adds clarity. I have no preference for calendar or clock. If I were forced to choose between these two options, I guess clock is more obviously associated with time, but its application to months that have no subdivisions may be hard to grasp.
About the ExtendedCS class: The figure @marqh presented above will not work. The OrdinalCS class does not have a coordinateDataType attribute in that figure. The reason I added the ExtendedCS class was to give the coordinateDataType attribute to both OrdinalCS and TemporalCS. An alternative would be to have OrdinalCS as a subtype of TemporalCS.
We discussed defining OrdinalCS as a sub-type of TemporalCS exactly for that reason. But the proposal became to keep those two types separate, have the coordinateType attribute only in TemporalCS, and set a constraint on OrdinalCS saying that the coordinate type must be integer (no other type allowed). Maybe constraint: coordinateType=integer is not the proper way to express that constraint, since OrdinalCS indeed has no coordinateType attribute, but I guess that if we cannot write the constraint that way we could write it in some other way, maybe in plain text.
About the prefix to pseudoUnit, actually I do not propose to replace calendar by clock everywhere, but to keep calendar for year, month and day and use the clock prefix only for hour and minute.
OK, here's a suggested modified model. The requirement on the data type for coordinates in an OrdinalCS can be handled either as shown in the model or as part of the class definition.
The list of pseudoUnits is much shorter now - should the values microsecond, second, week, century and dateTime have been kept in the list?
Thanks for the update. Doesn't coordinateType: CoordinateDataType = integer mean that OrdinalCS has a coordinateType whose default value is integer? Maybe the idea that OrdinalCS has no additional attribute (compared to its CoordinateSystem parent class) but has coordinates restricted to integers can only be specified in the text?
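The difference between a default value and a constraint can be made concrete with a short sketch. Both classes are hypothetical stand-ins written for this comparison; neither is the modelled OrdinalCS, and the string "integer" stands in for the CoordinateDataType entry.

```python
from dataclasses import dataclass


@dataclass
class OrdinalCSWithDefault:
    """Reading 1: integer is only an initial value and can be overridden."""
    coordinate_type: str = "integer"


@dataclass
class OrdinalCSConstrained:
    """Reading 2: integer is the only permitted coordinate type."""
    coordinate_type: str = "integer"

    def __post_init__(self) -> None:
        # The constraint is enforced; any other value is rejected.
        if self.coordinate_type != "integer":
            raise ValueError("OrdinalCS coordinates must be integer")


# Under reading 1 this is accepted, which is not the intent.
overridden = OrdinalCSWithDefault("real")
```

Under the first reading, a user can silently set another type; under the second, `OrdinalCSConstrained("real")` raises an error. The UML notation discussed above corresponds to the first reading, which is why a separate constraint, or plain text, is needed.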
About the reduced list of PseudoUnit, we proposed to omit second, microsecond, etc. because a clockSecond pseudo-unit would be identical to the SI second unit. A clockMinute may be different from an SI minute because of leap seconds, but I'm not aware of any leap microsecond that would make clockSecond different from the SI second.
dateTime has been removed because it is covered by the CoordinateDataType code list. The previous enumeration was merging units of measurement and data types together; the new UML separates those two concepts.
For calendarWeek and calendarCentury, I do not know. But since PseudoUnit is a code list, users can add their own if they wish.
1. The three attributes in the CoordinateDataType class should be integer, real and dateTime (no characterString).
2. @jetgeo please consider the first paragraph in the previous post from @desruisseaux. The intent is that: (a) the axisUnit of the axis of a TemporalCS shall be either a pseudoUnit as an integer or a pseudoUnit as a real number or a dateTime. The model [including the dateTime correction in (1) above] now seems to facilitate this. (b) for every axis of an OrdinalCS the axisUnit shall be integer. This is not to be a default, it is the only option.
3. @jetgeo asked:
The list of pseudoUnits is much shorter now - should the values microsecond, second, week, century and dateTime have been kept in the list?
Microsecond and second are not required, as there is no problem in converting them to the SI unit. @marqh advises that the meteorological community does not require week or century. It is this community that is driving the shaping of TemporalCS and PseudoUnit. With the correction in (1) above, dateTime is available to TemporalCS through the CoordinateDataType class. So none of the values microsecond, second, week, century and dateTime is required in the PseudoUnit class.
OK, I am a bit confused here now. Must admit that I am not quite following you. @desruisseaux and @RogerLott are correct that the model shows an initial value for coordinateDataType, not a constraint. That would have to be added in addition. Or we can just write the constraint in the definition, but it would not be as visible then. But @RogerLott writes that it is the axisUnit that shall be integer. I thought we were talking about the coordinate values here, and I thought that was something other than the AxisUnit? Sorry if I'm mixing things up.
Sorry, me not thinking straight. We are indeed trying to constrain the coordinate data type for OrdinalCS to be only integer. To do this it is necessary for an ordinal CS's axis to use a pseudo-unit of no unit. But @marqh said "- the PseudoUnit no_unit is not recommended to define cell (pixel, voxel) index coordinates for an image providing image space coordinates in continuous image space. It is recommended to use the unitOfMeasure '1' in this case." Is this possible - 1 what? Does not Unit of Measure have to have a unit and a conversion to an SI unit of the same type? If it is not appropriate for the axis of an ordinalCS to use axisUnitID then it will have to use a pseudo-unit. The proposed @marqh definition of no_unit "a quantity that is discrete; no interpretation of the magnitude of the difference between stated values should be inferred and no interpolation between values should be implemented" seems fine, although I do not find "no-unit" particularly intuitive. Could this be changed to "discreteQuantity" to fit the definition in a positive way?
So does this mean that the OrdinalCS class does not need the attribute +coordinateType at all, but does need a constraint (CoordinateSystemAxis.AxisUnit.PseudoUnit = [no_unit / discreteQuantity]) ?
@desruisseaux
About prefix to pseudoUnit, actually I do not propose to replace calendar by clock everywhere, but to keep calendar for year, month and day and use the clock prefix only for hour and minute.
I prefer calendarHour to clockHour as it is the calendar that defines the variable length: "which hour on which day contains a leap second" is defined by the calendar.
@RogerLott I think that a unit of measure of '1' is a recognised and used unit of measure within SI. Examples include kg/kg = 1, and mandating a unit of '1' for the exponent if something is raised to a power provided by a quantity. (I think there are more cases.)
I think that 'discreteQuantity' instead of 'no_unit' may well be more clear and a better description/name.
I don't think that an OrdinalCS must have this pseudoUnit of discreteQuantity though. I think the unit of '1' used by an ordinalCS is a useful way of distinguishing image pixel space from the generic grid concept. I think that we can be useful by not being too restrictive in these cases, which are already edge cases in terms of referencing functionality. I think we could explore this nuance more as we write the advice for referencing by coordinates for coverages over the coming weeks.
Based on this, I think OrdinalCS should be constrained to be used by coordinates of type 'integer', but should not have its unit constrained. I would like to work this through into a few coverage style cases to challenge or validate this view.
@jetgeo I think we should do as @marqh suggests and keep this issue open whilst we work through a few use cases. However, in the meantime two changes are required to the above diagram: i) in the PseudoUnit class, change the attribute +no_unit to +discreteQuantity; ii) do something with the OrdinalCS class to constrain it to a coordinate type of integer (the only permissible type).
Something like this?
Set all pseudoUnits (except discreteQuantity) to "calendar....", after discussion with @RogerLott. Closing this issue.
The aim of this ticket is to propose a comment to the 19111 editing committee raising this issue for discussion.
19111 CD table 45 (p51): There may be an inconsistency or a mis-interpretation in the Data Type field usage in this table.
The data type for axisUnitID is 'unitOfMeasure'. Are the definitions of:
- temporalCount: Integer
- temporalMeasure: Measure
- temporalString: DateTime

consistent with this? Is the Data Type field the data type of the attribute, or the data type of the coordinate values being referred to? This is the root of my mis-interpretation or inconsistency.
If this is the datatype of the attribute within the CRS definition, should an equivalent data type be defined for the temporalCount and temporalMeasure, that supports unit-like temporal quantities?
I'd suggest that, rather than providing further code tables and restrictions, this Data Type could be provided as 'String'. Alternatively, a code list of temporalQuantities could be provided with the allowed values expressed, e.g. microsecond, second, minute, hour, day, week, month, year, century, dateTime.
Associated with this table, I would propose that a note be added, to ensure clarity on the implication for the data type of the coordinate values. This should state that:
- temporalCount values shall be integers
- temporalMeasure values shall be floats
- temporalString values shall be strings
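The proposed note could be read as a simple value check, sketched below. The three attribute names come from the table under discussion; the mapping to Python types and the helper check_coordinate_value are assumptions made for illustration only.

```python
# Mapping taken from the proposed note; the Python types are an assumption
# about how an implementation might represent each kind of value.
EXPECTED_VALUE_TYPE = {
    "temporalCount": int,
    "temporalMeasure": float,
    "temporalString": str,
}


def check_coordinate_value(kind: str, value: object) -> bool:
    """True if `value` has the type the note would require for `kind`."""
    expected = EXPECTED_VALUE_TYPE[kind]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool):
        return False
    return isinstance(value, expected)
```

For example, `check_coordinate_value("temporalCount", 3)` passes while `check_coordinate_value("temporalMeasure", 3)` does not, because the integer 3 is not a float under this strict reading.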
If, however, my interpretation is wrong, and the data type is explicitly the data type of the coordinate values, I suggest that a note be added clarifying this interpretation and explaining how data type should be interpreted for this table. In most tables the data type appears to refer to the data type of the value for that attribute, not to coordinate values.
I have found myself concerned that this detail is not clear enough in the document to enable implementers to work with confidence.