Parameter: content - Githubissues

Sorry.. this is a long one.

The Source -> Parameter relation is very similar to the Cube model’s NDPoint -> Observable relation. In Cube, each Observable owns a Measure instance, and adds knowledge of whether this is ‘dependent’ or ‘independent’ data. But here, each Parameter ends up taking the place of VODML role and type from a formally modeled Source object.
ie: instead of Source.position:Position[*] we have Source with Parameter{semantic=“source position”, ucd=“pos.eq”}

I understand that this model is trying to be generic, and specifically NOT to model Source explicitly, so I think ‘Source has a collection of Property-s’ is a good mechanism. But instead of providing access to various types of Properties, it has become something that lets you build proxies for things which are not formally modeled, which I think is outside the model scope.

Parameter.semantic:

Replaces VODML role (ie: attribute/relation name)
- I’m concerned that this approach of replacing model elements with semantic vocabularies is going to be a maintenance problem for us, and an implementation problem for clients. If the Parameter.semantic vocabulary can be “totally free as long as it is published” then there is nothing fixed for clients to key off of to know how to interpret any given Parameter. In other words, different vocabularies can/will define different terms for the same concept, which is exactly the sort of problem our standards are supposed to be solving.

Parameter.ucd:

Replaces VODML type (ie: expected Type of the ‘value’)
Has the benefit of facilitating the use of concepts with no formally modeled Measure type; “phys.magField”, “phot.mag”..
- I’ll note that I believe this is Markus’ argument for not having specialized Measure types at all, but only a single Measurement with a semantic tag to identify its nature (ala ucd).
- In my opinion, this form may be fine for a serialization, but it makes it VERY difficult to specify dependencies/constraints in the models
  - If ucd = “pos.eq” then associated Coordinate SpaceFrame MUST have referenceFrame=“ICRS|FK4|FK5” and Spherical coordinate space
Has the vulnerability of being a consistency problem
- If ucd = “pos.eq” and the measure is “meas:Position but in GALACTIC”, the client will have to handle the inconsistency
- If ucd = “phot.mag” and the measure is “meas:GenericMeasure”, the client STILL needs to do all the work to determine if the GenericMeasure content is compatible with “phot.mag” type. If they are doing that, then they can identify it as a “phot.mag” without the prompt. NOTE: doing this MAY mean drilling down to the VOTable element, and checking the UCD on the PARAM|FIELD.. noticing that it is “phot.mag”
Having the ucd here does not solve the GenericMeasure problem, since it does not help identify dependent metadata
- If Parameter.ucd = “phot.mag” or “phot.flux” there should/must be an associated “photDM.PhotCal” instance.. how do they know that? where would they find it? This exact scenario is in the TimeSeries workshop use case.
- I don’t think these sorts of associations are for this model to solve. It is basically constructing model elements.
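
To make the consistency concern concrete, here is a minimal sketch of the check a client would end up writing. It encodes only the one rule quoted above (“pos.eq” implies ICRS, FK4 or FK5); the names `UCD_FRAME_RULES` and `check_ucd_consistency` are hypothetical and not part of MANGO or any library.

```python
# Hypothetical rule table: UCD -> reference frames allowed for that UCD.
# Only the "pos.eq" rule mentioned in this comment is encoded.
UCD_FRAME_RULES = {
    "pos.eq": {"ICRS", "FK4", "FK5"},
}


def check_ucd_consistency(ucd, reference_frame):
    """Return True if the frame is allowed for the UCD, or if no rule exists."""
    allowed = UCD_FRAME_RULES.get(ucd)
    if allowed is None:
        # No constraint known for this UCD: nothing to check.
        return True
    return reference_frame in allowed


print(check_ucd_consistency("pos.eq", "ICRS"))
print(check_ucd_consistency("pos.eq", "GALACTIC"))
```

Every such rule lives in client code rather than in the model, which is the dependency/constraint problem raised in this section.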

Parameter.measure:

Is the parameter value, which may or may not be of the type identified in the ucd
- This can be a good thing ( qualifying GenericMeasure as “phot.flux” or “phys.magField” )
- Or a consistency problem ( ucd=“pos.eq” with measure=Time )
There is only 1 option here.. Parameter contains Measure
- The model text describes that there are other kinds of parameters ( flags, assigned states, classifications ). By only having a Measurement option, the model has improperly extended Measure and Coordinate for these data. That will be another ticket, but I think there is work to do here on how to handle non-measure properties.

Proposal:

I would suggest splitting the Parameter into sub-classes
- Parameter: abstract parent. contains reference to associated parameter if that is needed (haven’t looked into that use case)
- PhysicalParameter: extends Parameter, contains Measure instance
- Classification: extends Parameter, contains a vocabulary literal (VocabularyTerm)
  - removes need for VocabMeasure and VocabCoordinate which are not proper extensions of those models
- Flag: extends Parameter, contains what basically amounts to a user-defined enumeration value
  - value = integer (OK to start, but in Chandra we have bit array flags where each bit represents a different issue )
  - options = pointer to what is currently defined as FlagSys
  - Removes FlagCoord, FlagSys becomes local class as part of Flag Property spec, not extension of CoordSys
None of these would have ‘semantic’ or ‘ucd’ attributes to qualify the value.
- In the PhysicalParameter we’d need to have a discussion on how to handle the complex unmodeled Measure types.
  - The ‘simple’ ones, can be handled by clients interpreting units and/or the underlying VOTable element ucd.

Sorry.. this is a long reply.

The scope of Mango is to provide a way for clients to "understand" all properties attached to one given dataset. The diversity of the datasets possibly mapped onto Mango is so huge that we cannot consider building a classical model binding sources with a predetermined set of properties with specific roles.

Too many different types of properties to consider having one class (VODML type) for each
Too many different property roles to consider having one VODML role for each.

To work around this, MANGO considers a source as an open set of properties (not speaking about associated data)

Each property is made with
- one semantic block giving the parameter role
- one measure giving the value of the measure in a broad sense.
- links to other properties

To make this work we need:

- The semantic block to be able to qualify most of the properties on the market.
- All measure objects to be built with the same pattern, in a way that lets clients parse them with generic code (up to a certain extent). In terms of modeling, all measures (in a broad sense) must extend a common ancestor that implements that pattern. This is applicable to all measures (physical measures, computed values, flags and whatever)

The way it has been done:

semantic block: UCDs are used as primary identifiers for measure roles.
- UCDs cover by construction all Vizier columns and there is no reason to build another semantic scheme.
- When UCDs are too loose, we can use a vocabulary to refine the role (we haven't worked a lot on the semantic tags yet)
- When UCD+semantic is too loose, one can use a textual description, which will likely end up in some warning popup.
measure: All measures are built on the MCT top-level pattern:
- For MCT a measure is basically a coordinate + an error
- A coordinate is a value + a description of the coordinate system.
- Measures can come without errors, just as coordinates can come without a coordinate system.
- We just extend this pattern to parameters that are usually not seen as measures. For instance, a quality flag is a measure with an integer value, without error, and valid in a coordinate system made of one discrete axis (see mango:extcoords.flagsys)

If things are well done, we should be able to propose a MANGO API looking like this:

instance = Mango.get_instance(votable)
# Get all semantic blocks
available_properties = instance.get_properties()
# Get a specific measure
measure = instance.get_property("my.nice.ucd", vocabulary=None, description=None)
error = measure.get_error()
value = measure.get_value()
coord_system = measure.get_coordsys()
frame = coord_system.get_frame()

This is valid for any measure, which is rather cool.
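
For illustration, the access pattern above could be backed by an in-memory stand-in like the following. This is a sketch under stated assumptions, not the MANGO API: the classes `CoordSys`, `Measure` and `MangoInstance` are hypothetical, the property values are made up, and nothing is read from a VOTable.

```python
class CoordSys:
    """Hypothetical coordinate system; the frame is optional."""

    def __init__(self, frame=None):
        self._frame = frame

    def get_frame(self):
        return self._frame


class Measure:
    """Common pattern: a value, an optional error, an optional coordinate system."""

    def __init__(self, value, error=None, coordsys=None):
        self._value = value
        self._error = error
        self._coordsys = coordsys

    def get_value(self):
        return self._value

    def get_error(self):
        return self._error

    def get_coordsys(self):
        return self._coordsys


class MangoInstance:
    """Hypothetical container mapping a UCD to its measure."""

    def __init__(self, properties):
        self._properties = dict(properties)

    def get_properties(self):
        return sorted(self._properties)

    def get_property(self, ucd, vocabulary=None, description=None):
        # vocabulary and description are accepted but unused in this sketch.
        return self._properties[ucd]


# Made-up content: a position with an error and a frame, and a quality flag
# with an integer value, no error, and a coordinate system without a frame.
instance = MangoInstance({
    "pos.eq": Measure((10.68, 41.27), error=0.001, coordsys=CoordSys("ICRS")),
    "meta.code.qual": Measure(2, coordsys=CoordSys()),
})

# The same generic loop handles both kinds of measure.
for ucd in instance.get_properties():
    measure = instance.get_property(ucd)
    coordsys = measure.get_coordsys()
    frame = coordsys.get_frame() if coordsys is not None else None
    print(ucd, measure.get_value(), measure.get_error(), frame)
```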

I took a bit of time to recap this because I would really like to avoid amending the model in a way that breaks this homogeneity.

Parameter.semantic:

You are right. As the content is totally free, this field cannot be assumed to give a role. It must be seen as a secondary qualifier, something helping some clients. I think I should make it optional, and the spec must be refined as well.

Parameter.ucd:

Replaces VODML type (ie: expected Type of the ‘value’)
Has the benefit of facilitating the use of concepts with no formally modeled Measure type; “phys.magField”, “phot.mag”..
    I’ll note that I believe this is Markus’ argument for not having specialized Measure types at all, but only a single Measurement with a semantic tag to identify its nature (ala ucd).

I'm somewhere in between Markus and you: UCDs for roles, but we need a model for the structure of the measures

    In my opinion, this form may be fine for a serialization, but is VERY difficult to specify dependencies/constraints in the models
        If ucd = “pos.eq” then associated Coordinate SpaceFrame MUST have referenceFrame=“ICRS|FK4|FK5” and Spherical coordinate space
Has the vulnerability of being a consistency problem
    If ucd = “pos.eq” and the measure is “meas:Position but in GALACTIC”, the client will have to handle the inconsistency

It is true. I would say this is the cost of the flexibility. This problem arises each time the same thing has more than one identifier (UCD + dmtype here). Note that a classical model does not prevent this; it just shifts the risk onto the mapping (you can map pos.eq onto a galactic position)

    If ucd = “phot.mag” and the measure is “meas:GenericMeasure”, the client STILL needs to do all the work to determine if the GenericMeasure content is compatible with “phot.mag” type. If they are doing that, then they can identify it as a “phot.mag” without the prompt. NOTE: doing this MAY mean drilling down to the VOTable element, and checking the UCD on the PARAM|FIELD.. noticing that it is “phot.mag”
Having the ucd here does not solve the GenericMeasure problem, since it does not help identify dependent metadata
    If Parameter.ucd = “phot.mag” or “phot.flux” there should/must be an associated “photDM.PhotCal” instance.. how do they know that? where would they find it? This exact scenario is in the TimeSeries workshop use case.

Mango has no concept like (in)dependent metadata.

I'm not sure I follow you. Inspired by the above code snippet, you could do a check like this:

if generic_measure.get_coordsys()["@dmtype"] == "PhotometricSys":
    print("this measure really looks like a photometric measure")

What can we do if the curator randomly mixes up data UCDs and classes? Mango is a very flexible model designed to map various data, but it relies on the thoroughness of the data provider. This can be seen as a weakness, but to me, the benefit/cost ratio is more than positive.

Parameter.measure:

Is the parameter value, which may or may not be of the type identified in the ucd
    This can be a good thing ( qualifying GenericMeasure as “phot.flux” or “phys.magField” )
    Or a consistency problem ( ucd=“pos.eq” with measure=Time )

This can easily be checked (see previous post)

There is only 1 option here.. Parameter contains Measure
    The model text describes that there are other kinds of parameters ( flags, assigned states, classifications ). By only having a Measurement option, the model has improperly extended Measure and Coordinate for these data. That will be another ticket, but I think there is work to do here on how to handle non-measure properties.

I admit that the way I extend Measure/Coordinate might look odd, but I claim it is valid. What I'm doing with flags is not that different from what you propose for the Polarimetry.

Mango needs a common interface for all measures (including flags, assigned states, classifications). We can imagine an intermediate layer providing that interface for different categories of measure, but what would be the gain? I admit, however, that the term measure is not the best choice in this case. Nothing better found right now.

Proposal:

I would suggest splitting the Parameter into sub-classes
    Parameter: abstract parent. contains reference to associated parameter if that is needed (haven’t looked into that use case)
    PhysicalParameter: extends Parameter, contains Measure instance
    Classification: extends Parameter, contains a vocabulary literal (VocabularyTerm)
        removes need for VocabMeasure and VocabCoordinate which are not proper extensions of those models
    Flag: extends Parameter, contains what basically amounts to a user-defined enumeration value
        value = integer (OK to start, but in Chandra we have bit array flags where each bit represents a different issue )
        options = pointer to what is currently defined as FlagSys
        Removes FlagCoord, FlagSys becomes local class as part of Flag Property spec, not extension of CoordSys

This may work, but I do not see the benefit of such a complication.

None of these would have ‘semantic’ or ‘ucd’ attributes to qualify the value.

There are thousands of different UCDs; we need them. The information carried by MCT classes is not enough.

    In the PhysicalParameter we’d need to have a discussion on how to handle the complex unmodeled Measure types.
        The ‘simple’ ones, can be handled by clients interpreting units and/or the underlying VOTable element ucd.

I prefer not to make a distinction between simple and not simple measures
At the model level, I prefer not to rely on the VOTable element.
- Notice that the model can be used in other contexts (FITS, CSV, REST endpoint)
- It is the job of the mapping to say whether a UCD must be taken from a FIELD or not.
Allowing the model to override existing UCDs is necessary in my opinion.

Laurent,

The scope of Mango is to provide a way for clients to "understand" all properties attached to one given dataset

Paraphrased, I'd say the goal is "to model Source and its various Properties"

To work this around MANGO considers a source as an open set of properties

Right.. so at first level you have Source has a collection of Property-s

But then, when you look at the kinds of properties, there are at least 2 inherently different categories

those based on physical entities, either measured or derived (Position, Time, Flux, Magnitude, HardnessRatio..). Whose values reside in a particular coordinate space, etc
those which identify what kind of thing (Source) we have (SpectralType, CelestialClass, LuminosityClass, MorphologyClass). Whose values are just an entry from a controlled vocabulary.

I don't think you should necessarily expect the interface to these to be the same..

MANGO API code block example

In my opinion, that thread should utterly fail for the 2nd type. It makes no sense to examine the coordSys or RefFrame or Errors of a MorphologyClass. They simply don't apply.

I admit that the way I extend Measure/Coordinate might look odd, but I claim it is valid.

I assure you that Measurement was not intended to be extended in this way. Keep in mind that the Measurement model is designed to support the Cube case, which has a lot of parallels with this one. It too has Quality Flags and other sorts of qualifiers which are not covered by the Measurement model, because they are not within its scope.

What I'm doing with flags is not that different of what you propose for the Polarization (sp).

This is true.. I was very uncertain about including the enumerated PolarizationState in the Measurement model for this very reason.. it is technically not a measured entity, but an assigned state. I included it because

we haven't really broached this subject in the DM group until now
I feel it is more closely tied to a physical property than other classifications/flags
I expect PolarizationFraction to become a use case in the future, and would like to allow both to be under the same Polarization type.

Given this discussion, I could be convinced to reconsider that choice.

quality flag discussion

The usage threads related to Source data often include evaluating the quality of, or usefulness of the Properties. These are externally determined and assigned to a particular Property (or Source record as a whole?). I assert that this is a more important relation than a simple 'Associated Property'. The flag has little meaning on its own, but is a qualifier on the Property to which it is assigned.

This is why I suggest the flag is not a Property itself (you aren't going to perform analysis on the Flag), and would be better modeled as an attribute on Property (maybe just PhysicalProperty) so that there is a common access point to this very important qualifier.
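
As a small illustration of this suggestion, a property could carry its flags directly, so that clients have one access point for the qualifier. This is only a sketch: the `PhysicalProperty` constructor, the flag label and the values are hypothetical.

```python
class PhysicalProperty:
    """Hypothetical property that carries its quality flags as an attribute."""

    def __init__(self, name, value, flags=()):
        self.name = name
        self.value = value
        # Flags qualify this property; they are not properties themselves.
        self.flags = list(flags)


# Made-up example data.
flux = PhysicalProperty("flux", 1.2e-13, flags=["low_snr"])
position = PhysicalProperty("position", (10.68, 41.27))

for prop in (flux, position):
    print(prop.name, prop.flags)
```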

re: suggested restructuring of Parameter/Property and non-coordinate Measure-s "This may work, but I do not see the benefit of such a complication."

The suggested changes:

adds 2 Property types (1-abstract head, 1-Classification type, renaming current Parameter as PhysicalProperty )
cuts 14 classes extending Meas/Coords elements

Paraphrased, I'd say the goal is "to model Source and its various Properties"

This is rather a semantic shift. I prefer to keep my "model for source data" with an acronym standing for "Model for Annotating Generic Objects".

But then, when you look at the kinds of properties, there are at least 2 inherently different categories
those based on physical entities, either measured or derived (Position, Time, Flux, Magnitude, HardnessRatio..). Whose values reside in a particular coordinate space, etc
those which identify what kind of thing (Source) we have (SpectralType, CelestialClass, LuminosityClass, MorphologyClass). Whose values are just an entry from a controlled vocabulary.
I don't think you should necessarily expect the interface to these to be the same..

Since there is no way for me to sell my generalisation of Syst/Frame, I would say that is a promising approach.

I feel it is more closely tied to a physical property than other classifications/flags

Polarization is clearly a physical property expressing a coordinate system that is a state enumeration. I'm completely at ease with this.

This is why I suggest the flag is not a Property itself (you aren't going to perform analysis on the Flag), and would be better modeled as an attribute on Property (maybe just PhysicalProperty) so that there is a common access point to this very important qualifier.

I do not agree. The meaning, scope and cardinality of flags are all too flexible to consider them as an attribute with a predetermined role. I really prefer the current semanticless association.

The suggested changes: I'll post a sketch of this proposal in a new issue.

ivoa-std / MANGO

Parameter: content #26