Closed mcdittmar closed 1 month ago
Sorry.. this is a long reply.
The scope of Mango is to provide a way for clients to "understand" all properties attached to one given dataset. The diversity of the datasets that could be mapped onto Mango is so large that we cannot consider building a classical model that binds sources to a predetermined set of properties with specific roles.
To work around this, MANGO considers a source as an open set of properties (not speaking about associated data), each with:
- a semantic block giving the parameter role
- a measure giving the value of the measure in a broad sense.
To make this work we need:
- the semantic block to be able to qualify most of the properties on the market.
- measure objects to be built with the same pattern, so that clients will be able to parse them with generic code (up to a certain extent). In terms of modeling, all measures (in a broad sense) must extend a common ancestor that implements that pattern. This applies to all measures (physical measures, computed values, flags and whatever).
The way it has been done:
- semantic block: UCDs are used as primary identifiers for measure roles.
- measure: All measures are built on the MCT top-level pattern: measure. For instance, a quality flag is a measure with an integer value, without error, and valid in a coordinate system made of one discrete axis (see mango:extcoords.flagsys).
If things are well done, we should be able to propose a MANGO API looking like this:
instance = Mango.get_instance(votable)
# Get all semantic blocks
available_properties = instance.get_properties()
# Get a specific measure
measure = instance.get_property("my.nice.ucd", vocabulary=None, description=None)
error = measure.get_error()
value = measure.get_value()
coord_system = measure.get_coordsys()
frame = coord_system.get_frame()
This is valid for any measure, which is rather cool.
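To make the "same pattern" idea concrete, here is a minimal sketch of how a quality flag could sit behind the same accessors as a physical measure. Every class and field name below (`Measure`, `CoordSys`, the `dmtype` strings, the UCD chosen for the flag) is hypothetical and only illustrates the idea; none of it is taken from the MANGO or MCT specifications.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class CoordSys:
    """Hypothetical coordinate system: a dmtype plus an optional frame."""
    dmtype: str
    frame: Optional[str] = None

    def get_frame(self) -> Optional[str]:
        return self.frame


@dataclass
class Measure:
    """Hypothetical common ancestor: value, optional error, coordinate system."""
    ucd: str
    value: Any
    coordsys: CoordSys
    error: Optional[float] = None

    def get_value(self) -> Any:
        return self.value

    def get_error(self) -> Optional[float]:
        return self.error

    def get_coordsys(self) -> CoordSys:
        return self.coordsys


# A physical measure and a quality flag built with the same pattern.
magnitude = Measure("phot.mag", 17.3, CoordSys("PhotometricSys"), error=0.05)
quality = Measure("meta.code.qual", 2, CoordSys("FlagSys"))  # integer value, no error


def describe(measure: Measure) -> str:
    """Generic client code: no branching on the kind of measure."""
    return (f"{measure.ucd}: {measure.get_value()} +/- {measure.get_error()} "
            f"in {measure.get_coordsys().dmtype}")
```

In this sketch `describe` works unchanged for both objects, which is the homogeneity the API above relies on.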
I took a bit of time to recap this because I would really like to avoid amending the model in a way that breaks this homogeneity.
Parameter.semantic:
You are right. As the content is totally free, this field cannot be assumed to give a role. It must be seen as a secondary qualifier, something that helps some clients. I think I should make it optional, and the spec must be refined as well.
Parameter.ucd:
Replaces VODML type (ie: expected Type of the ‘value’). Has the benefit of facilitating the use of concepts with no formally modeled Measure type: “phys.magField”, “phot.mag”.. I’ll note that I believe this is Markus’ argument for not having specialized Measure types at all, but only a single Measurement with a semantic tag to identify its nature (ala ucd).
I'm somewhere in between Markus and you: UCDs for roles, but we need a model for the structure of the measures.
In my opinion, this form may be fine for a serialization, but it is VERY difficult to specify dependencies/constraints in the models.
If ucd = “pos.eq” then the associated Coordinate SpaceFrame MUST have referenceFrame=“ICRS|FK4|FK5” and a Spherical coordinate space.
It has the vulnerability of being a consistency problem.
If ucd = “pos.eq” and the measure is “meas:Position but in GALACTIC”, the client will have to handle the inconsistency.
It is true. I would say this is the cost of flexibility. This problem arises each time the same thing has more than one identifier (UCD + dmtype here). Note that a classical model does not prevent this; it just shifts the risk onto the mapping (you can map pos.eq onto a galactic position).
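A client that wants to detect this kind of UCD/frame mismatch could keep a small constraint table. The sketch below is only an illustration: the `pos.eq` entry comes from the example discussed above, while the `pos.galactic` entry, the table name and the function name are assumptions of mine and are not part of any specification.

```python
from typing import Optional

# Hypothetical constraint table: UCD -> reference frames considered consistent.
ALLOWED_FRAMES = {
    "pos.eq": {"ICRS", "FK4", "FK5"},  # constraint quoted in the discussion
    "pos.galactic": {"GALACTIC"},      # assumed here for illustration
}


def frame_is_consistent(ucd: str, reference_frame: Optional[str]) -> bool:
    """Return False only when the UCD has a known constraint that the frame violates."""
    for prefix, frames in ALLOWED_FRAMES.items():
        # Match the UCD itself or any more specific word such as "pos.eq.ra".
        if ucd == prefix or ucd.startswith(prefix + "."):
            return reference_frame in frames
    return True  # no constraint known for this UCD
```

For example, `frame_is_consistent("pos.eq", "GALACTIC")` reports the inconsistency, while a UCD without an entry in the table is accepted as-is.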
If ucd = “phot.mag” and the measure is “meas:GenericMeasure”, the client STILL needs to do all the work to determine if the GenericMeasure content is compatible with “phot.mag” type. If they are doing that, then they can identify it as a “phot.mag” without the prompt. NOTE: doing this MAY mean drilling down to the VOTable element, and checking the UCD on the PARAM|FIELD.. noticing that it is “phot.mag”.
Having the ucd here does not solve the GenericMeasure problem, since it does not help identify dependent metadata. If Parameter.ucd = “phot.mag” or “phot.flux” there should/must be an associated “photDM.PhotCal” instance.. how do they know that? where would they find it? This exact scenario is in the TimeSeries workshop use case.
Mango has no concept like (in)dependent metadata.
I'm not sure I follow you. Building on the code snippet above, you could do a check like this:
if generic_measure.get_coordsys()["@dmtype"] == "PhotometricSys":
    print("this measure really looks like a photometric measure")
What can we do if the curator randomly mixes up UCDs and classes? Mango is a very flexible model designed to map various data, but it relies on the thoroughness of the data provider. This can be seen as a weakness, but to me the benefit/cost ratio is more than positive.
Parameter.measure:
Is the parameter value, which may or may not be of the type identified in the ucd. This can be a good thing (qualifying GenericMeasure as “phot.flux” or “phys.magField”) or a consistency problem (ucd=“pos.eq” with measure=Time).
This can easily be checked (see previous post)
There is only 1 option here.. Parameter contains Measure. The model text describes that there are other kinds of parameters (flags, assigned states, classifications). By only having a Measurement option, the model has improperly extended Measure and Coordinate for these data. That will be another ticket, but I think there is work to do here on how to handle non-measure properties.
I admit that the way I extend Measure/Coordinate might look odd, but I claim it is valid. What I'm doing with flags is not that different from what you propose for the Polarimetry.
Mango needs a common interface for all measures (including flags, assigned states, classifications). We can imagine an intermediate layer providing that interface for different categories of measure, but what would be the gain? I admit however that the term measure is not the best choice in this case. Nothing better found right now.
Proposal:
I would suggest splitting the Parameter into sub-classes:
- Parameter: abstract parent.
  - contains reference to associated parameter if that is needed (haven’t looked into that use case)
- PhysicalParameter: extends Parameter, contains Measure instance
- Classification: extends Parameter, contains a vocabulary literal (VocabularyTerm)
  - removes need for VocabMeasure and VocabCoordinate, which are not proper extensions of those models
- Flag: extends Parameter, contains what basically amounts to a user-defined enumeration value
  - value = integer (OK to start, but in Chandra we have bit array flags where each bit represents a different issue)
  - options = pointer to what is currently defined as FlagSys
  - removes FlagCoord; FlagSys becomes a local class as part of the Flag Property spec, not an extension of CoordSys
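As a rough illustration of the proposed split, the classes could be sketched as follows. This is only one possible reading of the proposal: the constructor signatures, the `DetectionIssues` bit names and the way `options` stands in for FlagSys are all assumptions made for the example, not part of the model.

```python
from enum import IntFlag
from typing import Any, Optional


class Parameter:
    """Abstract parent; optionally references an associated parameter."""

    def __init__(self, associated: Optional["Parameter"] = None) -> None:
        self.associated = associated


class PhysicalParameter(Parameter):
    """Holds a Measure instance (kept opaque in this sketch)."""

    def __init__(self, measure: Any, associated: Optional[Parameter] = None) -> None:
        super().__init__(associated)
        self.measure = measure


class Classification(Parameter):
    """Holds a literal taken from a controlled vocabulary."""

    def __init__(self, term: str, vocabulary: str,
                 associated: Optional[Parameter] = None) -> None:
        super().__init__(associated)
        self.term = term
        self.vocabulary = vocabulary


class DetectionIssues(IntFlag):
    """Hypothetical bit-array flag definition, one bit per issue."""
    SATURATED = 1
    NEAR_EDGE = 2
    CONFUSED = 4


class Flag(Parameter):
    """Holds a user-defined enumeration value plus the options that define it."""

    def __init__(self, value: int, options: type,
                 associated: Optional[Parameter] = None) -> None:
        super().__init__(associated)
        self.options = options        # plays the role of the local FlagSys
        self.value = options(value)   # decode the raw integer into named bits


flag = Flag(value=5, options=DetectionIssues)
raised = [issue.name for issue in DetectionIssues if issue in flag.value]
```

Here `raised` lists the issues encoded in the integer 5 (bits 1 and 4), which covers the bit-array case mentioned for Chandra, and a `Classification` simply has no measure or coordinate system to inspect.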
This may work, but I do not see the benefit of such a complication.
None of these would have ‘semantic’ or ‘ucd’ attributes to qualify the value.
There are thousands of different UCDs, and we need them. The information carried by MCT classes is not enough.
In the PhysicalParameter we’d need to have a discussion on how to handle the complex unmodeled Measure types. The ‘simple’ ones, can be handled by clients interpreting units and/or the underlying VOTable element ucd.
Laurent,
The scope of Mango is to provide a way for clients to "understand" all properties attached to one given dataset
Paraphrased, I'd say the goal is "to model Source and its various Properties"
To work this around MANGO considers a source as an open set of properties
Right.. so at first level you have Source has a collection of Property-s
But then, when you look at the kinds of properties, there are at least 2 inherently different categories
I don't think you should necessarily expect the interface to these to be the same..
MANGO API code block example
In my opinion, that thread should utterly fail for the 2nd type. It makes no sense to examine the coordSys or RefFrame or Errors of a MorphologyClass. They simply don't apply.
I admit that the way I extend Measure/Coordinate might look odd, but I claim it is valid.
I assure you that Measurement was not intended to be extended in this way. Keep in mind that the Measurement model is designed to support the Cube case, which has a lot of parallels with this one. It too has Quality Flags and other sorts of qualifiers which are not covered by the Measurement model, because they are not within its scope.
What I'm doing with flags is not that different of what you propose for the Polarization (sp).
This is true.. I was very uncertain about including the enumerated PolarizationState in the Measurement model for this very reason.. it is technically not a measured entity, but an assigned state. I included it because
Given this discussion, I could be convinced to reconsider that choice.
quality flag discussion
The usage threads related to Source data often include evaluating the quality of, or usefulness of the Properties. These are externally determined and assigned to a particular Property (or Source record as a whole?). I assert that this is a more important relation than a simple 'Associated Property'. The flag has little meaning on its own, but is a qualifier on the Property to which it is assigned.
This is why I suggest the flag is not a Property itself (you aren't going to perform analysis on the Flag), and would be better modeled as an attribute on Property (maybe just PhysicalProperty) so that there is a common access point to this very important qualifier.
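One way to picture this suggestion is a property class that carries its own qualifiers. The sketch below is hypothetical throughout: `QualityFlag`, `PhysicalProperty`, `is_usable` and the convention that a higher code means worse quality are illustrative assumptions, not model content.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class QualityFlag:
    """Hypothetical qualifier: an integer code plus a human-readable meaning."""
    code: int
    meaning: str


@dataclass
class PhysicalProperty:
    """Hypothetical property that owns the flags assigned to it."""
    ucd: str
    measure: Any
    flags: List[QualityFlag] = field(default_factory=list)

    def is_usable(self, max_code: int = 0) -> bool:
        """Usable when no attached flag exceeds the accepted code."""
        return all(flag.code <= max_code for flag in self.flags)


flux = PhysicalProperty("phot.flux", 1.2e-14, [QualityFlag(2, "source near chip edge")])
position = PhysicalProperty("pos.eq", (10.68, 41.27))
```

With this layout a client evaluates quality through a single access point (`flux.is_usable()`), without searching the source for a separate flag property and working out which property it qualifies.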
re: suggested restructuring of Parameter/Property and non-coordinate Measure-s "This may work, but I do not see the benefit of such a complication."
The suggested changes:
Paraphrased, I'd say the goal is "to model Source and its various Properties"
This is rather a semantic shift. I prefer to keep my "model for source data" with an acronym standing for "Model for Annotating Generic Objects".
But then, when you look at the kinds of properties, there are at least 2 inherently different categories
- those based on physical entities, either measured or derived (Position, Time, Flux, Magnitude, HardnessRatio..), whose values reside in a particular coordinate space, etc.
- those which identify what kind of thing (Source) we have (SpectralType, CelestialClass, LuminosityClass, MorphologyClass), whose values are just an entry from a controlled vocabulary.
I don't think you should necessarily expect the interface to these to be the same..
Since there is no way for me to sell my generalisation of Syst/Frame, I would say that is a promising approach.
I feel it is more closely tied to a physical property than other classifications/flags
Polarization is clearly a physical property expressed in a coordinate system that is a state enumeration. I'm completely at ease with this.
This is why I suggest the flag is not a Property itself (you aren't going to perform analysis on the Flag), and would be better modeled as an attribute on Property (maybe just PhysicalProperty) so that there is a common access point to this very important qualifier.
I do not agree. The meaning, scope and cardinality of flags are all too flexible to consider them as an attribute with a predetermined role. I really prefer the current semanticless association.
The suggested changes: I'll post a sketch of this proposal in a new issue.
Sorry.. this is a long one.
The Source -> Parameter relation is very similar to the Cube model’s NDPoint -> Observable relation. In Cube, each Observable owns a Measure instance, and adds knowledge of whether this is ‘dependent’ or ‘independent’ data. But here, each Parameter ends up taking the place of VODML role and type from a formally modeled Source object.
ie: instead of Source.position:Position[*] we have Source with Parameter{semantic=“source position”, ucd=“pos.eq”}
I understand that this model is trying to be generic, and specifically NOT model Source explicitly, so I think the Source has a collection of Property-s is a good mechanism. But instead of this providing access to various types of Properties, it has become something that lets you build proxies for things which are not formally modeled, which I think is outside the model scope.
Parameter.semantic:
Parameter.ucd:
Parameter.measure:
Proposal: