ivoa / dm-usecases

This repo gathers all the material to be used in the DM workshop 2020.

Role of the roles #45

Open lmichel opened 3 years ago

lmichel commented 3 years ago

This issue is related to the @msdemlei proposal. In ts.vot, I'm wondering how a client can get the role of a mapped object (or does it wish to?).

Let's take an easy example, line 79:

      <INSTANCE ID="ndwibspodost" dmtype="ndcube:Cube">
      ...
      </INSTANCE>

The client can read the dmtype and guess that this mapping block contains a cube. But what happens if the data set contains 2 ndcube:Cube instances? (That it contains 2 positions is more realistic, I admit.)

Let's take another example more confusing to me, line 53

<TEMPLATES>
      <INSTANCE ID="ndwibspapabt" dmtype="stc2:Coords">
      ...
      </INSTANCE>

This object is typed as a stc2:Coords, which looks like an abstract class. There is no way to know which role this instance plays. If I enter it, I can see 2 un-typed attributes (time and space), but this does not tell me more about what to do with it.

msdemlei commented 3 years ago

On Wed, Apr 28, 2021 at 07:39:55AM -0700, Laurent MICHEL wrote:

This issue is related to the @msdemlei proposal. In ts.vot, I'm wondering how a client can get the role of a mapped object (or does it wish to?).

Let's take an easy example, line 79:

      <INSTANCE ID="ndwibspodost" dmtype="ndcube:Cube">
      ...
      </INSTANCE>

The client can read the dmtype and guess that this mapping block contains a cube. But what happens if the data set contains 2 ndcube:Cube instances? (That it contains 2 positions is more realistic, I admit.)

You have two INSTANCE elements.

  • How to distinguish them?

As far as the hypothetical ndcube DM is concerned, they are "equivalent" in the sense that both are just instances of Cube. Other annotation might make them non-equivalent, of course.
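A minimal sketch of what "two INSTANCE elements" means for a client, assuming an annotation fragment shaped like the excerpts quoted above. Only the INSTANCE element with its ID and dmtype attributes comes from this thread; the second cube's ID (`cube2`) and the helper name `instances_of` are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the IDs "ndwibspodost" and "ndwibspapabt" are quoted
# in this thread, "cube2" is made up to show a second Cube.
FRAGMENT = """
<TEMPLATES>
  <INSTANCE ID="ndwibspodost" dmtype="ndcube:Cube"/>
  <INSTANCE ID="cube2" dmtype="ndcube:Cube"/>
  <INSTANCE ID="ndwibspapabt" dmtype="stc2:Coords"/>
</TEMPLATES>
"""

def instances_of(root, dmtype):
    """Return the IDs of all INSTANCE elements carrying the given dmtype."""
    return [el.get("ID") for el in root.iter("INSTANCE")
            if el.get("dmtype") == dmtype]

root = ET.fromstring(FRAGMENT)
cube_ids = instances_of(root, "ndcube:Cube")
print(cube_ids)
```

As far as the type is concerned, both hits are equally valid cubes; the client can only tell them apart by ID or by whatever other annotation refers to them.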

But the question leads to another case I'd like to see addressed in essentially all of our models and that's closely related to actually using these models.

Suppose TOPCAT sees several positions in a table, and the user asks for a sky plot. TOPCAT can now, without guessing, use one set of coordinates, but it would probably want to give the user a widget to select a different set of coordinates. That widget could have labels

"Coord in ICRS #1/Coord in ICRS #2/Coord in Galactic #1"

with what we have right now; Mark just doesn't know more about these guys.

However, I'd say it would be much preferable to have labels, say,

"Short-term solution/Long-term solution/Galactic".

What we'd need for that is a string-valued "label" attribute on many of our classes (really, any that a user might need to choose from, and probably any that may profit from having a few words of explanation).
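To make the difference concrete, here is a small sketch of how a client might build the widget entries, assuming each coordinate annotation exposes a frame name and, optionally, the proposed string-valued label. The class `CoordAnnotation` and the function `widget_labels` are hypothetical names, not part of any existing model or of TOPCAT.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class CoordAnnotation:
    frame: str                   # e.g. "ICRS" or "Galactic"
    label: Optional[str] = None  # the proposed label attribute (assumption)

def widget_labels(coords):
    """Use the label when present, else fall back to 'Coord in <frame> #n'."""
    seen = Counter()
    labels = []
    for coord in coords:
        seen[coord.frame] += 1
        fallback = f"Coord in {coord.frame} #{seen[coord.frame]}"
        labels.append(coord.label or fallback)
    return labels

unlabelled = [CoordAnnotation("ICRS"), CoordAnnotation("ICRS"),
              CoordAnnotation("Galactic")]
labelled = [CoordAnnotation("ICRS", "Short-term solution"),
            CoordAnnotation("ICRS", "Long-term solution"),
            CoordAnnotation("Galactic", "Galactic")]

print("/".join(widget_labels(unlabelled)))
print("/".join(widget_labels(labelled)))
```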

Can I post a meta-issue on this somewhere?

  • Are you sure you can always get rid of dmroles on top-level instances?

A role is a relationship between a parent and a child within a certain data model. You could say that root elements have "entire document" as an implicit parent, but then you'd have to have an "entire dataset" model that would give you what role names such top-level instances could have. To me, that feels wrong.

I'm quite sure that wherever you need to define relations between the whole document and something else, it's much better to define a dedicated model for that particular function. Within instances of these models' classes (probably singletons, then), you have the roles in attributes as normal.

Let's take another example more confusing to me, line 53

<TEMPLATES>
      <INSTANCE ID="ndwibspapabt" dmtype="stc2:Coords">
      ...
      </INSTANCE>

This object is typed as a stc2:Coords, which looks like an abstract class. There is no way to know which role this instance plays.

(No, Coords isn't abstract; why would it be?) But that you don't know the role is by design, and it's important, because that lets you un-entangle DMs. This instance could be a target position in a dataset1 instance, also target position in a dataset2 instance, a center of mass in a hypothetical extra DM the data provider defined themselves, an initial center position in a DM giving preferred plot options, and so on.

This is just like in Python and similar languages: It is not the values that have names, there are named references to the values -- and there can be as many of those as necessary to get the job done, and they can come from as many different namespaces as necessary.
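The Python analogy can be spelled out in a few lines. This is only an illustration of the idea: the `Coords` class and the three dictionaries standing in for "dataset1", "dataset2" and an extra DM are made up, and the attribute values are placeholder strings.

```python
from dataclasses import dataclass

@dataclass
class Coords:
    space: str
    time: str

# One value; it carries its type but no name and no role.
position = Coords(space="ra/dec columns", time="obs_time column")

# Three independent namespaces, each holding a named reference to that value.
dataset1 = {"target_position": position}
dataset2 = {"target_position": position}
extra_dm = {"center_of_mass": position}

namespaces = {"dataset1": dataset1, "dataset2": dataset2, "extra_dm": extra_dm}

# Roles live with the referrers, so they are found by searching the referrers.
roles = [
    (ns_name, role)
    for ns_name, ns in namespaces.items()
    for role, value in ns.items()
    if value is position
]
print(roles)
```

Nothing has to be written into `position` itself to add a fourth role later; a new namespace simply adds another reference.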

If I enter it, I can see 2 un-typed attributes (time and space) but this does not tell me more about what to do with it.

  • Do you assume that the client does not need to be informed of the role?
  • Did I miss some key point?

The time and space attribute names would of course be defined in the DM (this is organised a bit like STC1). The DM docs are where the client writers go at implementation time to figure out how to map incoming data to their data structures.

And yes, the type information is not in the attributes, it is in the values, i.e., the INSTANCEs (again, that's exactly as it is in Python and its ilk).
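As a sketch of "the type is in the value", the following reads the role names from the attributes and the types from the INSTANCEs they hold. The nesting shown here (an ATTRIBUTE element with a dmrole attribute wrapping an INSTANCE) and the two child type names are assumptions made for illustration; only the outer INSTANCE line and the role names time and space are taken from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: ATTRIBUTE/dmrole and the child dmtypes are assumed.
FRAGMENT = """
<INSTANCE ID="ndwibspapabt" dmtype="stc2:Coords">
  <ATTRIBUTE dmrole="time">
    <INSTANCE dmtype="stc2:TimeCoordinate"/>
  </ATTRIBUTE>
  <ATTRIBUTE dmrole="space">
    <INSTANCE dmtype="stc2:SphericalCoordinate"/>
  </ATTRIBUTE>
</INSTANCE>
"""

def attribute_types(instance):
    """Map each role name to the dmtype of the INSTANCE stored under it."""
    result = {}
    for attr in instance.findall("ATTRIBUTE"):
        value = attr.find("INSTANCE")
        result[attr.get("dmrole")] = None if value is None else value.get("dmtype")
    return result

coords = ET.fromstring(FRAGMENT)
types = attribute_types(coords)
print(types)
```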

lmichel commented 3 years ago

Can I post a meta-issue on this somewhere?

Please open a new issue

lmichel commented 3 years ago

What we'd need for that is a string-valued "label" attribute on many of our classes (really, any that a user might need to choose from, and probably any that may profit from having a few words of explanation).

This is the role of the semantic field on Mango parameters

lmichel commented 3 years ago

This instance could be a target position in a dataset1 instance, also target position in a dataset2 instance, a center of mass in a hypothetical extra DM ...

OK, but if both (target and CoM) are present in my dataset, how can I know which Coords relates to the target and which to the CoM?

lmichel commented 3 years ago

The DM docs are where the client writers go at implementation time to figure out how to map incoming data to their data structures.

If I understand correctly, your annotation does not match the structure of the model elements, and we need documentation telling developers how to bind annotation elements to model classes. I would note that writing such documentation looks like (meta/cross?) modeling.

msdemlei commented 3 years ago

On Thu, Apr 29, 2021 at 09:59:22AM -0700, Laurent MICHEL wrote:

The DM docs are where the client writers go at implementation time to figure out how to map incoming data to their data structures.

If I understand correctly, your annotation does not match the structure of the model elements, and we need documentation telling developers how to bind annotation elements to model classes.

No, no, not at all. Of course the annotation directly reflects the model structure -- which is the reason I'm so worried about entangling data models, and why I've moaned about some less practical aspects of the current models (e.g., in time0 and in particular the per-physics classes currently envisioned in Meas).

Perhaps stating the obvious: A validator would look at a root instance's dmtype, map the prefix to the latest version of the corresponding DM, and could then match the model's requirements against what it finds in the document. It would then report the validity per instance.
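A rough sketch of such a validator, under heavy assumptions: the `MODELS` table stands in for "the latest version of the corresponding DM" and is invented here, the required role names for `ndcube:Cube` are made up, and the ATTRIBUTE/dmrole layout is assumed rather than taken from any specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: DM prefix -> class name -> required role names.
MODELS = {
    "stc2": {"Coords": {"time", "space"}},
    "ndcube": {"Cube": {"independent_axes", "dependent_axes"}},  # invented roles
}

def validate_instance(instance):
    """Check one root instance against the model its dmtype prefix points to."""
    dmtype = instance.get("dmtype", "")
    prefix, _, cls = dmtype.partition(":")
    model = MODELS.get(prefix)
    if model is None or cls not in model:
        return False, f"unknown type {dmtype!r}"
    present = {attr.get("dmrole") for attr in instance.findall("ATTRIBUTE")}
    missing = sorted(model[cls] - present)
    if missing:
        return False, "missing roles: " + ", ".join(missing)
    return True, "ok"

def validate_document(root):
    """Report validity per root instance, keyed by its ID."""
    return {inst.get("ID"): validate_instance(inst)
            for inst in root.findall("INSTANCE")}

DOCUMENT = """
<TEMPLATES>
  <INSTANCE ID="a" dmtype="stc2:Coords">
    <ATTRIBUTE dmrole="time"/><ATTRIBUTE dmrole="space"/>
  </INSTANCE>
  <INSTANCE ID="b" dmtype="stc2:Coords">
    <ATTRIBUTE dmrole="time"/>
  </INSTANCE>
  <INSTANCE ID="c" dmtype="foo:Bar"/>
</TEMPLATES>
"""

report = validate_document(ET.fromstring(DOCUMENT))
for instance_id, (valid, message) in report.items():
    print(instance_id, valid, message)
```

Because each root instance is checked only against its own model, an invalid or unknown instance does not affect the verdict on its neighbours.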

I admit, though, that I expect most clients will ignore the VO-DML and just do the mapping to their internal data structures based on the (generated) documentation.

I would note that writing such documentation looks like (meta/cross?) modeling.

  • What is the rationale for this?
  • Why do your annotations not match the structure of the (un-entangled) model classes?

I've just used hypothetical sanitised DMs because I wish we'd have them. And I've not written the DMs because I don't like the current tooling (though I've always wished I found time to adapt Paul Harrison's toolchain to what I want).

One could of course do the annotations based on the current Coords and Meas (as I claim the five-element extension can fully express VO-DML instances), but since I pulled my example directly from live DaCHS, that would have been quite a bit of work, which, given I don't much like the models, I didn't do.

msdemlei commented 3 years ago

On Thu, Apr 29, 2021 at 09:49:23AM -0700, Laurent MICHEL wrote:

This instance could be a target position in a dataset1 instance, also target position in a dataset2 instance, a center of mass in a hypothetical extra DM ...

OK, but if both (target and CoM) are present in my dataset, how can I know which Coords relates to the target and which to the CoM?

I'm not quite sure what you are asking here -- could you re-phrase the question in a form like "A client wants to do X; how does it do it?"

In general, however, the workflow will be something like "Ah, this Dataset's target position is in the RA and Dec parameters -- can I find a Coord annotation I understand for any of those?"
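That workflow might be sketched as follows, assuming annotations have already been parsed into plain dictionaries. Every name here (`dataset:Dataset`, the `target_position` and `space` keys, the `phot:PhotCal` entry, the set `UNDERSTOOD`) is a placeholder for illustration, not something defined by an existing model.

```python
# Types this hypothetical client knows how to interpret.
UNDERSTOOD = {"stc2:Coords"}

# The Dataset annotation says which parameters hold the target position.
dataset = {"dmtype": "dataset:Dataset", "target_position": ["ra", "dec"]}

# Other annotations found in the same document.
annotations = [
    {"dmtype": "phot:PhotCal", "value": ["mag"]},
    {"dmtype": "stc2:Coords", "space": ["ra", "dec"], "frame": "ICRS"},
]

def find_coord_annotation(params, annotations):
    """Return the first understood annotation that refers to any of params."""
    wanted = set(params)
    for ann in annotations:
        if ann["dmtype"] in UNDERSTOOD and wanted & set(ann.get("space", [])):
            return ann
    return None

match = find_coord_annotation(dataset["target_position"], annotations)
print(match)
```

The lookup starts from the role (the Dataset's target position) and walks to the Coords annotation through the shared parameters, rather than asking the Coords instance what role it plays.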