ivoa-std / ConeSearch

Simple Cone Search
Creative Commons Attribution Share Alike 4.0 International

main ID usage #53

Open molinaro-m opened 3 years ago

molinaro-m commented 3 years ago

REC-ConeSearch-1.03 states that in the response table:

Exactly one FIELD must have ucd="ID_MAIN", with an array character type (fixed or variable length), representing an ID string for that record of the table. This identifier may not be repeated in the table, and it could be used to retrieve that same record again from that same table.

That is, a main identifier is required in the response, and the goal of this requirement is to have a handle on the specific record (not necessarily unique) in the catalogue.

However, catalogue providers might have different use cases for identifiers:
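To make the quoted requirement concrete, here is a minimal sketch of a check against a TABLEDATA-serialized response. The function name `check_id_main`, the sample document, and the decision to accept `unicodeChar` alongside `char` are assumptions for illustration, not part of SCS; the sketch also assumes a single table per response.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any XML namespace prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def check_id_main(votable_xml):
    """Return a list of ID_MAIN problems found in the response (empty if none)."""
    root = ET.fromstring(votable_xml)
    fields = [e for e in root.iter() if local(e.tag) == "FIELD"]
    id_cols = [i for i, f in enumerate(fields) if f.get("ucd") == "ID_MAIN"]
    if len(id_cols) != 1:
        return [f"expected exactly one ID_MAIN FIELD, found {len(id_cols)}"]

    problems = []
    col = id_cols[0]
    field = fields[col]
    # Assumption: "array character type" means char/unicodeChar plus an arraysize.
    if field.get("datatype") not in ("char", "unicodeChar") or field.get("arraysize") is None:
        problems.append("ID_MAIN FIELD is not a character array")

    seen = set()
    for row in (e for e in root.iter() if local(e.tag) == "TR"):
        cells = [c for c in row if local(c.tag) == "TD"]
        value = (cells[col].text or "").strip()
        if value in seen:
            problems.append(f"duplicate ID_MAIN value: {value!r}")
        seen.add(value)
    return problems


# Made-up response with a repeated identifier.
SAMPLE = """<VOTABLE><RESOURCE><TABLE>
<FIELD name="id" ucd="ID_MAIN" datatype="char" arraysize="*"/>
<FIELD name="ra" ucd="POS_EQ_RA_MAIN" datatype="double"/>
<DATA><TABLEDATA>
<TR><TD>A1</TD><TD>10.0</TD></TR>
<TR><TD>A1</TD><TD>10.1</TD></TR>
</TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>"""

problems = check_id_main(SAMPLE)
```

For the sample above, the only problem reported is the repeated identifier, which is exactly the case that catalogues with several rows per object would run into.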

Should we modify the main ID behaviour in ConeSearch (while moving to UCD1+, see issue #20)?

Backward compatibility will still require having one (and only one) FIELD with ucd="meta.id;meta.main" (going for the easiest mapping on UCDs), but maybe some other solutions, based on practical use cases, can be considered.
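As a rough illustration of that mapping, a client or validator could treat both spellings as the main identifier during a transition. The helper names below are hypothetical, and comparing the UCD strings case-insensitively is a lenient choice made for this sketch.

```python
# UCD1 word required by SCS 1.03, and the UCD1+ mapping suggested above.
LEGACY_MAIN_ID = "ID_MAIN"
UCD1PLUS_MAIN_ID = "meta.id;meta.main"


def is_main_id(ucd):
    """True if the UCD marks a main identifier in either vocabulary."""
    if ucd is None:
        return False
    normalized = ucd.strip().lower()
    return normalized in (LEGACY_MAIN_ID.lower(), UCD1PLUS_MAIN_ID)


def main_id_fields(fields):
    """Given a {field name: ucd} mapping, list the main-identifier fields."""
    return [name for name, ucd in fields.items() if is_main_id(ucd)]


# Made-up field lists, one per vocabulary.
old_style = {"id": "ID_MAIN", "ra": "POS_EQ_RA_MAIN", "dec": "POS_EQ_DEC_MAIN"}
new_style = {"source_id": "meta.id;meta.main", "ra": "pos.eq.ra;meta.main"}
```

A compliance check would then require `len(main_id_fields(...)) == 1` for either style.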

gilleslandais commented 3 years ago

VizieR feedback on ID_MAIN usage.

We agree that ID_MAIN is assigned to exactly one column in a table. But in VizieR, ID_MAIN does not imply that the values in that column are unique (it is not a primary key).

In fact, there is an ambiguity between the ID_MAIN semantics used in ConeSearch and those in the UCD definition. The UCD definition defines ID_MAIN as "Main Identifier of a Celestial Object" (a table with several rows for the same object is possible).

So, under the current ConeSearch, some cone search services are not compliant because of this ambiguity (each table in VizieR that has positions is served by a cone search). Moreover, the type of an ID_MAIN column is not always a string (e.g. Gaia). On the other hand, these columns are compliant with the UCD definition.

Today the ID_MAIN columns of VizieR cone search services contain only identifiers that retrieve one or more rows matching the same object name. These identifiers are not universal because they may be technical IDs.

msdemlei commented 3 years ago

On Tue, Feb 16, 2021 at 09:39:47AM -0800, gilleslandais wrote:

In fact, there is an ambiguity between the ID_MAIN semantics used in ConeSearch and those in the UCD definition. The UCD definition defines ID_MAIN as "Main Identifier of a Celestial Object" (a table with several rows for the same object is possible).

Requiring ID_MAIN to be unique has some use cases that I think are at least worth considering; off the top of my head:

Not overly convincing (and perhaps this kind of thing should be in the column annotation rather than at the protocol level), but we should at least think a bit before dropping the requirement.

On the other hand, I don't think any existing client would break if we dropped it.

Moreover, the type of an ID_MAIN column is not always a string (e.g. Gaia). On the other hand, these columns are compliant with the UCD definition.

Well, it's relatively trivial to just stringify such integer columns when returning SCS results even when the database has a different type (DaCHS does that generically).
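A minimal sketch of such stringification, applied to rows just before serialization. This is not how DaCHS implements it; the function name, the row layout as dictionaries, and the sample values are made up for illustration.

```python
def stringify_id_column(rows, id_key):
    """Return copies of the rows with the identifier column cast to str.

    Missing (None) identifiers are left as None rather than becoming "None".
    """
    result = []
    for row in rows:
        value = row[id_key]
        new_row = dict(row)
        new_row[id_key] = None if value is None else str(value)
        result.append(new_row)
    return result


# Made-up rows with an integer identifier, as in a Gaia-like table.
rows = [
    {"source_id": 4295806720, "ra": 44.99, "dec": 0.005},
    {"source_id": 34361129088, "ra": 45.00, "dec": 0.021},
]
stringified = stringify_id_column(rows, "source_id")
```

The matching FIELD would then be declared with a character datatype and an arraysize, while the database column keeps its integer type.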

Whether that's worth it is another matter. I've always wondered why the SCS authors introduced that requirement -- does anyone know?

gilleslandais commented 3 years ago

I just noticed that there is no meta.main;meta.id in ObscoreDM. obs_id, for example, identifies an observation but not necessarily a unique record (ObsCore-v1.1, section 3.3.3). So, if we consider that the main identifier in ConeSearch serves to link to the original dataset, I am not convinced about uniqueness; the ObsCore table is an example.

Another point, maybe out of context: what do you think about meta.record instead of meta.main;meta.id? meta.record does sound like a technical identifier (to link a table), as does "meta.id;meta.main". But the latter could also be read as a well-known identifier, such as a "universal" identifier, an IAU name, a Jname or ..

msdemlei commented 3 years ago

On Wed, Mar 03, 2021 at 07:48:47AM -0800, gilleslandais wrote:

I just noticed that there is no meta.main;meta.id in ObscoreDM. obs_id, for example, identifies an observation but not necessarily a unique record (ObsCore-v1.1, section 3.3.3). So, if we consider that the main identifier in ConeSearch serves to link to the original dataset, I am not convinced about uniqueness; the ObsCore table is an example.

Hm... is cone search on an obscore table a use case for SCS? Why would one want to do that?

Another point, maybe out of context: what do you think about meta.record instead of meta.main;meta.id? meta.record does sound like a technical identifier (to link a table), as does "meta.id;meta.main". But the latter could also be read as a well-known identifier, such as a "universal" identifier, an IAU name, a Jname or ..

Well, that kind of change is, I'd argue, definitely material for SCS 2 (if that ever happens).

As I said, I guess whether or not we want that strictly depends on whether there's a use case for requiring a primary key on what's returned from SCS. The use cases I can imagine are:

Since I don't consider these terribly convincing, I'd be open to dropping the uniqueness requirement in SCS 2 (until someone comes up with more convincing use cases at least).