Rosetta - Githubissues

Erard commented 1 year ago

Searching for "Rosetta" won't find the value used in the ESA / NASA archives, "international rosetta mission", which is of course annoying. This value should be included in the alias list.

Erard commented 1 year ago

As a general rule, all values of instrument_host_name in the PDS / PSA should be included

BaptisteCecconi commented 1 year ago

Where can we find the list? (to help @LauraD12)

Erard commented 1 year ago

No idea. An entry point is https://pds.nasa.gov/tools/dd-search/. There used to be a doc providing a list of values for PDS3 - Google is your friend.

BaptisteCecconi commented 1 year ago

The PDS dd-search interface is a good first step. We probably need the same for ESA and JAXA at least.

Erard commented 1 year ago

They should all use the same keywords / values, except perhaps for PDS3 archives stored only outside the US. One open question is the use of mission_name vs instrument_host_name in PDS3 (the latter being a sub-element of the mission, e.g. lander / orbiter, but also individual telescopes). I am unsure whether we want only mission_name or both; we need to get back to our old notes.

Erard commented 1 year ago

Another source of information that should be taken into account is the PSA user guide (from p. 93): https://www.cosmos.esa.int/web/psa/psa-user-guide

BaptisteCecconi commented 1 year ago

The list is available on page 9 of that document (@Erard, can you confirm this is the list you have in mind?). It is the following (with the URL to the resolver for each entry):

GIOTTO: https://voparis-elasticsearch.obspm.fr/obsfacility/resolve?q=GIOTTO (obs-facility term = giotto)
SMART-1: https://voparis-elasticsearch.obspm.fr/obsfacility/resolve?q=SMART-1 (obs-facility term = smart-1)
VENUS EXPRESS: https://voparis-elasticsearch.obspm.fr/obsfacility/resolve?q=VENUS%20EXPRESS (obs-facility term = venus-express)
MARS EXPRESS: https://voparis-elasticsearch.obspm.fr/obsfacility/resolve?q=MARS%20EXPRESS (obs-facility term = mars-express)
HUYGENS: https://voparis-elasticsearch.obspm.fr/obsfacility/resolve?q=HUYGENS (obs-facility term = huygens)
ROSETTA: https://voparis-elasticsearch.obspm.fr/obsfacility/resolve?q=ROSETTA (obs-facility term = rosetta)
EXOMARS 2016: https://voparis-elasticsearch.obspm.fr/obsfacility/resolve?q=EXOMARS%202016 (obs-facility term = exomars)
BEPICOLOMBO: https://voparis-elasticsearch.obspm.fr/obsfacility/resolve?q=BEPICOLOMBO (obs-facility term = bepicolombo)
HUBBLE: https://voparis-elasticsearch.obspm.fr/obsfacility/resolve?q=HUBBLE (obs-facility term = hubble-space-telescope)

NB1: I excluded the last one: GROUND-BASED
NB2: for EXOMARS 2016, the resolver should be improved to have the correct name in the first rank in the resolver.
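
As an illustration of how the resolver URLs in the list above are formed, here is a minimal sketch that builds them from the PSA mission names. The base URL and the `q` parameter are taken from the list; the function and variable names are assumptions made for this example, and nothing is fetched because the response format is not described in this thread.

```python
from urllib.parse import quote

# Base URL of the resolve query, as it appears in the list above.
RESOLVE_URL = "https://voparis-elasticsearch.obspm.fr/obsfacility/resolve"

# PSA mission names from p. 9 of the user guide (GROUND-BASED excluded, see NB1).
PSA_MISSIONS = [
    "GIOTTO", "SMART-1", "VENUS EXPRESS", "MARS EXPRESS", "HUYGENS",
    "ROSETTA", "EXOMARS 2016", "BEPICOLOMBO", "HUBBLE",
]


def resolve_url(name: str) -> str:
    """Return the resolver URL for a name, percent-encoding spaces as %20."""
    return f"{RESOLVE_URL}?q={quote(name)}"


if __name__ == "__main__":
    for mission in PSA_MISSIONS:
        print(resolve_url(mission))
```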

Erard commented 1 year ago

p. 9 is the short version. The extended version on p. 93 provides:

mission_name
instrument_host_name (different from mission, eg lander / orbiter)
instrument_host_id

I think we want all these values, although they are the ones used for CQL queries (not necessarily the same as the PDS values, from what I see).

In addition, it gives a list of instruments in the PSA, which is nice to have

Erard commented 1 year ago

> NB2: for EXOMARS 2016, the resolver should be improved to have the correct name in the first rank in the resolver.

It is much worse for Mars Express ;(

BaptisteCecconi commented 1 year ago

I think there is a misunderstanding... The goal of the resolver is to propose a ranked list of results. If the first item of the list is the right one, I consider it a success. So for Mars Express, there are plenty of results, but the first one is correct. This is not the case for Exomars.

Erard commented 1 year ago

Well, yes and no: it only works if no other key returns the same term as its first item; otherwise we can't use this to resolve aliases. This means that we need to check the first item of every entry.
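
The two checks discussed here (is the first-ranked term the expected one, and does any other key return the same term first) could be sketched as follows. The ranked results below are invented for illustration and are not actual resolver output; the function name and data layout are assumptions.

```python
from collections import defaultdict


def check_first_rank(ranked, expected):
    """Check first-ranked results.

    ranked:   query string -> ranked list of terms (hypothetical resolver output)
    expected: query string -> term that should come first
    Returns (misses, collisions):
      misses      query -> first term actually returned (None if list is empty)
      collisions  first term -> sorted queries that share it while expecting
                  different terms
    """
    misses = {}
    by_first = defaultdict(list)
    for query, terms in ranked.items():
        first = terms[0] if terms else None
        if first != expected[query]:
            misses[query] = first
        if first is not None:
            by_first[first].append(query)
    collisions = {
        term: sorted(queries)
        for term, queries in by_first.items()
        if len({expected[q] for q in queries}) > 1
    }
    return misses, collisions


# Invented example data: only the expected terms come from the list above.
expected = {"ROSETTA": "rosetta", "MARS EXPRESS": "mars-express",
            "EXOMARS 2016": "exomars"}
ranked = {"ROSETTA": ["rosetta"],
          "MARS EXPRESS": ["mars-express", "exomars"],
          "EXOMARS 2016": ["mars-express", "exomars"]}

if __name__ == "__main__":
    print(check_first_rank(ranked, expected))
```

With this invented data, the EXOMARS 2016 query is reported both as a miss and as colliding with MARS EXPRESS on the first-ranked term.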

BaptisteCecconi commented 1 year ago

This is the same as for the SSODNET name resolver: from a user input, you get a list of names, and the user selects the correct name.

We may be able to refine the resolver ranking algorithm, but first we want to make sure that the first result is correct. Then we will refine.

Erard commented 1 year ago

Not really: in the portal, SSODnet is used to return all known aliases, which are then included in a single ADQL query. In my opinion this is the main point of having a resolver. Plus, if we only use the first item, we need to be able to disable the conversion: if I ask for international-Rosetta-mission and get Rosetta, I often need to overwrite it, otherwise I'd never find the instrument archive. And again, the added value is debatable.
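
To make the "all aliases in a single ADQL query" idea concrete, here is a minimal sketch that turns a list of aliases into one query string. The table name `example.epn_core` is a placeholder, the choice of the `instrument_host_name` column is an assumption for this example, and exact string matching is a simplification.

```python
def adql_for_aliases(aliases, table="example.epn_core",
                     column="instrument_host_name"):
    """Build one ADQL query matching any of the given aliases exactly."""
    if not aliases:
        raise ValueError("at least one alias is required")

    def literal(value: str) -> str:
        # ADQL string literals use single quotes; double any embedded quote.
        return "'" + value.replace("'", "''") + "'"

    values = ", ".join(literal(alias) for alias in aliases)
    return f"SELECT * FROM {table} WHERE {column} IN ({values})"


if __name__ == "__main__":
    print(adql_for_aliases(["rosetta", "international rosetta mission"]))
```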

BaptisteCecconi commented 1 year ago

There is a real misunderstanding... (or I completely miss the point...).

There are two prototype queries on the obs-facility database:

the resolve?q= query, which should be used by the providers to find the term to put in the epn_core table, or by the user to find the term to be selected in the search interface.
the aliases?label= query, which is used to find all the known aliases for a term.

The aliases?label= query has to be used for now, since the providers are not using the standard terms. Eventually, however, we may (should) require the use of terms from the obs-facility vocabulary, and then the aliases?label= query will no longer be so useful.

Erard commented 1 year ago

We certainly need a real discussion to clarify the objective of this activity. Short answer for now:

the alias query is very noisy at this point, that was my concern
I don't think we can impose a main value to data providers - especially space agencies or telescopes

The situation is identical to target_name: each object has many names, which are all relevant and legitimate in their context. What we need for global / blind queries is a set of equivalent values for the same concept / object in various contexts. The only way out is to work on a centralized database of metadata (ElasticSearch) and convert all values to a unique string. But this would only work for EPN-TAP, not the whole VO.
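
A minimal sketch of the "convert all values to a unique string" idea: a small in-memory table maps every known spelling to one term. The terms and names come from this thread; the normalization rule (case-insensitive, hyphens and underscores treated as spaces) and all function names are assumptions, and a real implementation would sit on the centralized database instead.

```python
# Unique term -> names seen in archives (values taken from this thread only).
ALIASES = {
    "rosetta": ["ROSETTA", "international rosetta mission"],
    "venus-express": ["VENUS EXPRESS"],
    "exomars": ["EXOMARS 2016"],
    "hubble-space-telescope": ["HUBBLE"],
}


def _key(value: str) -> str:
    """Normalize a name: lower case, separators unified, whitespace collapsed."""
    return " ".join(value.lower().replace("-", " ").replace("_", " ").split())


# Reverse lookup: normalized spelling -> unique term.
LOOKUP = {
    _key(name): term
    for term, names in ALIASES.items()
    for name in [term, *names]
}


def canonical(value: str):
    """Return the unique term for a provider value, or None if unknown."""
    return LOOKUP.get(_key(value))


if __name__ == "__main__":
    print(canonical("international-Rosetta-mission"))
```

Returning None for unknown values leaves the original provider value untouched, which keeps the option of disabling the conversion.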

epn-vespa / FacilityList

Rosetta #2