ivoa-std / VOResource

Creative Commons Attribution Share Alike 4.0 International

Advertising a new version of a resource or a change of identifier #12

Open BaptisteCecconi opened 2 days ago

BaptisteCecconi commented 2 days ago

(The following use-case is based on a true story)

Let's say I'm a tool developer and I want to give access to a series of specific resources (e.g., specific epn-core tables), which I want to use to provide an enhanced user experience in my interface. I want to keep track of these resources' latest metadata from the registry, so I decided to store their ivo-ids and regularly query the registry to get, e.g., the TAP endpoint (whose URL may change).

At some point, I realised that one of these ivo-ids had not been updated for a long time and that the advertised TAP endpoint had disappeared (due to bad management on the resource provider's side). I had not noticed the issue earlier because my application didn't break; it just silently dropped a feature, and it was the resource provider who told me that his resources were no longer available in my tool. It took me some time to understand that the provider had changed its naming authority and that a new ivo-id should be used instead.
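The polling described above, plus a check that would have made the vanished ivo-id visible, could be sketched roughly as follows. This is only an illustration: it assumes the RegTAP tables rr.capability and rr.interface, the function names are hypothetical, and the resulting ADQL would still have to be sent to a RegTAP service by whatever TAP client the tool uses.

```python
TAP_STANDARD_ID = "ivo://ivoa.net/std/tap"


def build_tap_endpoint_query(ivoids):
    """Return an ADQL query asking for the TAP access URL of each stored ivoid.

    Hypothetical helper; table and column names assume the RegTAP schema.
    """
    if not ivoids:
        raise ValueError("need at least one ivoid")
    # RegTAP stores ivoids lowercased; single quotes are doubled for ADQL.
    quoted = ", ".join(
        "'" + ivoid.lower().replace("'", "''") + "'" for ivoid in ivoids
    )
    return (
        "SELECT ivoid, access_url "
        "FROM rr.capability NATURAL JOIN rr.interface "
        f"WHERE standard_id LIKE '{TAP_STANDARD_ID}%' "
        "AND intf_role = 'std' "
        f"AND ivoid IN ({quoted})"
    )


def missing_ivoids(stored, returned):
    """Return the stored ivoids for which the registry gave no row back."""
    found = {ivoid.lower() for ivoid in returned}
    return [ivoid for ivoid in stored if ivoid.lower() not in found]
```

Logging or alerting on a non-empty result of `missing_ivoids` turns "the feature quietly disappeared" into an explicit signal, although it still cannot say where the resource went.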

Discussion

About using the ivo-id as a handle to the resource: One could object that the tool provider should rather use the registry search interfaces to find the resource, but this is risky, since the resource metadata can also change. The registry should be considered as the source of truth to access the resource.

There are several reasons (good or not, I won't judge) to change the naming authority or the ivo-id itself of a resource, like when moving a resource to another lab (following the scientist in charge of the resource), or when the lab/institution changes its name (this can happen once in a while).

In such a case, there should be a way to advertise that an ivo-id is either a new version of a previous one (IsNewVersionOf semantics can be used), or just a new name for the same resource (not sure what semantics I should use for this).

In the case described, the tool developer kept the old ivo-id, but the IsNewVersionOf relation would be attached to the new ivo-id. So, this doesn't really solve the problem.

So, I would rather think that the previous ivo-id, which should be set as "inactive", could then have a relation to the new ivo-id with a relation like "isReplacedBy" (or similar).

Any thoughts?

msdemlei commented 2 days ago

On Wed, Oct 16, 2024 at 05:47:01AM -0700, Baptiste Cecconi wrote:

> from the registry, so I decided to store their ivo-ids and regularly query the registry to get, e.g., the TAP endpoint (whose URL may change).

For the record: I think that's a very reasonable thing to do. Even though ivoids are not designed as Pids, I think they do work just fine as ids, and as such they are often good enough, except...

> his resources were not available anymore in my tool. It took me some time to understand that the provider changed its naming authority and a new ivo-id should be used instead.

...when people change ivoids and nobody notices.

> About using the ivo-id as a handle to the resource: One can object that the tool provider should rather use the registry search interfaces to find the resource, but this is a risk, since the

I agree you shouldn't do data discovery of this sort, except perhaps when resolving DOIs (or something similarly machine-readable). Other sorts of data discovery simply aren't stable enough when all you want is to pick up a specific resource.

> So, I would rather think that the previous ivo-id, which should be set as "inactive", could then have a relation to the new ivo-id with a relation like "isReplacedBy" (or similar).
>
> Any thoughts?

I have wanted something like this for a long time. But there are a few technical problems.

First, current RegTAP says to ignore deleted records. This is bad here because I'd rather avoid mentioning ivoids in rr.relationship that are not in rr.resource. However, that would be the case when someone changes their authority and hence deletes all old identifiers.

But perhaps we should put deleted records into RegTAP? Hm. Don't know. OAI-PMH at least has the option to make them temporary, i.e., discard them after a while. Also, they come with zero metadata; you just cannot attach something like author or title to them, so their rr.resource entries will look extremely irregular (irritating, even, to librarians at heart like myself).

But then perhaps it's ok if the ivoid in the related_id has no entry in rr.resource? Not good either, partly because IVOA Identifiers (the spec) says that an ivoid must be resolvable, and RegTAP is the anointed way to resolve them. Then,

rr.relationship A JOIN rr.resource B ON (A.related_id=B.ivoid)

will lose these relationships, which feels like a nasty trap. And I also don't like that there's no way to figure out even the most basic metadata on such a resource.
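To illustrate the trap, here is a small self-contained sketch using SQLite as a stand-in for a RegTAP service. The tables are cut down to the columns needed, and the ivoids and title are invented; only the join behaviour is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so the tables can be addressed as rr.*
conn.execute("ATTACH DATABASE ':memory:' AS rr")
conn.executescript("""
    CREATE TABLE rr.resource (ivoid TEXT PRIMARY KEY, res_title TEXT);
    CREATE TABLE rr.relationship (
        ivoid TEXT, relationship_type TEXT, related_id TEXT);
    -- Only the new record exists; the old one was deleted by its publisher.
    INSERT INTO rr.resource VALUES ('ivo://new.example/data', 'Example data');
    INSERT INTO rr.relationship VALUES
        ('ivo://new.example/data', 'IsNewVersionOf', 'ivo://old.example/data');
""")

# Inner join: the relationship vanishes because related_id has no resource row.
inner_rows = conn.execute(
    "SELECT A.ivoid, A.related_id "
    "FROM rr.relationship A JOIN rr.resource B ON (A.related_id = B.ivoid)"
).fetchall()

# Left join: the relationship survives, but with no metadata for the target.
outer_rows = conn.execute(
    "SELECT A.ivoid, A.related_id, B.res_title "
    "FROM rr.relationship A LEFT JOIN rr.resource B ON (A.related_id = B.ivoid)"
).fetchall()

print(inner_rows)
print(outer_rows)
```

The inner join returns nothing, while the left join returns the relationship with a NULL title, which is exactly the "no way to figure out even the most basic metadata" situation.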

So... I believe for the use case "provide a pointer to an updated resource" we ought to have an extra resource type, and the respective resources would live on and not be deleted.

This DiscontinuedRecord (say) type would really have minimal metadata; contact of course, but probably no creator. Probably a title, but we'd have to be careful to not create too much noise (it'd suck if these things came up during normal data discovery; we can always filter out certain resource types, but the fewer hacks of that type the better). Instead of (or in place of?) a description, these would have an explanation of what happened and why the resource is gone. The relationship would be more or less as you say, where I'd prefer an IsContinuedBy relationship_type.
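On the client side, such records would let a tool follow the pointers from a stored ivoid to its current successor. A rough sketch, under the assumption that the client has already pulled the hypothetical IsContinuedBy relationships into a mapping from old ivoid to new ivoid (neither the relationship type nor the function below exists anywhere yet):

```python
def resolve_current_ivoid(ivoid, continued_by, max_hops=10):
    """Follow hypothetical IsContinuedBy links until a live ivoid is reached.

    continued_by maps lowercased discontinued ivoids to their successors.
    Raises ValueError on a cycle or when the chain exceeds max_hops.
    """
    seen = set()
    current = ivoid.lower()
    while current in continued_by:
        if current in seen:
            raise ValueError(f"IsContinuedBy cycle at {current}")
        if len(seen) >= max_hops:
            raise ValueError(f"more than {max_hops} hops from {ivoid}")
        seen.add(current)
        current = continued_by[current].lower()
    return current


links = {
    "ivo://old.example/data": "ivo://mid.example/data",
    "ivo://mid.example/data": "ivo://new.example/data",
}
print(resolve_current_ivoid("ivo://old.example/data", links))
```

The cycle and hop checks are there because nothing would stop two publishers from pointing at each other, and a tool should fail loudly rather than loop.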

I'd be happy to support folks who'd want to develop such a thing (including with a DaCHS implementation and as much wisecracking as you can stand); I don't think it's material for VOResource 1.2, though, and I don't think I'll push something like this along any time soon.

BaptisteCecconi commented 2 days ago

I like this idea of a DiscontinuedResource concept, and the IsContinuedBy relationship would be fine.

You may have guessed that Pierre's email about our orphaned naming authority and ivo-id's is related to this issue. So we could start with this case as a demonstrator...

msdemlei commented 2 days ago

On Wed, Oct 16, 2024 at 06:55:16AM -0700, Baptiste Cecconi wrote:

> I like this idea of a DiscontinuedResource concept, and the IsContinuedBy relationship would be fine.

Ok... I'd say we ought to write a note (will you have time in Malta?) to explore what such a thing entails and whether there are hidden snares. Implementation on the publishing registry end would be fast then, I think.

BaptisteCecconi commented 2 days ago

I'll be in Malta from Thursday afternoon to Sunday noon. Just coming for the IVOA. But I should be able to find time to work on this.