Open gilleslandais opened 1 week ago
On Wed, Nov 13, 2024 at 02:31:09AM -0800, gilleslandais wrote:
> Are there good practices to expose evolving datasets?
As far as the Registry goes: probably not. At least I am not aware of any.
> For instance a tag? A version?
I think VOResource's version is, for now, intended for releases. Perhaps a pragmatic and simple solution is to say "if you have continuous updates, append '-updating' to your version"?
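A minimal sketch of what that suffix convention could look like in client code. The function names are hypothetical, and the '-updating' suffix is only the suggestion made here, not something VOResource defines:

```python
from typing import Optional

# Suffix proposed in this thread; an assumption, not part of VOResource.
UPDATING_SUFFIX = "-updating"


def mark_updating(version: str) -> str:
    """Append the suffix to a version string, leaving it alone if already present."""
    if version.endswith(UPDATING_SUFFIX):
        return version
    return version + UPDATING_SUFFIX


def is_updating(version: Optional[str]) -> bool:
    """True if the version string declares continuous updates under this convention."""
    return bool(version) and version.endswith(UPDATING_SUFFIX)
```

A resource without any version string is treated as not declaring continuous updates, which is itself a design choice that would need to be agreed on.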
> VizieR provides some evolving catalogues (logs of observations, an Obscore table that gathers spectra/images coming from articles).
At GAVO, we have quite a few of those, too. So, I'd be happy to prototype anything, too.
> These datasets are only available in their latest version. For reproducibility reasons, the information is useful to users.
For reproducibility (in the sense of: "be aware that the same query might return different results later"), I think it would be enough to say "Please include 'this resource is updated on a timescale of [days|weeks|months]' in the description". I don't think that use case needs machine readability.
Are there other use cases that require machine-readable mutability information? Yesterday, Gilles and I speculated about components that harvest substantial parts of the VO and could skip immutable resources on re-queries. Is that a thing anyone would want?
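To make the harvesting use case concrete, here is a rough sketch of the selection step such a component might run. Everything in it is an assumption for illustration: the `ResourceRecord` type, its `mutable` flag, and the example identifiers are hypothetical, since the Registry currently has no machine-readable mutability declaration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class ResourceRecord:
    ivoid: str      # IVOA identifier of the resource
    mutable: bool   # hypothetical flag; no such declaration exists today


def select_for_harvest(
    records: Iterable[ResourceRecord], already_harvested: Set[str]
) -> List[ResourceRecord]:
    """Return the records a re-harvest needs to visit.

    Mutable resources are always revisited; immutable ones only if they
    have never been harvested before.
    """
    return [
        r for r in records
        if r.mutable or r.ivoid not in already_harvested
    ]


# Usage with made-up identifiers:
records = [
    ResourceRecord("ivo://example/static-catalogue", mutable=False),
    ResourceRecord("ivo://example/obslog", mutable=True),
    ResourceRecord("ivo://example/new-release", mutable=False),
]
seen = {"ivo://example/static-catalogue", "ivo://example/obslog"}
todo = select_for_harvest(records, seen)
```

Here only the observation log and the never-seen release would be queried again; the static catalogue is skipped.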
If we decide we do want to go there, I see a few options for machine-readable declaration of mutability:
If it's the latter, we ought to hurry, as VOResource 1.2 (where this could go in) is just entering RFC.
About evolving datasets.
Are there good practices to expose evolving datasets? For instance a tag? A version? VizieR provides some evolving catalogues (logs of observations, an Obscore table that gathers spectra/images coming from articles). These datasets are only available in their latest version. For reproducibility reasons, the information is useful to users.
eg: