ivoa-std / DataLink

DataLink standard (DAL)

addition of templated endpoints mechanism #28

Closed Bonnarel closed 4 years ago

Bonnarel commented 4 years ago

We lack a mechanism to describe variable RESTful endpoints. This can be achieved by the templating mechanism described here. It is related to issue #27.

mbtaylor commented 4 years ago

I'm not convinced this is a good idea in any case (is the additional complication justified by the benefits?), but if it is to be accepted, it needs to be much more carefully worded than

"The templating scheme adopted in the value attribute of this PARAM is following the appropriate IETF RFC \citep{std:RFC6570}."

RFC6570 is fairly complicated, and IMHO it is not a good idea to require DataLink parsing software to incorporate a full RFC6570 template parser. At the very least it's necessary to say which RFC6570 Levels are permitted, and to be explicit about how table content is turned into template variables. An example would also be a good idea.
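To make the "which Levels are permitted" point concrete, here is a minimal sketch of what a client restricted to RFC 6570 Level 1 would have to implement. It is not part of the PR: the function name `expand_level1`, the rule that template variables are looked up by name in a per-row dict, and the narrowed variable-name pattern are all assumptions made for illustration.

```python
import re
from urllib.parse import quote

# RFC 6570 Level 1 allows only plain "{name}" expressions.
_EXPRESSION = re.compile(r"\{([^{}]*)\}")
# Assumption: variable names are restricted to letters, digits and "_",
# which is narrower than what RFC 6570 itself permits.
_VARNAME = re.compile(r"[A-Za-z0-9_]+")


def expand_level1(template: str, row: dict) -> str:
    """Expand Level 1 expressions, taking variable values from one table row."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if not _VARNAME.fullmatch(name):
            # Operators such as "+", "#" or "/" belong to Levels 2-4.
            raise ValueError(f"unsupported expression: {{{name}}}")
        if row.get(name) is None:
            # Undefined variables expand to the empty string.
            return ""
        # Level 1 percent-encodes everything outside the unreserved set.
        return quote(str(row[name]), safe="")

    return _EXPRESSION.sub(substitute, template)
```

Even this restricted form needs explicit decisions on escaping, undefined values and unsupported operators, which is the kind of detail the proposed wording leaves open.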

Also, I don't understand why this PR pulls in provdm, provsap and provtap citations alongside RFC6570.

Bonnarel commented 4 years ago

Hi Mark, all

On 30/10/2019 at 13:58, Mark Taylor wrote:

I'm not convinced this is a good idea in any case (is the additional complication justified by the benefits?),

Use cases exist. Take the Herschel Observation log. The Proposal column contains a string that is actually a fragment to be appended to this root URL: http://herschel.esac.esa.int/Docs/KPOT/KPOT_accepted.html# Examples: 1) http://herschel.esac.esa.int/Docs/KPOT/KPOT_accepted.html#DDT_kjusttan_3 2) http://herschel.esac.esa.int/Docs/KPOT/KPOT_accepted.html#DDT_jcernich_10

This is typical of the non-parametric URLs we would like to build from column contents in a RESTful context. The current service descriptor doesn't allow such things.
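As a rough illustration of what a client would have to do for this use case, the sketch below builds the two example URLs from Proposal column values. The helper name `proposal_url` and the choice to percent-encode the fragment are assumptions, not something defined in the PR. In RFC 6570 notation this corresponds to fragment expansion (`{#Proposal}`), which is a Level 2 feature.

```python
from urllib.parse import quote

# Root URL taken from the Herschel example above, without the trailing "#".
ROOT_URL = "http://herschel.esac.esa.int/Docs/KPOT/KPOT_accepted.html"


def proposal_url(proposal: str) -> str:
    """Append one Proposal column value to the root URL as a fragment."""
    # Assumption: anything outside the unreserved set is percent-encoded.
    return ROOT_URL + "#" + quote(proposal, safe="")


# Values as they would appear in the Proposal column.
for value in ("DDT_kjusttan_3", "DDT_jcernich_10"):
    print(proposal_url(value))
```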

but if it is to be accepted, it needs to be much more carefully worded than

"The templating scheme adopted in the value attribute of this
PARAM is following the appropriate IETF RFC \citep{std:RFC6570}."

This proposal was introduced by Laurent in response to the argument that the (non-normative) VOTable URL templating mechanism in the LINK element is IVOA-specific and that we should use existing standards instead.

The initial proposal I made a year ago was to adopt the mechanism proposed for LINK in the VOTable appendix.

This is actually how VizieR solves this use case. Find below the FIELD with the integrated LINK:

Name of the proposal

RFC6570 is fairly complicated, and IMHO it is not a good idea to require DataLink parsing software to incorporate a full RFC6570 template parser. At the very least it's necessary to say which RFC6570 Levels are permitted, and to be explicit about how table content is turned into template variables.

I have nothing against restricting the mechanism to parts of RFC6570. Can we define those parts starting from the use cases we have? (I remember Pat saying something about VOSpace endpoints.)

An alternative is to go back to the mechanism proposed in the VOTable annex and make it NORMATIVE but OPTIONAL.

If we do this, I now realize that we can even avoid defining the PARAMs with utype "template" and value = "the templated thing" that I introduced in the pull request, and instead directly create one LINK element per templated endpoint inside the service descriptor RESOURCE. The VOTable XSD allows LINK elements to be placed there rather than inside a FIELD or PARAM (the usual practice).

An example would also be a good idea.

See above, but more may come.

Also, I don't understand why this PR pulls in provdm, provsap and provtap citations alongside RFC6570.

Oops. I made a mistake in reinitializing the branches.

This modification of the bib file is related to pull request #9 (additional use cases).

Cheers, François


msdemlei commented 4 years ago

On Fri, Nov 08, 2019 at 07:39:47AM -0800, Bonnarel wrote:

On 30/10/2019 at 13:58, Mark Taylor wrote:

I'm not convinced this is a good idea in any case (is the additional complication justified by the benefits?),

+1 on this lack of conviction.

Use cases exist. Take the Herschel Observation log. The Proposal column contains a string that is actually a fragment to be appended to this root URL: http://herschel.esac.esa.int/Docs/KPOT/KPOT_accepted.html# Examples: 1) http://herschel.esac.esa.int/Docs/KPOT/KPOT_accepted.html#DDT_kjusttan_3 2) http://herschel.esac.esa.int/Docs/KPOT/KPOT_accepted.html#DDT_jcernich_10

This is typical of the non-parametric URLs we would like to build from column contents in a RESTful context. The current service descriptor doesn't allow such things.

...and perhaps the "direct" service descriptor isn't the right place for this kind of thing. You see, it was originally intended as a quick-and-dirty shortcut to save clients a request to the actual datalink document for things like cutouts on spectra, and I've regretted its introduction quite often by now -- it just breaks too much logic (e.g., communicating parameter limits).

Anyway, I'd say the proper way to solve your use case is to have a datalink service and have the descriptor on the response point there.

From that service, you can easily serve the links; there is no limit whatsoever to how services can build URLs in there. This is preferable in many ways, in particular because clients get semantics and descriptions for the links in ways they are familiar with. The only cost is an extra HTTP request, and that's cheap these days (in particular if you speak HTTP 1.1).

I can only repeat that a full templating mechanism (including border cases, escaping, etc) is a complicated thing, and we shouldn't pull in such a monster just to save a single request (ok, per retrieved document in the worst case, but given that size(datalink) << size(associated document) in the general case, that's still low cost).

     -- Markus

Bonnarel commented 4 years ago

Hi Markus, all,

I am looking at your answer with Laurent.

Before responding to your ideas, we are trying to work out whether we have understood your points correctly.

Please see below. On 08/11/2019 at 16:54, msdemlei wrote:

On Fri, Nov 08, 2019 at 07:39:47AM -0800, Bonnarel wrote:

On 30/10/2019 at 13:58, Mark Taylor wrote:

I'm not convinced this is a good idea in any case (is the additional complication justified by the benefits?),

+1 on this lack of conviction.

Use cases exist. Take the Herschel Observation log. The Proposal column contains a string that is actually a fragment to be appended to this root URL: http://herschel.esac.esa.int/Docs/KPOT/KPOT_accepted.html# Examples: 1) http://herschel.esac.esa.int/Docs/KPOT/KPOT_accepted.html#DDT_kjusttan_3 2) http://herschel.esac.esa.int/Docs/KPOT/KPOT_accepted.html#DDT_jcernich_10

This is typical of the non-parametric URLs we would like to build from column contents in a RESTful context. The current service descriptor doesn't allow such things.

...and perhaps the "direct" service descriptor isn't the right place for this kind of thing. You see, it was originally intended as a quick-and-dirty shortcut to save clients a request to the actual datalink document for things like cutouts on spectra, and I've regretted its introduction quite often by now -- it just breaks too much logic (e.g., communicating parameter limits).

What do you mean by a "direct" service? Our current understanding is that you are referring to service descriptors in the responses of standard services (TAP, SSA, SIA, etc.). Is that correct?

Anyway, I'd say the proper way to solve your use case is to have a datalink service and have the descriptor on the response point there.

From that service, you can easily serve the links; there is no limit whatsoever to how services can build URLs in there.

Do you mean building a fixed "root URL" for the service and defining several descriptors this way (one for each of the possible URLs you want to generate for a given main item)?

This is preferable in many ways, in particular because clients get semantics and descriptions for the links in ways they are familiar with. The only cost is an extra HTTP request, and that's cheap these days (in particular if you speak HTTP 1.1).

Do you mean a first HTTP request to the DataLink service and a second HTTP request for the URL generated from the descriptor?

I can only repeat that a full templating mechanism (including border cases, escaping, etc) is a complicated thing, and we shouldn't pull in such a monster just to save a single request (ok, per retrieved document in the worst case, but given that size(datalink) << size(associated document) in the general case, that's still low cost).

Regards, François

-- Markus


msdemlei commented 4 years ago

On Tue, Nov 12, 2019 at 07:32:44AM -0800, Bonnarel wrote:

Before responding to your ideas, we are trying to work out whether we have understood your points correctly.

On 08/11/2019 at 16:54, msdemlei wrote:

On Fri, Nov 08, 2019 at 07:39:47AM -0800, Bonnarel wrote:

...and perhaps the "direct" service descriptor isn't the right place for this kind of thing. You see, it was originally intended as a quick-and-dirty shortcut to save clients a request to the actual datalink document for things like cutouts on spectra, and I've regretted its introduction quite often by now -- it just breaks too much logic (e.g., communicating parameter limits).

What do you mean by a "direct" service? Our current understanding is that you are referring to service descriptors in the responses of standard services (TAP, SSA, SIA, etc.). Is that correct?

"Direct" means a service descriptor directly yielding the links, as opposed to the standard service descriptor pointing to the datalink service (standardID ivo://ivoa.net/std/DataLink#links-1.0).

Anyway, I'd say the proper way to solve your use case is to have a datalink service and have the descriptor on the response point there.

From that service, you can easily serve the links; there is no limit whatsoever to how services can build URLs in there.

Do you mean building a fixed "root URL" for the service and defining several descriptors this way (one for each of the possible URLs you want to generate for a given main item)?

No. You put your links into a normal datalink table. The original service response just has the one descriptor for what the current standard calls a {links} capability.

This is preferable in many ways, in particular because clients get semantics and descriptions for the links in ways they are familiar with. The only cost is an extra HTTP request, and that's cheap these days (in particular if you speak HTTP 1.1).

Do you mean a first HTTP request to the DataLink service and a second HTTP request for the URL generated from the descriptor?

Well, the first HTTP request fetches the datalink document (it goes "to the {links} capability"). The second, third, etc. then pull the actual items linked in there, with the full metadata you have in datalink tables (which, frankly, I consider the main advantage of that approach and the main reason to avoid "direct" descriptors).

     -- Markus
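The request sequence described in this last comment can be sketched as follows. This is a minimal illustration rather than client code from any existing library: the function names, the example endpoint, and the representation of the datalink table as a list of dicts are all assumptions. It relies on the `ID` parameter of the {links} endpoint and on the `access_url` and `semantics` columns of the datalink response table.

```python
from typing import Dict, List
from urllib.parse import urlencode


def links_request_url(links_endpoint: str, dataset_id: str) -> str:
    """First request: ask the {links} capability for everything tied to one ID."""
    query = urlencode({"ID": dataset_id})
    return links_endpoint + "?" + query


def follow_up_urls(link_rows: List[Dict[str, str]], semantics: str) -> List[str]:
    """Later requests: the access_url of each row with the wanted semantics."""
    return [
        row["access_url"]
        for row in link_rows
        if row.get("semantics") == semantics and row.get("access_url")
    ]


# Hypothetical rows, as they might be read from a datalink response table.
rows = [
    {"access_url": "http://example.org/data/1.fits", "semantics": "#this"},
    {"access_url": "http://example.org/data/1.png", "semantics": "#preview"},
]
print(links_request_url("http://example.org/datalink/links", "ivo://example/obs?1"))
print(follow_up_urls(rows, "#preview"))
```

The URLs in the table are built entirely on the server side, so the client needs no templating logic and gets a description and semantics for every link.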