ivoa-std / DataLink

DataLink standard (DAL)

How to control consistent limits in the number of retrieved links ? #45

Closed Bonnarel closed 3 years ago

Bonnarel commented 4 years ago

Markus wrote

" If the client submits more ID values than a service is prepared to process, the service should process ID values up to the limit and must include an overflow indicator in the output as described in DALI. The service must not truncate the output within the set of rows (links) for a single ID value if the request exceeds such an input limit."

Control by MAXREC or by telling the client there is an overflow ?

<INFO name="QUERY_STATUS" value="OVERFLOW"/>

None of these are in 1.0, nor in 1.1 at the moment. Mark Taylor seems to be OK with QUERY_STATUS OVERFLOW. Alberto and ESO have another solution for tuning the number of output lines.

Thoughts ?

msdemlei commented 4 years ago

On Fri, May 08, 2020 at 01:48:59PM -0700, Bonnarel wrote:

Markus wrote

" If the client submits more ID values than a service is prepared to process, the service should process ID values up to the limit and must include an overflow indicator in the output as described in DALI. The service must not truncate the output within the set of rows (links) for a single ID value if the request exceeds such an input limit."

For the record, that's not my text, that's Datalink REC-1.0, p. 10. The context is this thread on the DAL list: http://mail.ivoa.net/pipermail/dal/2020-March/008318.html.

Control by MAXREC or by telling the client there is an overflow ?

<INFO name="QUERY_STATUS" value="OVERFLOW"/>

None of these are in 1.0, nor in 1.1 at the moment. Mark Taylor seems to be OK with QUERY_STATUS OVERFLOW. Alberto and ESO have another solution for tuning the number of output lines.

Thoughts ?

I think I agree with Mark's assessment in the cited thread: QUERY_STATUS=OK and QUERY_STATUS=ERROR aren't useful in Datalink, and hence we shouldn't put them in. Also, it doesn't seem there's much of a place for MAXREC in Datalink.

Hence, I think there's no immediate need for changes in the spec text, let alone the spec content from this issue.

One might argue for writing something like:

No QUERY_STATUS INFOs with values other than OVERFLOW should be produced by datalink services.

That's probably benign, since we can't change the overflow indication in DALI anyway when it is directly referenced by implemented standards, and thus we can hard-code QUERY_STATUS here, too. It would perhaps have saved me a bit of bafflement. On the other hand: has this ever baffled anyone else? And so badly as to justify more spec text?

Similarly, perhaps it is worth saying somewhere that DALI MAXREC doesn't apply to Datalink, but I couldn't say where that text would fit without seeming odd itself.

So... I think my vote would be for closing this issue without action.

pdowler commented 4 years ago

Clarify the text to make it clear that OVERFLOW must be included if the result is truncated.

Zarquan commented 4 years ago

As far as I can tell, the specification doesn't say the results should be ordered by ID? In which case, if the client sends multiple IDs and the server responds with OVERFLOW, the client won't know which IDs are complete and which were truncated.

pdowler commented 4 years ago

The output certainly needs to be grouped by ID, but not absolutely ordered. It's a subtle difference, but it should be clarified.
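To illustrate the client side of this, here is a minimal Python sketch. It assumes the rule quoted from the spec (a service never truncates within the rows of a single ID) and that every processed ID yields at least one row. The function names and the `(id, link)` tuple layout are made up for illustration and are not part of any DataLink library.

```python
def is_grouped_by_id(rows):
    """Return True if all rows sharing an ID are contiguous.

    rows is a sequence of (id, link) tuples; no ordering of the
    groups themselves is required.
    """
    finished = set()
    current = None
    for row_id, _link in rows:
        if row_id != current:
            if row_id in finished:
                # This ID already had a group earlier in the output.
                return False
            if current is not None:
                finished.add(current)
            current = row_id
    return True


def ids_to_resubmit(requested_ids, rows, overflow):
    """Return the requested IDs that never showed up in the output.

    Assumption: an ID that appears at all is complete, because the
    service must not truncate within a single ID's links.
    """
    if not overflow:
        return []
    answered = {row_id for row_id, _link in rows}
    return [i for i in requested_ids if i not in answered]
```

With these assumptions, a client that sees OVERFLOW only needs to resubmit the IDs that are missing from the response; it does not need the groups to arrive in any particular order.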

pdowler commented 4 years ago

In retrospect, we could have gone without a MAXREC param entirely. Its purpose in client output control is mainly for quick/small response (small MAXREC) which can be done by using a single ID value. The other extreme use case (for large output) is lower overhead when there are many IDs to process.

MAXREC=1&ID=foo has the same output with and without the MAXREC: all the links for foo

In neither case is a client-specified MAXREC=1000 ever really useful; not sure if it actually makes implementation harder or not, but it makes the spec trickier than necessary.

The service still needs to be able to truncate output (w/ OVERFLOW) if it has implementation reasons to do so.
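A rough Python sketch of what that could look like on the service side, assuming the limit is expressed as a maximum number of ID values (as in the quoted spec text). `collect_links`, `lookup_links` and `max_ids` are hypothetical names, not taken from any existing implementation.

```python
def collect_links(ids, lookup_links, max_ids):
    """Process ID values up to a service-side limit.

    ids          -- list of ID values submitted by the client
    lookup_links -- callable returning all links for one ID (hypothetical)
    max_ids      -- how many ID values the service is prepared to process

    Returns (rows, overflow). Truncation happens only between IDs,
    never inside the set of links for a single ID.
    """
    rows = []
    for id_ in ids[:max_ids]:
        for link in lookup_links(id_):
            rows.append((id_, link))
    overflow = len(ids) > max_ids
    return rows, overflow
```

When `overflow` is true, the service would add the OVERFLOW indicator to the output; all rows that were written belong to fully processed IDs.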

mbtaylor commented 4 years ago

I think we did go without MAXREC, didn't we? I don't see a reference to it in the DataLink text. It only appears in this issue as speculation about what we might do (and nobody seems keen to have it). Unless I'm missing something...

pdowler commented 4 years ago

Yes, you are right. In that case the spec is pretty much fine as is, except that "must include an overflow indicator in the output as described in DALI" apparently needs to be spelled out more explicitly.

On second reading (of DALI), the overflow indicator in DALI is unnecessarily coupled to "exceeds MAXREC" when in practice services can have limits and MAXREC is a way for the client to (try to) influence the limit. I'll make a note to clarify that in DALI as well.
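For reference, a small Python sketch of how a client might detect the indicator in a links response, independent of whether the limit came from MAXREC or from the service itself. The sample document and the function name `has_overflow` are illustrative assumptions; the check matches on the local element name so that it does not depend on a particular VOTable namespace version.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened links response.
SAMPLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.3">
  <RESOURCE type="results">
    <TABLE>
      <FIELD name="ID" datatype="char" arraysize="*"/>
      <DATA><TABLEDATA>
        <TR><TD>ivo://example/foo</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
    <INFO name="QUERY_STATUS" value="OVERFLOW"/>
  </RESOURCE>
</VOTABLE>"""


def has_overflow(votable_xml):
    """Return True if the document carries a QUERY_STATUS=OVERFLOW INFO."""
    root = ET.fromstring(votable_xml)
    for elem in root.iter():
        # Strip a "{namespace}" prefix, if any, to get the local name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if (
            local_name == "INFO"
            and elem.get("name") == "QUERY_STATUS"
            and elem.get("value") == "OVERFLOW"
        ):
            return True
    return False
```

Calling `has_overflow(SAMPLE)` on the sample above reports the truncation; a response without that INFO element is treated as complete.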

pdowler commented 3 years ago

I volunteer to write this and create a PR

Bonnarel commented 3 years ago

This is great !!

Thanks

On 02/09/2020 at 15:26, Patrick Dowler wrote:

I volunteer to write this and create a PR
