ivoa-std / SODA

Server-side Operations for Data Access
Creative Commons Attribution Share Alike 4.0 International

Managing multiple PARAMS queries responses #6

Open Bonnarel opened 4 years ago

Bonnarel commented 4 years ago

When several of the query parameters (POS, BAND, etc.) have multiple values, we face a cartesian product issue: how do we return all the results? This seems easy in async mode, which is built to manage multiple results. But in sync mode? MEF formats? Archive files? A VOTable containing pointers? What about using the UPLOAD functionality (DALI, 3.4.5 UPLOAD) for this?

pdowler commented 3 years ago

The current spec allows the service implementation to reject multiple params, so that a provider can limit how complex the output gets. At CADC we limit SODA-sync to a single value for each param, and we limit SODA-async to a single ID value but allow multiple "cutout" values and perform the cartesian product.
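To make the cartesian product concrete, here is a minimal Python sketch. It is not CADC's code: the function name `expand_cutouts`, the identifier, and the parameter values are all hypothetical. It only shows how a single ID combined with several POS and BAND values expands into one cutout request per combination.

```python
from itertools import product

def expand_cutouts(id_value, params):
    """Return one dict per combination of the multi-valued cutout params.

    `params` maps a param name (e.g. "POS", "BAND") to a list of values.
    """
    names = list(params)
    combos = product(*(params[n] for n in names))
    return [{"ID": id_value, **dict(zip(names, combo))} for combo in combos]

jobs = expand_cutouts(
    "ivo://example/obs1",  # hypothetical identifier
    {
        "POS": ["CIRCLE 10 20 0.5", "CIRCLE 30 40 0.5"],
        "BAND": ["500 550", "600 650"],
    },
)
print(len(jobs))  # 2 POS values x 2 BAND values = 4 combinations
```

The number of results grows multiplicatively with each multi-valued param, which is why a provider may want to cap it.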

It will always be complex and require arbitrary packaging for sync to support multiple params. The CADC implementation could allow it and produce MEF files, but that introduces a subtle complexity that I didn't really like: if you do 2 POS cutouts from a simple FITS file, you could implement that to do any one of these:

  1. create an MEF with one extension per POS, but then you have to fabricate a sensible primary header (you cannot just keep all the cards from the original one)... conceptually I didn't like going from simple FITS to MEF
  2. create another simple FITS file with a sparse data array covering both POS values... but that is much larger than necessary if the two POS values are far apart, and it doesn't give the caller what they asked for: two cutouts
  3. create a package (e.g. tar) with multiple simple FITS files in it... I have written code to create a tar on the fly and CADC could do this, but then you get into the situation where some requests to SODA-sync return FITS and others return TAR. Plus, on-the-fly packaging has all kinds of operational complexity (files stored in different locations because of a distributed storage architecture is the case we have and don't like at all)
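As an illustration of option 3 only, the sketch below packs several cutouts into a tar archive in memory using Python's standard `tarfile` module. It is not the CADC code: the function name `pack_cutouts` and the file names are hypothetical, and the payloads are placeholder bytes rather than real FITS data. It also ignores the distributed-storage problem described above.

```python
import io
import tarfile

def pack_cutouts(cutouts):
    """Pack {filename: bytes} into an in-memory tar archive and return its bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in cutouts.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # the size must be set before adding the member
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

# Placeholder payloads standing in for two cutout files.
archive = pack_cutouts({
    "cutout_0.fits": b"placeholder bytes 0",
    "cutout_1.fits": b"placeholder bytes 1",
})

# Read the archive back to list its members.
with tarfile.open(fileobj=io.BytesIO(archive)) as tar:
    names = tar.getnames()
print(names)
```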

I'm sure there are more ways to do it with their pros and cons. Mostly you probably end up with different behaviour from different providers and that's bad for users. So, my feeling now is that SODA-sync should have explicitly only described/specified a single cutout (one value of each cutout param). Since SODA-async has the facility to manage multiple results I don't think inventing a way to do it in SODA-sync adds much.

pdowler commented 3 years ago

As for the cartesian product issue, if we want to enable users to finesse the way orthogonal params are combined we could easily define an upload table with a column for each param; pretty much just have to say that VOTable FIELD name = param name and everything else falls out. Such a table would be (in UWS parlance) a "job description language" and could be handled as such rather than in the style of an UPLOAD param. TBD.
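A rough sketch of how such a table could be read, assuming the "FIELD name = param name" convention suggested above. Nothing here is specified by SODA: the function name `rows_as_param_sets` and the sample values are hypothetical, and every column is declared as a string for simplicity. Each table row becomes one explicit parameter set, so the caller lists exactly the combinations wanted instead of a full cartesian product.

```python
import xml.etree.ElementTree as ET

NS = {"v": "http://www.ivoa.net/xml/VOTable/v1.3"}

# Hypothetical upload table: one column per param, one row per cutout.
VOTABLE = """<VOTABLE version="1.3" xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
  <RESOURCE>
    <TABLE>
      <FIELD name="ID" datatype="char" arraysize="*"/>
      <FIELD name="POS" datatype="char" arraysize="*"/>
      <FIELD name="BAND" datatype="char" arraysize="*"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>obs1</TD><TD>CIRCLE 10 20 0.5</TD><TD>500 550</TD></TR>
          <TR><TD>obs2</TD><TD>CIRCLE 30 40 0.5</TD><TD>600 650</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

def rows_as_param_sets(votable_xml):
    """Map each TABLEDATA row to a dict keyed by FIELD name (= param name)."""
    root = ET.fromstring(votable_xml)
    table = root.find(".//v:TABLE", NS)
    names = [field.get("name") for field in table.findall("v:FIELD", NS)]
    param_sets = []
    for row in table.findall("v:DATA/v:TABLEDATA/v:TR", NS):
        values = [cell.text for cell in row.findall("v:TD", NS)]
        param_sets.append(dict(zip(names, values)))
    return param_sets

param_sets = rows_as_param_sets(VOTABLE)
print(param_sets)
```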

molinaro-m commented 2 years ago

At INAF, for the "vialactea" requirements, we need an async endpoint that allows multiple IDs, each with its own set of POS (and possibly BAND) cutouts. This would require the VOTable solution above to be POST-ed, and we're going for a tar job response that includes a JSON description (completely custom). Could the experiment be used to prototype a SODA feature? Or would that be better contained in a custom extension?

Bonnarel commented 2 years ago

Hi Marco, all,

I think that if you have already implemented it and exposed it in your service, that would be the best basis for proposing it as a new feature in SODA-next.

Cheers

François

molinaro-m commented 2 years ago

Good, implementation is on its way, I/we'll keep details posted.

Bonnarel commented 1 year ago

Any news on that implementation?

Bonnarel commented 10 months ago

Any news on this implementation, @molinaro-m (Marco)?

molinaro-m commented 10 months ago

Hi François,

Sorry for writing to you privately about this.

I asked Robert to comment on it; he has the technical details.

Cheers Marco

robertbutora commented 10 months ago

Hey, here is a brief summary:

At the request of the client developers, we settled on a JSON-based JDL: we use a JSON array (which by definition preserves order) to collect the input parameters, in our case SODA parameters, to be sent to the server. The response must always have the same length as the request, so the array index associates each request parameter set with its result set. Let me know if more details are needed. Robert
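The index-based association can be sketched as follows. The actual INAF schema is not given in this thread, so the keys (`ID`, `POS`, `result`), the values, and the `fake_server` stand-in are all hypothetical; only the two stated rules are taken from the description: order is preserved, and the response has the same length as the request.

```python
import json

# Hypothetical request: a JSON array with one parameter set per cutout.
request_body = json.dumps([
    {"ID": "obs1", "POS": "CIRCLE 10 20 0.5"},
    {"ID": "obs2", "POS": "CIRCLE 30 40 0.5"},
])

def fake_server(body):
    """Stand-in for the service: one result per request entry, in the same order."""
    entries = json.loads(body)
    return json.dumps([{"result": f"cutout_{i}.fits"} for i in range(len(entries))])

def pair_by_index(request_json, response_json):
    """Associate each request parameter set with the result at the same index."""
    requests = json.loads(request_json)
    results = json.loads(response_json)
    if len(requests) != len(results):
        raise ValueError("response length must equal request length")
    return list(zip(requests, results))

pairs = pair_by_index(request_body, fake_server(request_body))
for params, result in pairs:
    print(params["ID"], "->", result["result"])
```

Relying on position avoids echoing the parameters back in the response, at the cost of requiring the server to emit a placeholder entry even when one cutout fails.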