ivoa-std / DALI

DALI: Data AccessLayer Interface
Creative Commons Attribution Share Alike 4.0 International
1 stars 5 forks source link

new xtypes: range, hms, dms, uri, uuid #24

Closed pdowler closed 1 year ago

pdowler commented 1 year ago

removed multipolygon from shape added range to shape for SIA and SODA compat

pdowler commented 1 year ago

still need to get the RFC 4122 citation included

msdemlei commented 1 year ago

On Tue, Jun 06, 2023 at 02:34:43PM -0700, Patrick Dowler wrote:

@pdowler commented on this pull request. As for shape also including moc, I have not implemented that and so while it sounds logical I can't say what's involved in doing a good job of that. I do think moc and other shapes are for

It's orders of magnitude simpler than multipolygon, at least if you're using pgsphere (and I'd so much wish we'd dump multipolygon in favour of MOCs).

different purposes so it's not clear if there is a compelling use case at this time.

As soon as I have more complicated shapes in my obscore table, I, for one, will almost certainly want to move to MOC (right now, I only have polygons and turn circles (e.g., spectra with aperture) into approximating polygons).

If shape did include moc then an upload table could include circles, polygons, and mocs and a service would be trying to put

Frankly, the only way I can see myself implementing anything like shape in an upload table is through MOCs, in particular if we allow multipolygon. That's why I'm so after loosening the requirement on roundtripping geometries (i.e., I want to legally return a MOC when I get some geometry in).

I'd do that. I think it also likely that someone who doesn't support moc would have "partial shape support" and since shape is also the xtype to describe the SIA and SODA POS param that would probably introduce confusion and pain. So I would not want to add

I'd really be curious how you do uploads of shape columns anyway... If it's through the famous hidden columns then please let's not do it. They're an extremely painful kludge that'll break left and right in dozens of corner cases.

things to such a polymorphic xtype and what that means. We need polymorphism, but it's hard and we shouldn't go overboard :-)

MOCs FTW. But could you quickly point me to the use case(s) of shape in upload tables? Perhaps I can think of a less horrible alternative...

pdowler commented 1 year ago

use case for shape as input:

SIA and SODA POS param.

ADQL REGION function string arg should refer to DALI shape.

People can upload table of shapes now in TAP, which would just be a string, but once xtype="shape" is defined in DALI they have some expectation that it will behave like a geometry (in query). Presumably we have to add something to one or more registry extensions so services can say which xtypes they support (or require specific error messages for unsupported input data types)... so that applies to TAP at a minimum.

It was always the plan to add table upload support to SIA (now DAP), where one could have a column of geometries and I expect refer to the column with a query param (details TBD). The POS param is xtype="shape" (as currently specified in this PR and with the addition of range) so that would immediately lead to an expectation of upload tables with a shape column in there. Right now, with multiple occurrences of POS param, implementations can plausibly just do that inline in the SQL but the whole point of upload table would be to scale up the list such that inline wasn't the right solution (my opinion).

In general, not everyone uses postgresql and pgsphere so we can't hang everything on moc (or any other specific type) and we have to allow service implementers to say "that bit is to hard to do in my org-mandated db server". Maybe we should specify a standard string to put in error messages, eg

unsupported-xtype: multipolygon [optional extra detail here]

so clients can reasonably figure out what to look for.

almicol commented 1 year ago

Hi Gregory, Markus, and Pat,

While I still have to study DALI “multipolygons” and the new ADQL version (sorry I’m late for that, but I still hope to be able to give comments before June 20th), I can just say that at ESO we need points, polygons, multipolygons, and circle, so “shape” is the polymorphism we need.

Regarding MOC, the ESO databases (MS SQLServer, SAP IQ) do not support them.

Regarding use cases to upload shapes, I can see 2 use cases:

  1. A user interested in finding counterpart of a gravitational wave could either upload a multipolygon or a MOC describing the GW credible region to find ObsCore products that could have observed the given GW event.
  2. A user wants to cross-correlate ObsCore s_region footprints of observed/calibrated datasets (some are circle, some are multipolygon) with a different TAP service offering catalogs, to see which catalog sources match the given footprints.

Off now for almost a week (in between a champions league soccer match to attend), cheers, Alberto

From: Grégory Mantelet @.> Reply to: ivoa-std/DALI @.> Date: Friday, 2. June 2023 at 16:00 To: ivoa-std/DALI @.> Cc: Alberto Micol @.>, Mention @.***> Subject: Re: [ivoa-std/DALI] new xtypes: range, hms, dms, uri, uuid (PR #24)

This email was sent from outside of ESO from the address @.. If it looks suspicious, please report it to @.

@gmantele commented on this pull request.


In DALI.texhttps://github.com/ivoa-std/DALI/pull/24#discussion_r1214420019:

@@ -866,19 +909,28 @@ \subsubsection{Shape}

Shape values serialised in VOTable or service parameters must have the following metadata in the

\xmlel{FIELD} element: \verb|datatype="char"|, \verb|arraysize="*"|, \verb|xtype="shape"|

(where arraysize may also be fixed length or variable length with limit).

-The value is a polymorphic shape made up of a type label (equivalent to an existing xtype: \verb|circle|,

-\verb|polygon|, or \verb|multipolygon|) and the string serialisation of the value

-as described above. For example:

+The value is a polymorphic shape made up of a type label (equivalent to an existing simple

+geometric xtype and the string serialisation of the value as described above.

+

+The allowed shapes are: \verb|circle|, \verb|range|, \verb|polygon|. For example:

I have two additional questions here:

  1. coming back on what @mbtaylorhttps://github.com/mbtaylor said, how ESO would be able to describe their complicated s_regions (e.g. UNION(POLYGON..POLYGON...))? With a polygon? A multipolygon? It seems not very trivial to do. Maybe @almicolhttps://github.com/almicol has an idea?
  2. Would it be possible to add moc in the list of allowed shapes? I think it would be a useful way to describe the region covered by an observation in ObsCore.

— Reply to this email directly, view it on GitHubhttps://github.com/ivoa-std/DALI/pull/24#discussion_r1214420019, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ADCLQ452LH3HMKINL56XPKLXJHWWPANCNFSM6AAAAAAYTDTMNY. You are receiving this because you were mentioned.Message ID: @.***>

msdemlei commented 1 year ago

On Wed, Jun 07, 2023 at 08:44:35AM -0700, Patrick Dowler wrote:

Presumably we have to add something to one or more registry extensions so services can say which xtypes they support (or require specific error messages for unsupported input data types)... so that applies to TAP at a minimum.

I'm more on the side of specific error messages, but I'm even more on the side of writing specs in a way that people won't have to issue too many error messages like this.

It was always the plan to add table upload support to SIA (now DAP), where one could have a column of geometries and I expect refer to the column with a query param (details TBD). The POS param

Hmyeahwell... how would you implement that, unless you're on a database that somehow magically allows circles and polygons (and possibly even multipolygons) in the same column? Does such a database even exist?

In general, not everyone uses postgresql and pgsphere so we can't hang everything on moc (or any other specific type) and we have

Oh, but MOC is an IVOA standard in no way bound to pgsphere. It's in there because GAVO spent the equivalent of the price of a small server on hiring a postgres consultant to put it in.

If cash-strapped GAVO can do that, I'd say it's not unreasonable to politely ask one of the adopters of SQLServer to shell out a similar amount of money just once and have a MOC extension for SQLServer built; actually, it's not that hard anyway (I was just busy elsewhere and couldn't spend the money on paying someone locally). With a bit of luck you could even get a suitably capable summer student to do it.

MOCs simply are so much cooler than anything else representing geometries on a sphere that it's absolutely worth it.

Meanwhile, as long as I'm allowed to return MOCs when someone uploads polygons or multipolygons or STC-S or whatever, I'll shut up.

pdowler commented 1 year ago

I think this is now "done" and completes what we need for DALI-1.2

The extra section about unsupported-xtype error messages indicates the way I envision implementers (client and service) deal with any xtype(s) we might add in future or how we might change xtype="shape" by adding more subtypes. I could see that happening, and maybe soon if @almicol's ESO use cases require that we add multipolygon now rather than later. Basically, anyone should be able to round-trip values even when they don't grok the xtype (that was always true) and only have to fail with this error message when the usage requires it.

I would like to get this to RFC soon as I think the ADQL would benefit from saying that the string value in the REGION function (constructor) are specified by DALI "shape". I still think we can add that in ADQL-2.1 with minimal delay. Right now, REGION has no normative spec so this can finally fix that with minimal actual change in usage.

msdemlei commented 1 year ago

On Tue, Jun 13, 2023 at 11:54:43AM -0700, Patrick Dowler wrote:

multipolygon now rather than later. Basically, anyone should be able to round-trip values even when they don't grok the xtype (that was always true) and only have to fail with this error message when the usage requires it.

Just for the record, I'm still very much in favour of letting people return different geometry types and qualify the round-tripping requirement accordingly. Yes, that gives the xtypes a bit more meaning that they previously had, but it really helps implementability a lot. Carrying through hidden columns and then deciding when they (rather than the value of some expression) should become visible again is painful and fragile.

pdowler commented 1 year ago

Well, there is nothing in DALI that says anything about how column types and values with xtypes are handled through an input + output cycle. That would be up to a service to specify (TAP, DAP, etc). Here we just define them.

I haven't thought through all the pros and cons, but my very initial instinct is that TAP could potentially have more specific requirements than a more abstract protocol like DAP (SIA), if the latter were to support tabular input. There are details of this sort to work out for WD-TAP-1.2 (because TAP_UPLOAD and the proposed user-created tables do mostly the same work and can use the same code) and that's the place to figure out these details.

pdowler commented 1 year ago

Hi all,

It has been silent here but I would like to merge this and have the WD published to the doc repo so we can move forward toward RFC. I have services that will make use of most of the new xtypes that I'll tweak to do so once the WD is published... probably the same is true for other early implementers so we need to take that step and move on to wider community feedback.

If someone could check the technical stuff (build, references, etc) and merhe, that would be great.

pdowler commented 1 year ago

will fix the above in a separate PR