astropy / pyvo

An Astropy affiliated package providing access to remote data and services of the Virtual Observatory (VO) using Python.
https://pyvo.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License
77 stars 52 forks source link

Support for Custom `REQUEST` Methods in `TAPService` Queries #579

Open NucleonGodX opened 3 months ago

NucleonGodX commented 3 months ago

Description:

I am proposing an enhancement to the pyvo.dal.TAPService functionality to support custom REQUESTmethods, such as doQueryFilteredByDistance, in addition to the default doQuery. This feature will provide greater flexibility and utility for users, particularly in the context of integrating the TAPService with the sunpy-soar module.

Context:

The current implementation of the pyvo.dal.TAPService class hardcodes the REQUEST parameter to doQuery. However, there are scenarios where alternative REQUESTmethods are required to perform more specialized queries, such as filtering results by distance using doQueryFilteredByDistance. This was identified during the development and integration efforts within the sunpy-soarproject, where the ability to perform such queries would be helpful for effectively interfacing with the Solar Orbiter Archive (SOAR) API.

Proposed Solution:

Enhance the TAPService class to accept an request_type argument, with the default value set to doQuery. This argument will allow users to specify the desired REQUEST method when constructing their queries.

msdemlei commented 3 months ago

On Sat, Jul 13, 2024 at 07:20:11AM -0700, Manit Singh wrote:

Proposed Solution:

Enhance the TAPService class to accept an request_type argument, with the default value set to doQuery. This argument will allow users to specify the desired REQUEST method when constructing their queries.

Hm... I agree that the value of REQUEST should be configurable to enable various (at this point non-standard) functionality. TOPCAT, for instance, has this "Quick Look" thing that I've never looked into but that I suspect sets some custom REQUEST value, too, and there's the ancient plan to support retrieving query plans. But I don't think TAPService is the right place to put this value; the normal flow would be, I suppose, to, for instance, first retrieve the plan and then send the query if all looks fine, all on the same service object.

This would suggest to me that your request_type ought to be a parameter to run_sync; how non-standard REQUEST parameters should interact with async mode I can't quite see now, but my suspicion is (based on the plan example) that different REQUEST "types" would typically be different endpoints on a Job (but then your doQueryFilteredByDistance -- which I'd be curious about, I have to say; perhaps what you're after would better be served by an ADQL extension? Feel free to post your use case on the DAL list -- example sounds as if that might translate to async, too).

A tricky question is the name of the parameter. The choice of the REQUEST parameter name, a leftover of a (mis-)design of the SSAP era, was IMHO unfortunate on the side of the protocol designers, because "request" is pretty much a term reserved for so many other things. So, I'm pretty sure we shouldn't call the parameter "request", even though that would be most logical in order to keep the run_sync interface as parallel to the actual service interface as possible.

request_type, as you suggest, is probably the least bad option, though I'd be hard pressed to explain how getPlan (say) would be a type. I've thought about request_value, request_method (ugh... clearly much too close to the HTTP method), or request_verb, too; the latter we perhaps should think about for a bit, because it seems to me the actual values of REQUEST would typically be verb phrases.

So: you have my vote in principle, but the details seem a bit messy.