sunpy / sunpy-soar

A sunpy plugin for accessing data in the Solar Orbiter Archive (SOAR).
https://docs.sunpy.org/projects/soar/
BSD 2-Clause "Simplified" License
Query SOAR metadata #46

Open ebuchlin opened 2 years ago

ebuchlin commented 2 years ago

Describe the feature

My understanding is that sunpy-soar currently only supports queries by instrument / time / level / product, as this is basically what is available in the SOAR web query form and in the v_sc_data_item and v_ll_data_item tables. However, the user should also be able to do queries with different metadata (other Fido attributes).

Proposed solution

The list of all tables and their columns is available from SOAR with TAP. I attach a human-readable version (tree by schema / table / column), generated by XSLT with this XSL stylesheet.

This shows that more complete metadata are available in SOAR, in instrument-specific tables, e.g. v_spi_sc_fits. Example query: http://soar.esac.esa.int/soar-sl-tap/tap//sync?REQUEST=doQuery&LANG=ADQL&FORMAT=json&QUERY=SELECT+TOP+10+%2A+FROM+v_spi_sc_fits
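As a rough sketch, the example URL above can be assembled with the Python standard library. The request parameters are copied from that URL; the function name `build_tap_url` is hypothetical, and collapsing the doubled slash before `sync` into a single one is an assumption.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above (assumption: the doubled
# slash before "sync" is not significant and can be collapsed).
TAP_SYNC_URL = "http://soar.esac.esa.int/soar-sl-tap/tap/sync"


def build_tap_url(adql, fmt="json"):
    """Return a synchronous TAP request URL for an ADQL query string.

    Hypothetical helper for illustration; not part of sunpy-soar.
    """
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "FORMAT": fmt, "QUERY": adql}
    # urlencode percent-encodes the query: spaces become "+", "*" becomes "%2A".
    return f"{TAP_SYNC_URL}?{urlencode(params)}"


url = build_tap_url("SELECT TOP 10 * FROM v_spi_sc_fits")
print(url)
```

The resulting URL can then be fetched with any HTTP client.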

Fido attributes should be linked to columns in these different instrument-specific tables. For a query with multiple instruments, multiple tables would have to be queried (or should this not be supported?). Metadata for low-latency (LL) files can also be queried, from yet other tables.
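To illustrate the multi-instrument case, here is a minimal sketch that issues one ADQL query per instrument table. It assumes that the naming of v_spi_sc_fits generalises to other instruments by lowercasing the instrument name, which may not hold for every instrument; `fits_table_name` and `queries_for` are hypothetical helpers, not sunpy-soar functions.

```python
def fits_table_name(instrument, low_latency=False):
    """Guess the instrument-specific FITS metadata table, e.g. v_spi_sc_fits.

    Assumption: the table name uses the lowercased instrument name;
    some instruments may use a different abbreviation.
    """
    kind = "ll" if low_latency else "sc"
    return f"v_{instrument.lower()}_{kind}_fits"


def queries_for(instruments, top=10):
    """Build one ADQL query per instrument, since each has its own table."""
    return {
        inst: f"SELECT TOP {top} * FROM {fits_table_name(inst)}"
        for inst in instruments
    }


queries = queries_for(["SPI", "EUI"])
for inst, adql in queries.items():
    print(inst, "->", adql)
```

The per-table results would then have to be merged on the client side, which is one reason the multi-instrument case is not obviously worth supporting.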

ebuchlin commented 1 year ago

There is now draft documentation for the tables, views, and columns of the SOAR TAP interface: https://www.cosmos.esa.int/web/soar/tables-views-and-columns

ebuchlin commented 1 year ago

There are now new columns soop_name and soop_type available in the SOAR TAP interface.

wtbarnes commented 1 year ago

Is this covered by #84?

ebuchlin commented 1 year ago

For queries by SOOP name it seems that #84 covers it, yes.

wtbarnes commented 1 year ago

@ebuchlin sorry for the lack of traffic on this issue. I definitely agree we should be supporting more complex queries against the SOAR through Fido, but I'm a bit confused as to the scope. Looking at the docs you linked above, it is not quite clear to me what attributes should be supported through the attrs interface. Could you provide an example of what a Fido query would look like with these additional metadata?

The example from @hayesla in #66 makes it a bit more clear to me, but again the issue is what subset of that metadata we should support. I don't think it is practical to try and translate each bit of SOAR metadata to a Fido attribute. However, maybe there could be some sort of interface to specifying these filters as strings, similar to what we allow with JSOC keywords in the sunpy.net.jsoc.attrs.
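A string-keyed filter of this kind could look roughly like the sketch below. The `Keyword` class here is a hypothetical stand-in written in plain Python, not the actual `sunpy.net.jsoc.attrs` implementation, and the column names `detector` and `wavelength` are assumed for illustration only.

```python
class Keyword:
    """Hypothetical string-keyed filter; not part of sunpy or sunpy-soar."""

    def __init__(self, name):
        self.name = name

    def _condition(self, op, value):
        # Quote strings for ADQL (doubling embedded quotes); leave numbers bare.
        if isinstance(value, str):
            value = "'" + value.replace("'", "''") + "'"
        return f"{self.name} {op} {value}"

    # Comparison operators return an ADQL condition string, not a bool.
    def __eq__(self, value):
        return self._condition("=", value)

    def __lt__(self, value):
        return self._condition("<", value)

    def __gt__(self, value):
        return self._condition(">", value)


def where_clause(*conditions):
    """Combine individual conditions into a single ADQL WHERE expression."""
    return " AND ".join(conditions)


# Column names below are assumptions, used only to show the mechanism.
where = where_clause(Keyword("detector") == "HRI_EUV", Keyword("wavelength") > 170)
print(where)
```

The appeal of this design is that no SOAR column needs a dedicated Fido attribute; the cost is that column names are not validated until the query reaches the server.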

ebuchlin commented 1 year ago

This is a generic issue meant to point out that the SOAR TAP interface offers more possibilities than the ones initially used by sunpy-soar (the details of the TAP interface were undocumented at that time). Now that we have some documentation and that queries by SOOP name have been implemented, we can be more specific about other potentially useful attributes, starting from existing sunpy.net ones:

An issue is that v_<instrument>_<ll/sc>_fits is actually multiple tables, one per instrument and per data type (low-latency or science), and that there must then be join operations with the v_<ll/sc>_data_item tables.
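A minimal sketch of such a join, built as an ADQL string, is shown below. The table-name patterns come from the text above; the join key `data_item_oid` and the column names `level` and `detector` are assumptions, and `join_query` is a hypothetical helper rather than sunpy-soar code.

```python
def join_query(instrument, conditions, low_latency=False, top=10):
    """Build an ADQL query joining the data-item table to an instrument table.

    Assumptions: both tables share a `data_item_oid` column, and the
    instrument table name uses the lowercased instrument name.
    """
    kind = "ll" if low_latency else "sc"
    data_table = f"v_{kind}_data_item"
    fits_table = f"v_{instrument.lower()}_{kind}_fits"
    query = (
        f"SELECT TOP {top} * FROM {data_table} AS h1 "
        f"JOIN {fits_table} AS h2 USING (data_item_oid)"
    )
    if conditions:
        # h1 refers to the data-item table, h2 to the instrument table.
        query += " WHERE " + " AND ".join(conditions)
    return query


adql = join_query("EUI", ["h1.level = 'L2'", "h2.detector = 'HRI_EUV'"])
print(adql)
```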

In case one would like to have access to previous versions of the files (instead of only the latest version), the v_<ll/sc>_repository_file tables would also have to be considered.

For a start, we can of course ignore previous versions of files, ignore low-latency observations, and prioritize attributes in the above list. The efforts should also be balanced with those put on access to Solar Orbiter data through VSO as data provider.

ebuchlin commented 9 months ago

For complex SOAR TAP queries, here is a tutorial on TAP queries that we did at IAS; it could provide ideas for how to do some of the queries we would like to be doable using Fido.

nabobalis commented 9 months ago

@ebuchlin we are going to add this as a GSoC project and I have a really rough draft here: https://github.com/OpenAstronomy/openastronomy.github.io/pull/350/files#diff-03a99800468bb348b3741103deee0d442348ced2997c4a20c1aa6479cd7729e9

If you had time could you review it and would you be willing to help with the project in an advisory capacity?

ebuchlin commented 9 months ago

I have added a small comment to the GSoC project, and yes I am willing to help.

nabobalis commented 9 months ago

Thanks!

MetaphorC commented 7 months ago

Hey! This issue is part of the GSoC projects. I would like to work on it, and with the organisation in general. I am new to working with open-source projects, but I will try my best to help. Where should I start?

nabobalis commented 7 months ago

Welcome and glad to hear you are interested in contributing.

We recommend that everyone starts by reading https://docs.sunpy.org/en/latest/dev_guide/contents/newcomers.html. This will walk you through getting a development environment set up. When that is complete, the next step is to start tackling some good first issues, which are linked in that guide.

Our GSoC advice is on our Wiki: https://github.com/sunpy/sunpy/wiki/Google-Summer-of-Code

If you have any questions or problems, please let us know, but we encourage all communication to happen in our public chat room: https://matrix.to/#/#sunpy:openastronomy.org

MetaphorC commented 7 months ago

Understood, I'll go through this tonight and join the chat room. Looking forward to working with everyone! Thank you!

Dhruvkumar0463 commented 7 months ago

Hi, I saw this enhancement while going through the GSoC projects. Overall, this initiative to find astronomical data is interesting. I went through your metadata and felt that we could query SOAR by many other attributes as well. As I have just completed my data analysis course, I would like to work on this project and will come up with an initial draft of the feature design within one day. Thanks, Dhruvkumar Patel

nabobalis commented 7 months ago

Hello @Dhruvkumar0463, as I said to MetaphorC above, it would be better to start by reading that guide and following the links to get set up and familiar with how we do GSoC. I will say, there are 3 days left and that is a tight turnaround.

hayesla commented 5 months ago

Following the discussion: a good place to start will be to try adding the Detector attribute for EUI. This will be a good test case for figuring out how we plan to join tables, etc.

@ebuchlin and I will think about which attributes users of Solar Orbiter would want to query over before the next meeting.

Dhruvkumar0463 commented 5 months ago

Thanks, Laura. I will look into this and get back to you.

esdcheliodevops commented 4 months ago

Hello @ebuchlin and @hayesla - I would just like to comment that if you find the current structure of the tables in SOAR TAP difficult to work with, then please do make suggestions about how we can improve it. :-)

They are currently structured with the internal relational database in mind. But of course, we are always open to creating more user-friendly views that combine data. This would help avoid complex queries with joins, which are often slow because certain columns lack indexes.

It would be great to capture this kind of feedback which would surely benefit the whole community of SOAR TAP users.

Many thanks, Jonathan Cook (I am using a shared ESDC github account we have)

ebuchlin commented 2 months ago

Hello, here is a new analysis (notebook PDF, notebook source) of what could be done with the following keywords, with the current way they are filled (even when most of them are optional keywords) by the instrument teams and/or the SOAR: