IQSS / dataverse

Open source research data repository software
http://dataverse.org

RA/DEC in Astronomy and Astrophysics Metadata #3526

Open jggautier opened 7 years ago

jggautier commented 7 years ago

This issue tracks work on several concerns raised in the Dataverse Users Community about the types of metadata being captured for FITS files. https://groups.google.com/forum/#!msg/dataverse-community/Zo-v4T5SW-Q/npNowXh_DwAJ

jggautier commented 7 years ago

I emailed the developer who raised the issue for clarification, and emailed some local domain experts for their help.

jggautier commented 7 years ago

Related to issues #587 and #615.

asconrad commented 7 years ago

We have taken a closer look at this issue from the point of view of the Kepler case discussed in https://groups.google.com/forum/#!msg/dataverse-community/Zo-v4T5SW-Q/npNowXh_DwAJ.

What we specifically want is to allow a user of Dataverse to discover datasets based on user-defined coverage search criteria. In our understanding, this would require RA and DEC to be defined as numeric fields, so that areas around a specified location can be calculated.

Sky Coverage, which is used in Dataverse, is on the other hand defined as a string, allowing the curator to describe the area of the sky that a certain observation covers. This, however, will not allow our users to create the search criteria we are interested in.

The issue, if I understand it correctly, is that the IVOA recommendation behind the astronomy metadata in Dataverse addresses the description of resources, leaving discovery to "lower levels". Specifically, I understand that this recommendation addresses the construction of a registry of resources, e.g. services, in the VO architecture, whereas we in fact understand Dataverse to be such a service.

The ObsCore DM (http://ivoa.net/documents/ObsCore/20161004/PR-ObsCore-v1.1-20161004.pdf) is an IVOA specification for the TAP protocol, including a query service. Here we find the fields we would like, s_ra and s_dec, both expressed as doubles.

I am not sure how that would fit with Dataverse, or what has been the deeper reason for choosing a registry specification to be used here. We would be interested in any comments or suggestions that would help discovery based on user-defined areas of the sky.
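To make the kind of query we have in mind concrete, here is a minimal sketch of a cone search expressed in ADQL, the query language used with TAP. It assumes a TAP service that exposes the standard `ivoa.obscore` table; the helper name `obscore_cone_query` is hypothetical, and nothing here is part of Dataverse today.

```python
def obscore_cone_query(ra_deg, dec_deg, radius_deg):
    """Build an ADQL cone search over the ObsCore s_ra / s_dec columns.

    Hypothetical helper: all three arguments are assumed to be in degrees.
    """
    ra = float(ra_deg)
    dec = float(dec_deg)
    radius = float(radius_deg)
    if not -90.0 <= dec <= 90.0:
        raise ValueError("declination must be within [-90, 90] degrees")
    if radius <= 0.0:
        raise ValueError("radius must be positive")
    # CONTAINS, POINT and CIRCLE are ADQL geometry functions; CONTAINS
    # returns 1 when the point lies inside the circle.
    return (
        "SELECT obs_id, s_ra, s_dec FROM ivoa.obscore "
        "WHERE CONTAINS(POINT('ICRS', s_ra, s_dec), "
        f"CIRCLE('ICRS', {ra!r}, {dec!r}, {radius!r})) = 1"
    )


query = obscore_cone_query(291.2, 36.9, 0.5)
print(query)
```

The point is only that a numeric s_ra / s_dec pair makes this kind of geometric predicate possible, which a free-text Sky Coverage string does not.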

pdurbin commented 7 years ago

@asconrad thanks for your comment! For non-astronomers like myself, "RA" means "Right Ascension" and "DEC" means "Declination", which I found at https://en.wikipedia.org/wiki/Equatorial_coordinate_system

https://github.com/IQSS/dataverse/blob/v4.6/scripts/api/data/metadatablocks/astrophysics.tsv is where the current astronomy metadata block is defined, but there's a more human readable version at https://docs.google.com/spreadsheet/ccc?key=0AjeLxEN77UZodHFEWGpoa19ia3pldEFyVFR0aFVGa0E#gid=3 which is linked from http://guides.dataverse.org/en/4.6/user/appendix.html . As @asconrad has observed, "Sky Coverage", which is the field coverage.Spatial, is a text field with RA and DEC put together in a single value. Anders worded this as follows:

We are a little puzzled that these important fields don’t seem to be included in the Astronomy and Astrophysics Metadata. When trying to import a FITS file into Dataverse, RA and DEC are put in parentheses in the string coverage.Spatial field, as in the example:

    {
        "typeName": "coverage.Spatial",
        "multiple": true,
        "typeClass": "primitive",
        "value": ["(291.22395900000004 36.88712)"]
    }

Even though Sky Coverage is only a text field, I wonder if we could change it so that it at least appears on the Advanced Search page (this column is set to false in the definition of the astronomy metadata block linked above). As Anders points out, RA and DEC should be stored separately. #370 is an issue I opened a while back about better searching of numeric ranges from within Dataverse.
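As a rough illustration of what "stored separately" could mean for values that already exist, here is a minimal sketch that splits the "(RA DEC)" string from the example above into two numbers. The function name `parse_sky_coverage` is hypothetical, and it assumes both values are decimal degrees separated by whitespace, which matches the example but is not guaranteed for every coverage.Spatial value a curator might type.

```python
import re

# Matches "(RA DEC)" with two plain decimal numbers, as in the example
# produced by the FITS import. Other notations are rejected.
_COVERAGE_RE = re.compile(
    r"^\(\s*([-+]?\d+(?:\.\d+)?)\s+([-+]?\d+(?:\.\d+)?)\s*\)$"
)


def parse_sky_coverage(value):
    """Return (ra, dec) as floats, assuming decimal degrees."""
    match = _COVERAGE_RE.match(value.strip())
    if match is None:
        raise ValueError(f"not a '(RA DEC)' pair: {value!r}")
    ra = float(match.group(1))
    dec = float(match.group(2))
    if not (0.0 <= ra < 360.0 and -90.0 <= dec <= 90.0):
        raise ValueError(f"coordinates out of range: {value!r}")
    return ra, dec


field = {
    "typeName": "coverage.Spatial",
    "multiple": True,
    "typeClass": "primitive",
    "value": ["(291.22395900000004 36.88712)"],
}
positions = [parse_sky_coverage(v) for v in field["value"]]
print(positions)
```

Anything that does not look like a simple pair raises an error rather than being guessed at, since Sky Coverage is free text.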

asconrad commented 7 years ago

@pdurbin, thank you, yes I think you got me right (I am also not an astronomer ;-)). RA and DEC are celestial coordinates (like geocodes); together they define a position in the sky. And yes, in order to allow a future researcher to identify datasets for a certain area of the sky, we would need to be able to do range searches on these two values.
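A minimal sketch of the kind of search this would enable, assuming RA and DEC are available as separate numbers in degrees: it computes the angular separation with the haversine formula and keeps the records within a given radius. The record layout (`id`, `ra`, `dec`) and the function names are hypothetical, not Dataverse fields.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(records, ra, dec, radius_deg):
    """Return the records whose position lies within radius_deg of (ra, dec)."""
    return [
        r for r in records
        if angular_separation_deg(r["ra"], r["dec"], ra, dec) <= radius_deg
    ]


# Illustrative records only; the first reuses the position from the example.
records = [
    {"id": "kepler-example", "ra": 291.22395900000004, "dec": 36.88712},
    {"id": "far-away-example", "ra": 10.0, "dec": -45.0},
    {"id": "near-ra-zero-example", "ra": 359.9, "dec": 0.0},
]

hits = cone_search(records, ra=291.0, dec=37.0, radius_deg=1.0)
print([r["id"] for r in hits])
```

Using the separation rather than two independent numeric ranges avoids special handling where RA wraps from 360 back to 0, although a plain range query on each value may still be the simpler thing to index.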