ivoa-std / ObsCoreExtensionForRadioData

ObsCore model extension for radio data
Creative Commons Attribution Share Alike 4.0 International

dataproduct_type for single dish data (from Alessandra, Vincenzo and Marco) #8

Open Bonnarel opened 1 year ago

Bonnarel commented 1 year ago

From Alessandra Zanichelli, Vincenzo Galluzi, Marco Molinaro

The current values for dataproduct_type, as listed in the preliminary Data Product Type vocabulary document, do not seem suitable for describing single dish observational products in a way that allows efficient data discovery. Following the same “parent / narrower term” classification, we propose the value “sdradio”, to be used either 1) as a parent term for any type of single dish data, or 2) as a parent term associated with a set of more specific, narrower terms that identify more precisely the data products coming from the possible observing modes. The value “sdradio” identifies the electromagnetic domain of the data product. We would prefer not to use the more generic “singledish”, which relates to the instrument rather than to the physical observable (also, single dish instruments are not used only in the radio domain).

The optional, free-text dataproduct_subtype parameter could be used for a more detailed description of the data content. A better solution could be to use the sky_scan_mode parameter proposed in Table 1 of IVOA Obscore Extension for Radio data Version 1.0 (IVOA Note 2022-10-14). This parameter has the advantage of using a predefined vocabulary, thus avoiding free text.

The following figure shows the main single dish observing mode:

[Figure: Scan]

The following table summarizes the possible values for dataproduct_type and narrower/parent terms associations.

[Table: screenshot of 2023-02-08 16-50-17]

(*) Single dish radio maps cannot be considered “image” data products. Data are typically written in one or more tables, each row containing coordinate positions, a timestamp and the raw intensity (raw counts), and further processing is required to obtain a proper image. Also, in the more general case, data are not acquired on a regular 2D grid in a single map: typical observations consist of more than one map, to be combined to recover the final image. Maps can be obtained in spectropolarimetric mode, so the most appropriate parent term seems to be “cube”.

(**) In principle, the crosscan can be executed in raster mode instead of on-the-fly. For this reason the narrower term has been left more generic, and the specific description is delegated to dataproduct_subtype.

Note that INAF has no Phased Array Feed (PAF) receivers on its radio telescopes, so we are not taking into account cases specific to beamforming techniques. More values could therefore be needed. Do we have any PAF expert in the RadioIG?

This approach has some advantages: the narrower terms are in principle usable in other spectral domains too, associated with appropriate parent values/dataproduct_subtype. A query can work at two levels: a generic query on “sdradio” returns all the data products associated with any narrower term; alternatively, a more detailed query can be made directly on one of the narrower terms. We are aware that this proposal is somewhat different from the general VO approach, because it is strongly tied to a particular instrument/telescope design. However, we are motivated by the need to make single dish data discoverable in an effective manner, which could hardly be achieved with the current ObsCore dataproduct_type values.
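A minimal sketch of the two-level query described above, written as plain Python over in-memory records. Only “sdradio” comes from the proposal; the narrower term names (“sd-map”, “sd-crosscan”, “sd-onoff”) are hypothetical placeholders, since the table of terms is not reproduced here, and the helper names `expand` and `select` are illustrative only.

```python
from typing import Dict, List, Optional, Set

# Hypothetical vocabulary: term -> parent term (None for a top-level term).
# Only "sdradio" is taken from the proposal; the narrower terms are placeholders.
PARENT: Dict[str, Optional[str]] = {
    "sdradio": None,
    "sd-map": "sdradio",
    "sd-crosscan": "sdradio",
    "sd-onoff": "sdradio",
}


def expand(term: str, parent: Dict[str, Optional[str]] = PARENT) -> Set[str]:
    """Return the term together with all of its (transitively) narrower terms."""
    result = {term}
    changed = True
    while changed:
        changed = False
        for child, par in parent.items():
            if par in result and child not in result:
                result.add(child)
                changed = True
    return result


def select(records: List[dict], term: str) -> List[dict]:
    """Keep the records whose dataproduct_type is the term or one of its narrower terms."""
    wanted = expand(term)
    return [r for r in records if r["dataproduct_type"] in wanted]


# Hypothetical records for illustration.
RECORDS = [
    {"obs_id": "a", "dataproduct_type": "sd-map"},
    {"obs_id": "b", "dataproduct_type": "sd-crosscan"},
    {"obs_id": "c", "dataproduct_type": "image"},
]

generic = select(RECORDS, "sdradio")   # generic query on the parent term
detailed = select(RECORDS, "sd-map")   # detailed query on one narrower term
```

Here the generic query selects records "a" and "b", while the detailed query selects only "a". A real service would presumably do the expansion on the server side or in the client, which is part of what needs discussing.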

Bonnarel commented 1 year ago

From Alessandra Zanichelli, Vincenzo Galluzi, Marco Molinaro

The current values for dataproduct_type, as listed in the preliminary Data Product Type vocabulary document, do not seem suitable for describing single dish observational products in a way that allows efficient data discovery. Following the same “parent / narrower term” classification, we propose the value “sdradio”, to be used either 1) as a parent term for any type of single dish data, or 2) as a parent term associated with a set of more specific, narrower terms that identify more precisely the data products coming from the possible observing modes. The value “sdradio” identifies the electromagnetic domain of the data product. We would prefer not to use the more generic “singledish”, which relates to the instrument rather than to the physical observable (also, single dish instruments are not used only in the radio domain).

From François Bonnarel: I think you are right that the dataproduct_type of single dish data has to be discussed. However, I don't think creating a specific data product type for radio single dish data is consistent with the current concept of dataproduct_type, which is more about the dataset structure with respect to the nature of its axes (because that is what matters for the tools that display, render or analyse the data). So I think that, in line with the discussion that follows, things like "cube", "spectra", "timeseries" or "dynamic spectra" should be OK. We are still looking for a term for spectropolarimetry products. My feeling is that what we need in order to better describe single dish datasets is new parameters, such as observing type, observing modes, scan modes or similar.

Bonnarel commented 1 year ago

From Baptiste Cecconi: I agree with François. The dataproduct_type is about the organization of the data dimensions, its axes, etc., not the way the data were recorded, nor the spectral range. Mixing "radio domain", "instrument type" and "data dimensionality" will make it difficult to separate the various semantic components.

Bonnarel commented 1 year ago

The optional, free-text dataproduct_subtype parameter could be used for a more detailed description of the data content. A better solution could be to use the sky_scan_mode parameter proposed in Table 1 of IVOA Obscore Extension for Radio data Version 1.0 (IVOA Note 2022-10-14). This parameter has the advantage of using a predefined vocabulary, thus avoiding free text.

From François Bonnarel: yes, dataproduct_subtype is free text. However, as written above, we probably have to introduce new parameters/terms in this extension.

Bonnarel commented 1 year ago

From Mireille Louys: about scan mode and other information on the way the data were obtained: these metadata belong to the observing configuration applied in the instrument to obtain the data. They form a category of their own. The same need has been identified for high energy data, and it is worth describing those parameters separately from the data product type. This is no longer a core property in terms of data discovery, but it is very useful to radio astronomers.

Bonnarel commented 1 year ago

From Baptiste Cecconi: There are already 3 types of pointing listed in the "ObsLocTAP" standard, in a "tracking_type" keyword. The current values are: "tracking", "solar-system-object-tracking", "fixed-az-el-transit". This seems to call for an external list of terms, which would be maintained with the Semantics WG.
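To illustrate what a predefined list buys us over free text, here is a minimal sketch that checks a value against a controlled list. The three values are the tracking_type terms quoted above; the function name `check_term` and the normalization (trimming and lower-casing) are assumptions made for the example, and a real service would presumably load the list from the vocabulary maintained with the Semantics WG rather than hard-code it.

```python
from typing import FrozenSet

# tracking_type values as quoted in this thread for ObsLocTAP.
TRACKING_TYPES: FrozenSet[str] = frozenset(
    {"tracking", "solar-system-object-tracking", "fixed-az-el-transit"}
)


def check_term(value: str, vocabulary: FrozenSet[str] = TRACKING_TYPES) -> str:
    """Return the normalized term, or raise ValueError if it is not in the list.

    Hypothetical helper: trimming and lower-casing are assumptions of this sketch.
    """
    term = value.strip().lower()
    if term not in vocabulary:
        allowed = ", ".join(sorted(vocabulary))
        raise ValueError(f"unknown term {value!r}; expected one of: {allowed}")
    return term
```

With free text, a misspelled or locally invented value would simply be stored and then never matched by a query; with a controlled list it can be rejected at ingestion time.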

Bonnarel commented 1 year ago

About the sdradio term (see table in issue) François Bonnarel:

I would say this is more something like an "observation_type". "sdradio" would differ from "interferometry".

Bonnarel commented 1 year ago

(*) Single dish radio maps cannot be considered “image” data products. Data are typically written in one or more tables, each row containing coordinate positions, a timestamp and the raw intensity (raw counts), and further processing is required to obtain a proper image. Also, in the more general case, data are not acquired on a regular 2D grid in a single map: typical observations consist of more than one map, to be combined to recover the final image. Maps can be obtained in spectropolarimetric mode, so the most appropriate parent term seems to be “cube”.

From François Bonnarel: sure, this looks like a (sparse) cube.

Bonnarel commented 1 year ago

Note that INAF has no Phased Array Feed (PAF) receivers on its radio telescopes, so we are not taking into account cases specific to beamforming techniques. More values could therefore be needed. Do we have any PAF expert in the RadioIG?

From François Bonnarel:

I think the LOFAR and NenuFAR people do have this. Yan? Baptiste? Alan?

Bonnarel commented 1 year ago

From Baptiste Cecconi:

About phased arrays: yes, I think Alan and I can give some input.

Bonnarel commented 1 year ago

This approach has some advantages: the narrower terms are in principle usable in other spectral domains too, associated with appropriate parent values/dataproduct_subtype. A query can work at two levels: a generic query on “sdradio” returns all the data products associated with any narrower term; alternatively, a more detailed query can be made directly on one of the narrower terms.

From François Bonnarel:

If we consider that all this is done by one or more new parameters describing the observation (and not the product type), do we prefer several parameters or one single parameter with a hierarchy of terms?
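To make the two options concrete, here is a small sketch under stated assumptions. In both cases dataproduct_type stays structural ("cube"). The names observation_type and sky_scan_mode are taken from this discussion, but their use as separate columns, the single observing_mode parameter with a "/"-separated path, and the value "raster-map" are all hypothetical choices made for the example.

```python
from typing import Dict, Optional

Record = Dict[str, str]

# Option 1: several parameters, one per aspect of the observation.
rec_separate: Record = {
    "dataproduct_type": "cube",
    "observation_type": "sdradio",
    "sky_scan_mode": "raster-map",   # hypothetical value
}

# Option 2: one parameter holding a path through a hierarchy of terms.
rec_hierarchy: Record = {
    "dataproduct_type": "cube",
    "observing_mode": "sdradio/raster-map",   # hypothetical encoding
}


def match_separate(rec: Record, observation_type: str,
                   scan_mode: Optional[str] = None) -> bool:
    """Match on observation_type and, if given, on sky_scan_mode as well."""
    if rec.get("observation_type") != observation_type:
        return False
    return scan_mode is None or rec.get("sky_scan_mode") == scan_mode


def match_hierarchy(rec: Record, term: str) -> bool:
    """Match if the term appears at any level of the observing_mode path."""
    return term in rec.get("observing_mode", "").split("/")
```

Both options answer the generic query (`match_separate(rec_separate, "sdradio")`, `match_hierarchy(rec_hierarchy, "sdradio")`) and the detailed one. The difference is that with separate parameters the client has to know which column a term belongs to, while with a hierarchy it has to know how the terms nest.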

Bonnarel commented 1 year ago

We are aware that this proposal is somewhat different from the general VO approach, because it is strongly tied to a particular instrument/telescope design. However, we are motivated by the need to make single dish data discoverable in an effective manner, which could hardly be achieved with the current ObsCore dataproduct_type values.

From François Bonnarel: in other words, we have to distinguish the description of the observation (which is something like provenance) from the type of the data, which matters for how the data are used. So really, again, I think we need one or more new parameters in the extension.

Bonnarel commented 1 year ago

From Mireille Louys: I suggest having a dedicated extension for observing configuration. This would also fit other domains such as X-rays, high energy, etc.

molinaro-m commented 1 year ago

That's all fine for me. Please just keep in mind that we have never gone through proper registry filtering based on extensions. This has to be brought up at some point.