Closed Jan-Willem closed 6 days ago
Comment on questions 1.1 and 1.2.1: There is a non-imaging use case of ephemeris data: flux calibration. The extra columns listed in the CASA Ephemeris Data are used by setjy in the current CASA for this use case, to determine the flux density of a solar system object used as a primary flux calibrator. These columns were specifically requested by Brian Buttler for his implementation (https://open-bitbucket.nrao.edu/projects/CASA/repos/casa6/browse/casatasks/src/private/solar_system_setjy.py), but I think some of the data columns are not used in the code. Since these additional data will only be needed for the subset of solar system objects used for flux calibration, they can be optional. The ALMA Memo, https://library.nrao.edu/public/memos/alma/main/memo594.pdf, has a formal description of the flux calculation.
A couple of comments mainly from SD perspective.
ps['MSv4_name'].VISIBILITY.field_and_source_xds
Just in case, is it ps['MSv4_name'].SPECTRUM.field_and_source_xds for single-dish?
Is sky_dir_label an arbitrary string rather than fixed to ['ra', 'dec']? For example, in Galactic coordinates, the labels are ['l', 'b'].
For FIELD_REFERENCE_CENTER, there could be use cases that require multiple reference positions. For example, when we cannot find a good reference field with an elevation similar to that of the target field, we could use two reference positions with similar azimuth and upper/lower elevations to interpolate the reference spectra to the target position. I'm not sure it's feasible, but I remember it was discussed in the context of ALMA and/or NRO 45m, although it was never implemented.
Another use case that could not be supported by FIELD_REFERENCE_CENTER is the so-called "horizontal reference". Because an elevation difference between target and reference can degrade the calibrated spectra, ALMA sometimes tries to take reference data at the same elevation as the target field. Since this cannot be done with a fixed position in celestial coordinates (neither an absolute position nor a position relative to the target field), the reference field consequently moves with time. It seems the horizontal reference is the default for extragalactic TP observations in ALMA.
Yet another comment, which is not SD specific. Regarding FIELD_PHASE_CENTER_OFFSET etc. for the ephemeris case, are they constant over time? I'm not an expert, so I'm just asking if the assumption is OK. With an Az/El mount, the field of view rotates with time. So I'm wondering if the offset could also rotate...
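To make the rotation concern concrete, here is a minimal sketch that evaluates the standard parallactic-angle formula at a few hour angles. The site latitude and source declination are hypothetical values chosen only for illustration, and the sketch says nothing about how the offsets are actually defined in the schema; it only shows that a direction fixed in the Az/El frame would rotate on the sky as the hour angle changes.

```python
import math


def parallactic_angle(hour_angle_rad, dec_rad, lat_rad):
    """Parallactic angle in radians (standard spherical-astronomy formula)."""
    return math.atan2(
        math.sin(hour_angle_rad),
        math.tan(lat_rad) * math.cos(dec_rad)
        - math.sin(dec_rad) * math.cos(hour_angle_rad),
    )


# Hypothetical site latitude and source declination (illustration only).
lat = math.radians(-23.0)
dec = math.radians(-30.0)

for hours in (-2.0, -1.0, 0.0, 1.0, 2.0):
    ha = math.radians(15.0 * hours)  # 1 hour of hour angle = 15 degrees
    q = parallactic_angle(ha, dec, lat)
    print(f"HA = {hours:+.1f} h  parallactic angle = {math.degrees(q):+8.2f} deg")
```

If the offsets are defined in the celestial frame relative to the source, this rotation would not apply to them; that is the assumption the question is probing.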
Personally, I think FIELD_REFERENCE_CENTER is a good idea from a technical point of view because it describes the association between reference and target explicitly and robustly. But I doubt that this is logically justified, because a reference field usually specifies a "void" region that is totally unrelated to the target field.
Thank you @taktsutsumi and @tnakazato for your initial comments.
A few more comments:
1.2.2 VS_CREATE, VS_DATE, and VS_VERSION are just for information, I believe (although Measures checks for the existence of these keywords, I don't think they are used in the code), so they are not needed. VS_TYPE is used to store the type of the different Measures tables, so it is not needed either (and the field_and_source_xds_type attribute essentially provides the same info).
One keyword recently added in getephemtable is ‘ephemeris_source’, which indicates the name of the ephemeris data source, such as DE200, DE441, etc. Knowing the origin of the ephemeris data may be helpful. Note that a JPL-Horizons query will use the latest (e.g. DE440/DE441), but astropy 6.1’s default ephemeris is DE430.
MJD0, dMJD, earliest, latest - I think these are there in keywords for quick access of the data without accessing time column in the Table system. Since these can be determined from the data, I don’t think these are needed.
radii - used in flux model
meanrad - can be determined from radii
rot_per - the rotation period information may be needed to create de-rotated image?
orb_per - probably not needed?
1.5 SOURCE_RADIAL_VELOCITY's unit can be km/s (JPL-Horizons gives this value in AU/day, but it is usually converted to km/s or m/s).
1.11 SUB_OBSERVER_DIRECTION → SUB_OBSERVER_POSITION: I believe this is a position on the surface of the target, so planetodetic_location seems more appropriate.
SUB_SOLAR_POSITION: similarly, this is the lon, lat of the Sun on the target, so planetodetic_location.
I think quantity is fine for other variables and yes I think NORTH_POLE_POSITION_ANGLE and NORTH_POLE_ANGULAR_DISTANCE can be in a single data variable.
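For the unit point in 1.5 above, a minimal sketch of the AU/day to km/s conversion follows. The function name is hypothetical; the only inputs are the IAU 2012 definition of the astronomical unit (exactly 149 597 870 700 m) and a day of 86 400 s.

```python
AU_M = 149_597_870_700.0  # metres per AU (exact, IAU 2012 definition)
DAY_S = 86_400.0          # seconds per day

# Conversion factor: (m per AU) / (s per day) / (m per km)
AU_PER_DAY_TO_KM_PER_S = AU_M / DAY_S / 1000.0


def au_per_day_to_km_per_s(velocity_au_per_day):
    """Convert a radial velocity from AU/day to km/s (hypothetical helper)."""
    return velocity_au_per_day * AU_PER_DAY_TO_KM_PER_S


# 1 AU/day is roughly 1731.46 km/s.
print(f"{au_per_day_to_km_per_s(1.0):.4f} km/s")
```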
I'll start by asking why we are having this table. If I understand correctly, the MSv4 contains one field and one spw. So this table is not necessary for single-field observations or pointed mosaics, as each point in the mosaic will be in its own dataset. So phase_centre and delay_centre etc. can be metadata or coordinates of
For OTF or near-field observations (i.e. the correlator is continuously resetting the phase center at a rate that is at most the integration time), this xds will have many rows (as many as the number of integrations). Should it go into the main xds as columns?
About the setjy use case: for a calibrator xds, shouldn't this table have a concept of a model, i.e. source shape, flux dependence on frequency, etc.? Right now this is encoded in the setjy code. The MODEL column of SOURCE in MS v2 was supposed to be that, but it was never really used except for the virtual model column.
===
Some doubts or mistakes found.
Description of delay_center and phase_center are flipped.
What is DOPPLER_SHIFT_VELOCITY for? Are we Doppler tracking (i.e. removing the observatory velocity w.r.t. the frame of the spw definition) when this value is assigned? What is its relationship with LINE_SYSTEMIC_VELOCITY?
Isn't systemic velocity a source-based parameter (just like SOURCE_PROPER_MOTION), i.e. the global radial velocity of the source w.r.t. LSRK, for example? It is not clear what a systemic velocity for every line means.
In the schema, I have added a time axis to FIELD_PHASE/DELAY_CENTER to support OTF.
@taktsutsumi questions and comments:
@kgolap questions and comments:
“I'll start by asking why we are having this table. If I understand correctly, the MSv4 contains one field and one spw. So this table is not necessary for single-field observations or pointed mosaics, as each point in the mosaic will be in its own dataset. So phase_centre and delay_centre etc. can be metadata or coordinates of
For OTF or near-field observations (i.e. the correlator is continuously resetting the phase center at a rate that is at most the integration time), this xds will have many rows (as many as the number of integrations). Should it go into the main xds as columns?
About the setjy use case: for a calibrator xds, shouldn't this table have a concept of a model, i.e. source shape, flux dependence on frequency, etc.? Right now this is encoded in the setjy code. The MODEL column of SOURCE in MS v2 was supposed to be that, but it was never really used except for the virtual model column.”
“Some doubts or mistakes found”:
“Description of delay_center and phase_center are flipped:” Yes, thank you for finding that.
“What is DOPPLER_SHIFT_VELOCITY for? Are we Doppler tracking (i.e. removing the observatory velocity w.r.t. the frame of the spw definition) when this value is assigned? What is its relationship with LINE_SYSTEMIC_VELOCITY? Isn't systemic velocity a source-based parameter (just like SOURCE_PROPER_MOTION), i.e. the global radial velocity of the source w.r.t. LSRK, for example? It is not clear what a systemic velocity for every line means.”
Review Instructions
These instructions are repeated in the review_field_and_source_xds.ipynb that can be found in xradio/reviews. The notebook includes a demo of an ALMA mosaic ephemeris observation of the sun that should be used for the review.
Please review the MSv4 field_and_source_xds schema and the XRADIO interface (ps['MSv4_name'].VISIBILITY.field_and_source_xds). The PS (processing set) interface or the main_xds should not be reviewed.
The field_and_source_xds schema specification: https://docs.google.com/spreadsheets/d/14a6qMap9M5r_vjpLnaBKxsR9TF4azN5LVdOxLacOX-s/edit#gid=1658760192
Preparatory Material
Go over Xarray nomenclature and selection syntax:
MSv2 and CASA documentation:
field_and_source_xds Schema
The FIELD, SOURCE, and EPHEMERIS tables in the MSv2 contain closely related information:
These can be combined into a single dataset for MSv4 because it consists of a single field and consequently a single source[^1].
Use Cases
The use cases considered during the design of the schema were:
To satisfy these use cases, two types of field_and_source_xds were created: standard and ephemeris. The main difference is that the ephemeris type has a FIELD_PHASE_OFFSET data variable that is relative to the SOURCE_POSITION/SOURCE_DIRECTION data variable (contains the ephemerides and has a time axis), while the standard type has FIELD_PHASE/DELAY/REFERENCE_CENTERS and SOURCE_POSITION (has no time axis). The SOURCE_POSITION/DIRECTION is kept separate from the FIELD_PHASE_OFFSET/CENTER so that the intent OBSERVE_TARGET#OFF_SOURCE is supported and the ephemeris can be easily changed.
Key Questions to Answer
Schema Questions
1.1) Are there missing use cases?
1.2) Is all the information present needed for offline processing?
VS_CREATE, VS_DATE, VS_TYPE, VS_VERSION, MJD0, dMJD, earliest, latest, radii, meanrad, orb_per, rot_per. Do we need any of these?
1.3) Is there a use case where the FIELD_PHASE_CENTER and FIELD_DELAY_CENTER would differ (i.e., do we need to store both)?
1.4) For interferometer observations, do we need to store the FIELD_REFERENCE_CENTER or can this be omitted (it will still be present for single dish)?
1.5) The ephemeris data is recorded in degrees, AU, and MJD. Should these be converted to radians, meters, and time (Unix)? Note that each data variable has measurement information attached to it. For example:
1.6) For ephemeris observations, should we add the SOURCE_PROPER_MOTION if available?
1.7) Is the name field_and_source_xds sufficiently descriptive?
1.8) Should we also add the DOPPLER table information to the schema (if so, any idea where we can get an MSv2 with a DOPPLER table)?
1.9) Any naming suggestions or data layout?
1.10) Are the data variable descriptions in the schema spreadsheet correct?
1.11) What measures (https://docs.google.com/spreadsheets/d/14a6qMap9M5r_vjpLnaBKxsR9TF4azN5LVdOxLacOX-s/edit#gid=1504318014) should we attach to each of the following data variables
1.12) Can NORTH_POLE_POSITION_ANGLE and NORTH_POLE_ANGULAR_DISTANCE be combined into a single data variable?
1.13) Should we have a specialized schema (type) for Single Dish where we exclude FIELD_PHASE_CENTER and FIELD_DELAY_CENTER?
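For question 1.5, the conversions under discussion could be sketched as below. The helper names are hypothetical and not part of XRADIO. The MJD conversion assumes the ephemeris times are in UTC and ignores leap seconds (as Unix time does); if the table uses another time scale, an extra correction would be needed.

```python
import math

AU_M = 149_597_870_700.0   # metres per AU (exact, IAU 2012 definition)
DAY_S = 86_400.0           # seconds per day
MJD_UNIX_EPOCH = 40_587.0  # MJD of 1970-01-01T00:00:00


def deg_to_rad(angle_deg):
    """Degrees to radians."""
    return math.radians(angle_deg)


def au_to_m(distance_au):
    """Astronomical units to metres."""
    return distance_au * AU_M


def mjd_to_unix(mjd):
    """MJD to Unix seconds (assumes UTC; leap seconds ignored)."""
    return (mjd - MJD_UNIX_EPOCH) * DAY_S


# Example row with made-up values, for illustration only.
ra_deg, distance_au, mjd = 180.0, 1.0, 51544.5
print(deg_to_rad(ra_deg), au_to_m(distance_au), mjd_to_unix(mjd))
```

Storing the converted values would keep all data variables in SI-style units, at the cost of no longer matching the numbers printed by the ephemeris source.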
XRADIO
2.1) After reviewing the XARRAY documentation and the descriptions of the data variables in the field_and_source_xds schema, do you find the XARRAY interface intuitive and easy to use?
[^1]: This is inherited from MSv2, which only allows a single source per field [https://casacore.github.io/casacore-notes/229.pdf, p35], though a source can appear in more than one field.
Environment instructions
It is recommended to use the conda environment manager to create a clean, self-contained runtime where xradio and all its dependencies can be installed:
Clone the repository, checkout the review branch and do a local install: