Closed astrofrog closed 2 years ago
tried, got:
Traceback (most recent call last):
File "/blue/adamginsburg/adamginsburg/repos/glue/glue/main.py", line 190, in start_glue
datasets = load_data_files(datafiles)
File "/blue/adamginsburg/adamginsburg/repos/glue/glue/main.py", line 105, in load_data_files
datasets.append(load_data(df))
File "/blue/adamginsburg/adamginsburg/repos/glue/glue/core/data_factories/helpers.py", line 266, in load_data
d = as_list(factory(path, **kwargs))
File "/blue/adamginsburg/adamginsburg/repos/casa-formats-io/casa_formats_io/glue_factory.py", line 78, in read_casa_table
datasets.append(table_to_glue_data(table, label=label_prefix + f' [DATA_DESC_ID={data_desc_id}]'))
File "/blue/adamginsburg/adamginsburg/repos/casa-formats-io/casa_formats_io/glue_factory.py", line 35, in table_to_glue_data
raise ValueError(f"Table {colname} is too wide to expand into single columns: {table[colname].shape}")
ValueError: Table FLAG is too wide to expand into single columns: (9150, 2048, 2)
I'll try w/other subtables
So this looks like the column is (time, frequency, polarization). Perhaps it makes more sense to load something like this as an ndcube data set?
(I was able to load subtables fine, but they're uninteresting)
Ok interesting yes indeed I think we need a different approach. Could you share the ms file with me?
MS shared.
Frequencies are defined in the SPECTRAL_WINDOW
table, accessible through casa's tb.open:
CASA <16>: tb.getcoldesc('CHAN_FREQ')
Out[16]:
{'comment': 'Center frequencies for each channel in the data matrix',
'dataManagerGroup': 'StandardStMan',
'dataManagerType': 'StandardStMan',
'keywords': {'MEASINFO': {'TabRefCodes': array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 64], dtype=uint64),
'TabRefTypes': array(['REST', 'LSRK', 'LSRD', 'BARY', 'GEO', 'TOPO', 'GALACTO', 'LGROUP',
'CMB', 'Undefined'], dtype='<U16'),
'VarRefCol': 'MEAS_FREQ_REF',
'type': 'frequency'},
'QuantumUnits': array(['Hz'], dtype='<U16')},
'maxlen': 0,
'ndim': 1,
'option': 0,
'valueType': 'double'}
that tells the metadata
the different frequencies are in columns:
CASA <20>: tb.getcol("CHAN_FREQ", nrow=1).shape
Out[20]: (2048, 1)
CASA <21>: tb.getcol("CHAN_FREQ", nrow=1, startrow=1).shape
Out[21]: (1920, 1)
CASA <22>: tb.getcol("CHAN_FREQ", nrow=1, startrow=2).shape
Out[22]: (1920, 1)
and the frame is given in MEAS_FREQ_REF
For reading the polarization codes in file.ms/POLARIZATION
in the CORR_TYPE
column, this should be the right conversion:
This is translating from AIPS memo 117 (http://www.aips.nrao.edu/aipsmemo.html) explained in pyuvdata (https://github.com/RadioAstronomySoftwareGroup/pyuvdata/blob/406b88eb91f511f6a5d50ded29e7615c3df4f2a1/docs/references/beamfits_memo.tex#L89) and translated to the MS format (https://github.com/RadioAstronomySoftwareGroup/pyuvdata/blob/b382be644848abe315df035d1cec23f694e65427/pyuvdata/uvdata/ms.py#L37).
From a VLA L-band observation with full Stokes in all spectral windows:
tb.open('table.ms/POLARIZATION')
CASA <45>: tb.getcol("CORR_TYPE", nrow=1)
Out[45]:
array([[5],
[6],
[7],
[8]])
CASA <46>: tb.getcol("CORR_TYPE", nrow=1).shape
Out[46]: (4, 1)
And just a note from my side - here's an example of column shapes:
UVW (724500, 3)
FLAG (724500, 12, 2)
WEIGHT (724500, 2)
SIGMA (724500, 2)
ANTENNA1 (724500,)
ANTENNA2 (724500,)
ARRAY_ID (724500,)
DATA_DESC_ID (724500,)
EXPOSURE (724500,)
FEED1 (724500,)
FEED2 (724500,)
FIELD_ID (724500,)
FLAG_ROW (724500,)
INTERVAL (724500,)
OBSERVATION_ID (724500,)
PROCESSOR_ID (724500,)
SCAN_NUMBER (724500,)
STATE_ID (724500,)
TIME (724500,)
TIME_CENTROID (724500,)
DATA (724500, 12, 2)
MODEL_DATA (724500, 12, 2)
CORRECTED_DATA (724500, 12, 2)
``
Until glue supports columns with different shapes, we could broadcast all arrays to the 3D shape of DATA - with the exception of UVW which we could split into x/y/z components.
I need to check if the ,2 in the dimension of SIGMA is the same as in DATA.
It might also make sense to split out polarization into separate columns and keep (time, frequency) as 2D images, because otherwise the cubes will be very skinny along the polarization dimensions and not sure how useful that is.
@keflavich - can you try this latest version?
Merging #33 (eb32a47) into main (95403b8) will increase coverage by
0.83%
. The diff coverage is69.15%
.
@@ Coverage Diff @@
## main #33 +/- ##
==========================================
+ Coverage 55.66% 56.50% +0.83%
==========================================
Files 17 18 +1
Lines 2118 2230 +112
==========================================
+ Hits 1179 1260 +81
- Misses 939 970 +31
Impacted Files | Coverage Δ | |
---|---|---|
casa_formats_io/glue_factory.py | 65.21% <65.21%> (ø) |
|
casa_formats_io/casa_low_level_io/table.py | 94.66% <93.33%> (+0.63%) |
:arrow_up: |
casa_formats_io/casa_dask.py | 95.38% <0.00%> (+0.26%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 95403b8...eb32a47. Read the comment docs.
Looks good. Some questions:
I got this when trying to load the Table Viewer
Traceback (most recent call last):
File "/blue/adamginsburg/adamginsburg/repos/glue/glue/utils/qt/decorators.py", line 61, in decorated
return f(*args, **kwargs)
File "/blue/adamginsburg/adamginsburg/repos/glue/glue/viewers/common/qt/data_viewer.py", line 217, in add_data
return super(DataViewer, self).add_data(data)
File "/blue/adamginsburg/adamginsburg/repos/glue/glue/viewers/common/viewer.py", line 207, in add_data
layer = get_layer_artist_from_registry(data, self) or self.get_data_layer_artist(data)
File "/blue/adamginsburg/adamginsburg/repos/glue/glue/viewers/common/viewer.py", line 238, in get_data_layer_artist
return self.get_layer_artist(self._data_artist_cls, layer=layer, layer_state=layer_state)
File "/blue/adamginsburg/adamginsburg/repos/glue/glue/viewers/table/qt/data_viewer.py", line 338, in get_layer_artist
return cls(self, self.state, layer=layer, layer_state=layer_state)
File "/blue/adamginsburg/adamginsburg/repos/glue/glue/viewers/table/qt/data_viewer.py", line 179, in __init__
super(TableLayerArtist, self).__init__(viewer_state,
File "/blue/adamginsburg/adamginsburg/repos/glue/glue/viewers/common/layer_artist.py", line 29, in __init__
self._viewer_state.layers.append(self.state)
File "/home/adamginsburg/.local/lib/python3.9/site-packages/echo/containers.py", line 52, in append
self.notify_all()
File "/home/adamginsburg/.local/lib/python3.9/site-packages/echo/containers.py", line 45, in notify_all
callback(*args, **kwargs)
File "/home/adamginsburg/.local/lib/python3.9/site-packages/echo/containers.py", line 166, in __call__
self.function(*args, **kwargs)
File "/home/adamginsburg/.local/lib/python3.9/site-packages/echo/containers.py", line 189, in callback
self.notify(instance, wrapped_list, wrapped_list)
File "/home/adamginsburg/.local/lib/python3.9/site-packages/echo/core.py", line 125, in notify
cback(new)
File "/blue/adamginsburg/adamginsburg/repos/glue/glue/viewers/table/qt/data_viewer.py", line 298, in _on_layers_changed
self.model = DataTableModel(self)
File "/blue/adamginsburg/adamginsburg/repos/glue/glue/viewers/table/qt/data_viewer.py", line 40, in __init__
raise ValueError("Can only use Table widget for 1D data")
ValueError: Can only use Table widget for 1D data
Example: can show that the high values all come from autocorrelations:
(note that the high stuff around 17,050 is selected but I can't see it highlighted in red - looks like a visual error in the 1D profile selection. You can see the selection in both the 2D planes though)
I think we should merge this and deal with additional issues in future PRs
Sounds good, let's merge, and I'll open separate issues for the outstanding things.
I think it would make sense to have this live here since it integrates with the low-level API for the I/O. To test out, install and try e.g.
glue blah.ms
orglue blah.ms/SUBTABLE
This needs https://github.com/glue-viz/glue/pull/2249 to work well.