Open miguelcarcamov opened 3 years ago
Hi @miguelcarcamov, I think it depends on how far you want to take it.
You could set up your datasets with the following:
datasets = xds_from_ms(ms, group_cols=["FIELD_ID", "DATA_DESC_ID", "ANTENNA1", "ANTENNA2"])
This will create a unique dataset per combination of FIELD_ID, DATA_DESC_ID and BASELINE. Unfortunately, Measurement Sets are frequently monotonically ordered in TIME, rather than ANTENNA1, ANTENNA2, so the resulting datasets will be backed by reads of non-contiguous rows, which results in inefficient disk read patterns. But it should be fairly easy to calculate the max baseline length per dataset as follows:
import dask
import dask.array as da
from daskms import xds_from_ms

datasets = xds_from_ms(ms, group_cols=["FIELD_ID", "DATA_DESC_ID", "ANTENNA1", "ANTENNA2"])
bl_lengths = []
for ds in datasets:
    # Each dataset holds one baseline; every UVW row is already that baseline's vector
    uvw = ds.UVW.data
    # Baseline length is the Euclidean norm of (u, v, w); keep the largest per dataset
    bl_lengths.append(da.sqrt((uvw**2).sum(axis=1)).max())
bl_lengths = dask.compute(bl_lengths)[0]
If you want to do more with the baseline length (i.e. process visibility data), then non-contiguous disk access will hurt performance.
A second approach requires (1) some knowledge of dask internals (2) the ability to process your baseline data on a per-chunk basis.
import argparse

import dask
import dask.array as da
from daskms import xds_from_ms
import numpy as np


def create_parser():
    p = argparse.ArgumentParser()
    p.add_argument("ms")
    return p


def _process(ant1, ant2, uvw):
    uvw = uvw[0]  # Contraction over the uvw3 axis

    # Identify unique baselines in this chunk
    baselines = np.stack((ant1, ant2), axis=1)
    ubl, inv = np.unique(baselines, return_inverse=True, axis=0)
    inv = inv.ravel()  # Guard against NumPy versions returning a 2D inverse

    # Determine their lengths: the largest UVW norm among each baseline's rows
    bl_length = np.empty(ubl.shape[0], dtype=uvw.dtype)

    for i, (a1, a2) in enumerate(ubl):
        bl_length[i] = np.sqrt((uvw[i == inv, :]**2).sum(axis=1)).max()

    print(bl_length)

    # Further processing required beyond this point


if __name__ == "__main__":
    args = create_parser().parse_args()
    ds = xds_from_ms(args.ms)
    ds = ds[0]  # Just demonstrate on the first dataset

    # Map _process function on input arrays to produce an output array
    # A good understanding of dask.array.blockwise is advised
    process = da.blockwise(_process, ("row",),
                           ds.ANTENNA1.data, ("row",),
                           ds.ANTENNA2.data, ("row",),
                           ds.UVW.data, ("row", "uvw3"),
                           concatenate=False,
                           meta=np.empty((), object))

    dask.compute(process)
I suspect the approach you take will depend on whether you want to crunch the larger visibility data. What are your thoughts?
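To make the per-chunk step concrete: what `_process` has to do is group rows by (ANTENNA1, ANTENNA2) and keep the largest UVW norm for each pair. Below is a minimal pure-Python sketch of that grouping on made-up values. `max_baseline_lengths` is a hypothetical helper name, not part of dask-ms, and a real implementation would work on NumPy chunks instead of lists.

```python
import math
from collections import defaultdict


def max_baseline_lengths(ant1, ant2, uvw):
    """Return {(a1, a2): longest |uvw|} for the rows of one chunk."""
    longest = defaultdict(float)
    for a1, a2, (u, v, w) in zip(ant1, ant2, uvw):
        # Baseline length is the Euclidean norm of the UVW vector
        length = math.sqrt(u * u + v * v + w * w)
        longest[(a1, a2)] = max(longest[(a1, a2)], length)
    return dict(longest)


# Made-up rows: baseline (0, 1) appears twice with different lengths
ant1 = [0, 0, 1, 0]
ant2 = [1, 2, 2, 1]
uvw = [(3.0, 4.0, 0.0), (0.0, 0.0, 2.0), (1.0, 2.0, 2.0), (6.0, 8.0, 0.0)]

lengths = max_baseline_lengths(ant1, ant2, uvw)
```

Because the result has one entry per baseline rather than one per row, results from different chunks still need to be merged (taking the maximum per key) before they describe the whole dataset.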
I ended up using itertools.combinations, although since I am very new to dask it might be less efficient than your approach. I would like to know what you think.
antennas = xds_from_table(self.ms_name_dask + "ANTENNA", taql_where=taql_query)[0]
antenna_obj = Antenna(dataset=antennas)
Creating the Antenna object runs this:
self.max_diameter = 0.0 * u.m
self.min_diameter = 0.0 * u.m
if dataset is not None:
    self.max_diameter = self.dataset.DISH_DIAMETER.data.max().compute() * u.m
    self.min_diameter = self.dataset.DISH_DIAMETER.data.min().compute() * u.m
Then I run:
# Creating baseline object
baseline_obj = antenna_obj.create_baseline_dataset()
This function runs:
def create_baseline_dataset(self):
    ids = self.dataset.ROWID.data.compute()
    combs = np.array(list(combinations(ids, 2)))
    antenna1 = self.dataset.sel(row=combs[:, 0])
    antenna2 = self.dataset.sel(row=combs[:, 1])
    baseline = antenna1.POSITION - antenna2.POSITION
    baseline_length = xarrfunc.sqrt(
        xarrfunc.square(baseline[:, 0]) + xarrfunc.square(baseline[:, 1]) + xarrfunc.square(baseline[:, 2]))
    baseline_length = baseline_length.data.persist()
    row_id = np.arange(len(combs[:, 0]))
    ant1_id = da.from_array(combs[:, 0])
    ant2_id = da.from_array(combs[:, 1])
    row_id = da.from_array(row_id)
    ds = xarray.Dataset(
        data_vars=dict(
            ANTENNA1=(["row"], ant1_id),
            ANTENNA2=(["row"], ant2_id),
            BASELINE_LENGTH=(["row"], baseline_length)
        ),
        coords=dict(
            ROWID=(["row"], row_id)
        ))
    return Baseline(dataset=ds)
Since the baseline lengths are in an xarray dataset, we can get the maximum and minimum using:
self.max_baseline = self.dataset.BASELINE_LENGTH.max().data.compute() * u.m
self.min_baseline = self.dataset.BASELINE_LENGTH.min().data.compute() * u.m
Let me know if this is not efficient. I would still like to use the blockwise function, though.
Cheers
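For reference, the arithmetic behind the approach above can be sketched without dask or xarray: take every pair of antenna positions and compute the distance between them. This is only an illustration on made-up positions. `baseline_lengths` is a hypothetical helper, and it measures the physical antenna separation from the ANTENNA table rather than anything derived from the UVW column.

```python
import math
from itertools import combinations


def baseline_lengths(positions):
    """Map each antenna pair (i, j), i < j, to the distance between them."""
    return {
        (i, j): math.dist(positions[i], positions[j])
        for i, j in combinations(range(len(positions)), 2)
    }


# Made-up antenna positions in metres
positions = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]

lengths = baseline_lengths(positions)
max_baseline = max(lengths.values())
min_baseline = min(lengths.values())
```

With N antennas this produces N(N-1)/2 pairs, which is tiny compared to the main table, so computing it eagerly is unlikely to be the bottleneck.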
@sjperkins OK, I have tested your code, and the only downside is that the dask array returned from process is bigger than expected. For example, if we return an array with one entry per baseline, like (id, antenna1_id, antenna2_id), but pass row as the first dimension, we end up with a much bigger dask array. By the way, what do you mean by crunching the visibility data? I would like two things. The first, which I have seen raised as an issue, is to have antenna1, antenna2 and baseline_id as coordinates in the datasets. The second is to loop over my datasets per baseline and work on each one. My idea is to write a function that takes a non-gridded dataset and returns a gridded dataset. For that we need to do the gridding for each field, spw and baseline, so that all the ids in the main table fit.
A follow up to this @sjperkins: I've seen the documentation of CASA ngi, and I was wondering how they get to order their data by baseline if the data is not contiguous by baseline... If you convert the data to zarr then you don't get any problem ordering the data by baseline?
I don't want to speak too much for the casangi team, but it looks like they enforce a (time, baseline, chan, corr) shape for their zarr representation.
The MSv2.0 (and MSv3.0) spec specifies a (row, chan, corr) shape, but there is no constraint that the data should be ordered by TIME, ANTENNA1, ANTENNA2, i.e. (time, baseline, chan, corr). This ordering is optimal for certain applications like calibration, but ANTENNA1, ANTENNA2, TIME, i.e. (baseline, time, chan, corr), can be better for imaging and flagging. I believe wsclean orders data like this prior to imaging.
Thus, in my opinion, enforcing a (time, baseline, chan, corr) order deviates from the full generality of the MSv{2,3} spec if we're being very precise and splitting hairs, but in practice, most instruments will output data in this ordering.
I've also only mentioned the TIME, ANTENNA1 and ANTENNA2 columns in this comment. Technically all the columns in the MAIN Table key are used to impose an ordering: https://casa.nrao.edu/Memos/229.html#SECTION00061000000000000000, so columns like FEED1 and FEED2 are also relevant here.
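To illustrate the two orderings discussed here, the sketch below sorts the same made-up rows once by (TIME, ANTENNA1, ANTENNA2) and once by (ANTENNA1, ANTENNA2, TIME). It only uses plain tuples. A real Measurement Set would also carry the other key columns such as FEED1 and FEED2.

```python
# Made-up rows: (TIME, ANTENNA1, ANTENNA2)
rows = [
    (2.0, 0, 1),
    (1.0, 1, 2),
    (1.0, 0, 1),
    (2.0, 1, 2),
    (1.0, 0, 2),
    (2.0, 0, 2),
]

# (time, baseline) order: all baselines of one timestep are adjacent
time_major = sorted(rows, key=lambda r: (r[0], r[1], r[2]))

# (baseline, time) order: all timesteps of one baseline are adjacent
baseline_major = sorted(rows, key=lambda r: (r[1], r[2], r[0]))
```

Both results contain exactly the same rows, so either is a valid (row, chan, corr) layout under the spec. They differ only in which rows end up next to each other.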
@sjperkins right, makes sense. Although I think that for self-calibration, which can be considered as calibration+imaging, ordering by (time, baseline, chan, corr) is also useful.
However, what I want to do is this: let's say I calculate a baseline_id for each row in my dask-ms dataset, which has already been grouped by ["DATA_DESC"]. Let's say that now I want to regroup the datasets such that they are ordered by ["BASELINE_ID", "FIELD_ID", "DATA_DESC_ID"]. Then my questions are:
- Is it possible to do this? Is it done using xarray groupby, or is there a more efficient way to do this?
It may be possible to do this via xarray groupby, but I'm wary of this approach since it'll create a dask graph for each group (baseline) with a lot of cross-communication between chunks. I think this'll work, but it will require either (a) evaluating each group separately with dask, which means accessing the entire dataset multiple times, or (b) expecting the dask (distributed?) scheduler to perfectly handle cross-communication when work for all groups is submitted at once. This is hard: https://coiled.io/blog/better-shuffling-in-dask-a-proof-of-concept/
Having said that, I haven't tried this approach in a long time, so the underlying functionality might have improved.
- Would this ordering cause non-contiguous access performance issues? I guess not if using zarr?
One can't really get around this issue, regardless of the storage backend: it's a matter of Spatial Locality. If I were to use database terminology, accessing data on the primary key is always more optimal than accessing data via a secondary key, because data is usually ordered by primary key on disk.
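A rough way to see the locality argument is to count how many separate contiguous row ranges are needed to read one baseline under each on-disk layout. The sketch below uses made-up layouts, and `contiguous_runs` and `rows_of` are hypothetical helpers. It counts runs of consecutive row ids and says nothing about actual I/O cost on a given storage backend.

```python
def contiguous_runs(row_ids):
    """Count runs of consecutive row ids; each run is one contiguous read."""
    runs = 0
    previous = None
    for row in row_ids:
        if previous is None or row != previous + 1:
            runs += 1
        previous = row
    return runs


def rows_of(layout, baseline):
    """Row ids at which the given (ANTENNA1, ANTENNA2) pair is stored."""
    return [i for i, bl in enumerate(layout) if bl == baseline]


n_time = 4
baselines = [(0, 1), (0, 2), (1, 2)]

# On-disk layouts, written as the baseline stored in each row
time_major = [bl for _ in range(n_time) for bl in baselines]
baseline_major = [bl for bl in baselines for _ in range(n_time)]

runs_time_major = contiguous_runs(rows_of(time_major, (0, 2)))
runs_baseline_major = contiguous_runs(rows_of(baseline_major, (0, 2)))
```

In the time-major layout the baseline's rows are scattered, one per timestep, so the number of separate reads grows with the number of timesteps. In the baseline-major layout they form a single range.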
You might want to try reordering your MS as follows:
dask-ms convert ~/data/input.ms -g "FIELD_ID,DATA_DESC_ID,SCAN_NUMBER" -i "ANTENNA1,ANTENNA2,TIME,FEED1,FEED2" -o ~/data/output.ms --format ms --force
If you've created a BASELINE_ID column, you could probably substitute that for ANTENNA1,ANTENNA2.
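If no BASELINE_ID column exists yet, one simple way to derive it is to fold the two antenna indices into a single integer. The encoding below is an assumption chosen for illustration (nothing in the MS spec or dask-ms prescribes it), and `baseline_id` is a hypothetical helper.

```python
def baseline_id(ant1, ant2, n_ant):
    """Encode an antenna pair as a single integer (one possible scheme)."""
    if not (0 <= ant1 < n_ant and 0 <= ant2 < n_ant):
        raise ValueError("antenna index out of range")
    return ant1 * n_ant + ant2


# Made-up ANTENNA1 / ANTENNA2 values for five rows of a 4-antenna array
n_ant = 4
antenna1 = [0, 0, 1, 0, 2]
antenna2 = [1, 2, 2, 1, 3]

ids = [baseline_id(a1, a2, n_ant) for a1, a2 in zip(antenna1, antenna2)]

# The pair can be recovered from the id with divmod
decoded = [divmod(i, n_ant) for i in ids]
```

Sorting on this id gives the same row order as sorting on (ANTENNA1, ANTENNA2), which is why it could stand in for those two columns as an index.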
Thank you very much @sjperkins. I will try what you have suggested and I will let you know.
Note there were some fixes pushed to master this morning, but I don't think there would have been an issue with MS to MS conversion.
Last question - Is the convert function part of the dask-ms? That is, can I call it as a function from a python file?
It's a class in daskms/apps/convert.py. There are no plans to make this into a generic function.
@sjperkins Hi again! Quick question: how can I use convert directly from code with the Convert class, without using os.system? Or would I need to write my own wrapper to use it as a function that converts a Measurement Set? I want to do this because, depending on the stage of the software I'm building, I will need different orderings (one specific ordering for imaging, gridding and de-gridding, and another for calibration and self-cal). Since I know which ordering I need in each case, I would really like to call convert inside my code as a function rather than using os.system. I was wondering if you could please help with that.
I'd just instantiate Convert with the relevant command line arguments and a python logger. Something like the following (I haven't run this!)
import logging

from daskms.apps.convert import Convert

log = logging.getLogger(__file__)
args = ["input.ms", "--output", "output.ms", "--group-cols", "FIELD_ID,DATA_DESC_ID", "--index-cols", "TIME,ANTENNA1,ANTENNA2"]
convert = Convert(args, log)
convert.execute()
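If instantiating Convert directly turns out to be awkward, another option is a thin wrapper around the command line tool. The sketch below assumes the `dask-ms` executable is on the PATH and reuses only the flags from the convert command shown earlier in this thread. `build_convert_command` and `reorder_ms` are hypothetical helper names, and nothing here has been run against a real Measurement Set.

```python
import subprocess


def build_convert_command(src, dst, group_cols, index_cols):
    """Build the argument list for the dask-ms convert command line tool."""
    return [
        "dask-ms", "convert", src,
        "-g", ",".join(group_cols),
        "-i", ",".join(index_cols),
        "-o", dst,
        "--format", "ms",
        "--force",
    ]


def reorder_ms(src, dst, group_cols, index_cols):
    """Run the conversion; raises CalledProcessError if the tool fails."""
    command = build_convert_command(src, dst, group_cols, index_cols)
    subprocess.run(command, check=True)


# Example: the argument list for a (baseline, time) ordering
command = build_convert_command(
    "input.ms", "output.ms",
    ["FIELD_ID", "DATA_DESC_ID", "SCAN_NUMBER"],
    ["ANTENNA1", "ANTENNA2", "TIME", "FEED1", "FEED2"],
)
```

Passing the arguments as a list avoids shell quoting problems, and `check=True` turns a failed conversion into a Python exception instead of a silently ignored exit code.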
I'm getting this error when running the command @sjperkins :
2022-11-02 10:34:29,585 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21032324 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,592 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21037373 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,597 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21037373 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,602 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21042422 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,607 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21042422 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,613 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21047471 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,617 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21047471 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,624 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21088424 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,629 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21088424 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,634 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21092912 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,640 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21092912 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,646 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21097400 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,651 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21097400 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,657 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21101888 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,661 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21101888 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,666 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21160232 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,672 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21160232 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,678 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21164720 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,683 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21164720 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,689 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21169208 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,694 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21169208 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,699 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21173696 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,704 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21173696 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,711 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21232040 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,716 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21232040 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,721 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21236528 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,726 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21236528 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,732 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21241016 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,736 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21241016 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,743 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21245504 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,748 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21245504 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,753 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21303848 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,758 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21303848 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,764 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21308336 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,769 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21308336 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,775 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21312824 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,780 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21312824 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,786 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21317312 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,792 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21317312 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:31,247 - dask-ms - INFO - Input: 'measurementset' file:///home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg
2022-11-02 10:34:31,247 - dask-ms - INFO - Output: 'measurementset' file:///home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg_time
2022-11-02 10:35:21,005 - dask-ms - WARNING - The shape of column 'ASSOC_SPW_ID' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,011 - dask-ms - WARNING - The shape of column 'ASSOC_NATURE' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,015 - dask-ms - WARNING - The shape of column 'ASSOC_SPW_ID' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,021 - dask-ms - WARNING - The shape of column 'ASSOC_NATURE' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,025 - dask-ms - WARNING - The shape of column 'ASSOC_SPW_ID' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,031 - dask-ms - WARNING - The shape of column 'ASSOC_NATURE' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,036 - dask-ms - WARNING - The shape of column 'ASSOC_SPW_ID' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,042 - dask-ms - WARNING - The shape of column 'ASSOC_NATURE' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,521 - dask-ms - WARNING - Ignoring SOURCE
2022-11-02 10:35:21,525 - dask-ms - WARNING - Ignoring 'TARGET': Unable to infer shape of column 'TARGET' due to:
'TableProxy::getCell: no such row'
2022-11-02 10:35:21,526 - dask-ms - WARNING - Ignoring 'ENCODER': Unable to infer shape of column 'ENCODER' due to:
'TableProxy::getCell: no such row'
2022-11-02 10:35:21,527 - dask-ms - WARNING - Ignoring 'POINTING_OFFSET': Unable to infer shape of column 'POINTING_OFFSET' due to:
'TableProxy::getCell: no such row'
2022-11-02 10:35:21,527 - dask-ms - WARNING - Ignoring 'DIRECTION': Unable to infer shape of column 'DIRECTION' due to:
'TableProxy::getCell: no such row'
Traceback (most recent call last):
File "/home/vicente/anaconda3/envs/pyralysis2/bin/dask-ms", line 8, in <module>
sys.exit(main())
File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/apps/entrypoint.py", line 9, in main
return EntryPoint(sys.argv[1:]).execute()
File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/apps/entrypoint.py", line 33, in execute
cmd.execute()
File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/apps/convert.py", line 415, in execute
writes = self.convert_table(self.args)
File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/apps/convert.py", line 500, in convert_table
writes.append(writer(datasets, out_store))
File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/dask_ms.py", line 102, in xds_to_table
out_ds = write_datasets(
File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/writes.py", line 760, in write_datasets
tp = _updated_table(table, datasets, columns, descriptor)
File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/writes.py", line 338, in _updated_table
table_proxy.addcols(_table_desc, dminfo=_dminfo).result()
File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/concurrent/futures/_base.py", line 444, in result
return self.__get_result()
File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/table_proxy.py", line 114, in _impl
return getattr(table, method)(*args, **kwargs)
File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/casacore/tables/table.py", line 1226, in addcols
self._addcols(tdesc, dminfo, addtoparent)
RuntimeError: Invalid Table operation: Data manager name StandardStMan is already used in table /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg_time/POINTING
I'm worried that this is not working, while casangi can re-order their xarray dataset by (time, baseline). I have noticed that this ordering has a very high impact, at least for self-calibration.
Actually, I think this ordering is possibly only good for calibration itself. For imaging one would need to repack by baseline x time instead (like wsclean does when it reorders by w, or when ddfacet computes bda ordering). Typically imaging takes a lot longer than the calibration routines, so I wonder whether the data should be packed like that instead.
@bennahugo Yes, ordering by time, baseline is only good for calibration. For imaging the best ordering is baseline, time. I agree. This is where self-cal comes in: it needs both orderings, time, baseline when calibrating and baseline, time when imaging. Given that I'm developing software that will do both, my idea is to re-order the dataset according to what the code is doing (calibration, imaging, or self-cal, which needs both). However, the convert script is not able to do that, as you can see above, so I haven't been able to test anything yet.
Invalid operation: TSM: no array in row 21236528 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19' 2022-11-02 10:34:29,726 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21236528 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18' 2022-11-02 10:34:29,732 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21241016 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19' 2022-11-02 10:34:29,736 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21241016 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18' 2022-11-02 10:34:29,743 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21245504 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19' 2022-11-02 10:34:29,748 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21245504 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18' 2022-11-02 10:34:29,753 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21303848 of column WEIGHT_SPECTRUM in 
/home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19' 2022-11-02 10:34:29,758 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21303848 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18' 2022-11-02 10:34:29,764 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21308336 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19' 2022-11-02 10:34:29,769 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21308336 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18' 2022-11-02 10:34:29,775 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21312824 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19' 2022-11-02 10:34:29,780 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21312824 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18' 2022-11-02 10:34:29,786 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21317312 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19' 2022-11-02 10:34:29,792 - dask-ms - 
WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to: 'Table DataManager error: Invalid operation: TSM: no array in row 21317312 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:31,247 - dask-ms - INFO - Input: 'measurementset' file:///home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg
2022-11-02 10:34:31,247 - dask-ms - INFO - Output: 'measurementset' file:///home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg_time
2022-11-02 10:35:21,005 - dask-ms - WARNING - The shape of column 'ASSOC_SPW_ID' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,011 - dask-ms - WARNING - The shape of column 'ASSOC_NATURE' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,015 - dask-ms - WARNING - The shape of column 'ASSOC_SPW_ID' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,021 - dask-ms - WARNING - The shape of column 'ASSOC_NATURE' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,025 - dask-ms - WARNING - The shape of column 'ASSOC_SPW_ID' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,031 - dask-ms - WARNING - The shape of column 'ASSOC_NATURE' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,036 - dask-ms - WARNING - The shape of column 'ASSOC_SPW_ID' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,042 - dask-ms - WARNING - The shape of column 'ASSOC_NATURE' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,521 - dask-ms - WARNING - Ignoring SOURCE
2022-11-02 10:35:21,525 - dask-ms - WARNING - Ignoring 'TARGET': Unable to infer shape of column 'TARGET' due to: 'TableProxy::getCell: no such row'
2022-11-02 10:35:21,526 - dask-ms - WARNING - Ignoring 'ENCODER': Unable to infer shape of column 'ENCODER' due to: 'TableProxy::getCell: no such row'
2022-11-02 10:35:21,527 - dask-ms - WARNING - Ignoring 'POINTING_OFFSET': Unable to infer shape of column 'POINTING_OFFSET' due to: 'TableProxy::getCell: no such row'
2022-11-02 10:35:21,527 - dask-ms - WARNING - Ignoring 'DIRECTION': Unable to infer shape of column 'DIRECTION' due to: 'TableProxy::getCell: no such row'
Traceback (most recent call last):
  File "/home/vicente/anaconda3/envs/pyralysis2/bin/dask-ms", line 8, in <module>
    sys.exit(main())
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/apps/entrypoint.py", line 9, in main
    return EntryPoint(sys.argv[1:]).execute()
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/apps/entrypoint.py", line 33, in execute
    cmd.execute()
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/apps/convert.py", line 415, in execute
    writes = self.convert_table(self.args)
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/apps/convert.py", line 500, in convert_table
    writes.append(writer(datasets, out_store))
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/dask_ms.py", line 102, in xds_to_table
    out_ds = write_datasets(
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/writes.py", line 760, in write_datasets
    tp = _updated_table(table, datasets, columns, descriptor)
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/writes.py", line 338, in _updated_table
    table_proxy.addcols(_table_desc, dminfo=_dminfo).result()
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/concurrent/futures/_base.py", line 444, in result
    return self.__get_result()
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/table_proxy.py", line 114, in _impl
    return getattr(table, method)(*args, **kwargs)
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/casacore/tables/table.py", line 1226, in addcols
    self._addcols(tdesc, dminfo, addtoparent)
RuntimeError: Invalid Table operation: Data manager name StandardStMan is already used in table /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg_time/POINTING
I can't tell exactly what's happening from your stack trace. Which command line arguments are you using?
It looks like you're writing to an existing table, judging by the call to _updated_table. This probably won't work. The --force argument will remove any existing output dataset.
I'm worried that this is not working and that casang can re-order their xarray dataset by (time,baseline). I have noticed that this has a very high impact at least for self-calibration.
As discussed earlier in https://github.com/ratt-ru/dask-ms/issues/159#issuecomment-1180430109, we don't impose specific orderings on data because different applications benefit from different orderings.
It's the user's responsibility to reorder their dataset into a format that is convenient for their application. This is possible via dask-ms convert, although this is still undocumented: #226.
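To illustrate what a (time, baseline) row ordering means, here is a minimal NumPy sketch on toy in-memory arrays. It is not how dask-ms convert is implemented; the array names and values are made up for the example, and it only shows the index permutation that such an ordering implies.

```python
import numpy as np

# Toy row-wise columns (hypothetical values), currently grouped by baseline
# rather than ordered by time.
time = np.array([2.0, 1.0, 2.0, 1.0])
ant1 = np.array([0, 0, 0, 0])
ant2 = np.array([1, 1, 2, 2])

# np.lexsort uses the LAST key as the primary sort key, so this orders rows
# by TIME first, then ANTENNA1, then ANTENNA2.
order = np.lexsort((ant2, ant1, time))

# Apply the same permutation to every row-wise column.
time_sorted = time[order]
ant1_sorted = ant1[order]
ant2_sorted = ant2[order]
```

Every other row-wise column (UVW, DATA, FLAG and so on) would need the same permutation applied, which is why reordering a large Measurement Set is better done once on disk than on every read.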
@sjperkins
Maybe if I add a link to the MS here, you can trace the error?
The command line that I'm currently using is:
dask-ms convert HLTau_B6cont.calavg.tav300s -g "FIELD_ID,DATA_DESC_ID,SCAN_NUMBER" -i "ANTENNA1,ANTENNA2,TIME,FEED1,FEED2" -o output.ms --format ms --force
I'm not creating any folder before that.
Thanks for the linked MS.
I can reproduce this error on my side. I'll try to block off some time to look at the issue this week.
Description
Hello everyone, I would like to partition or group my MS dataset based on FIELD_ID, DATA_DESC_ID and BASELINE (which is not a column, but can be calculated from ANTENNA1 and ANTENNA2). Is it possible to do this? Also, I would like to get the length of each baseline. However, that would require a query over the entire dataset instead of over the list of partitions.
Does anyone know how to do this?
This library is awesome, keep up the good work. Best regards!
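For reference, the quantity asked about above can be sketched in plain NumPy on in-memory arrays. This is only an illustration under assumptions: the arrays are hypothetical toy data rather than columns read from a Measurement Set, and the baseline length is taken to be the Euclidean norm of each UVW row, maximised per unique (ANTENNA1, ANTENNA2) pair.

```python
import numpy as np

# Hypothetical toy rows: antenna pairs and their UVW coordinates.
ant1 = np.array([0, 0, 1, 0, 0, 1])
ant2 = np.array([1, 2, 2, 1, 2, 2])
uvw = np.array([
    [3.0, 4.0, 0.0],    # baseline (0, 1), length 5
    [6.0, 8.0, 0.0],    # baseline (0, 2), length 10
    [0.0, 0.0, 2.0],    # baseline (1, 2), length 2
    [0.0, 0.0, 1.0],    # baseline (0, 1), length 1
    [1.0, 0.0, 0.0],    # baseline (0, 2), length 1
    [0.0, 5.0, 12.0],   # baseline (1, 2), length 13
])

# Identify the unique baselines and map every row to its baseline index.
baselines = np.stack((ant1, ant2), axis=1)
ubl, inv = np.unique(baselines, axis=0, return_inverse=True)
inv = inv.ravel()  # flatten defensively; the inverse's shape has varied across NumPy versions

# Euclidean length of every row: sum the squares over the (u, v, w) axis first.
row_length = np.sqrt((uvw ** 2).sum(axis=1))

# Maximum length per unique baseline (lengths are non-negative, so 0 is a safe start).
max_length = np.zeros(ubl.shape[0])
np.maximum.at(max_length, inv, row_length)
```

Note that the squares are summed over the last axis before taking the square root; taking the square root of the element-wise squares would return the largest absolute coordinate instead of the length.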