radio-astro-tools / casa-formats-io

Code to handle I/O from/to data in CASA format

CASATable not working for dataset #46

Open miguelcarcamov opened 2 years ago

miguelcarcamov commented 2 years ago

I'm working with this PDS70 dataset. I am running these lines:

rslt = CASATable.read(ms_name)
tables = rslt.as_astropy_table(data_desc_id="all")

And I'm getting the following error:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Input In [10], in <cell line: 2>()
      1 rslt = CASATable.read(ms_name)
----> 2 tables = rslt.as_astropy_table(data_desc_id="all")

File ~/Documents/casa-formats-io/casa_formats_io/casa_low_level_io/table.py:365, in CASATable.as_astropy_table(self, data_desc_id, include_columns)
    362         colindex_in_dm += 1
    364 if hasattr(dm, 'read_column'):
--> 365     coldata = dm.read_column(self._filename, seqnr, self.column_set.columns[colindex], coldesc[colindex], colindex_in_dm)
    366     if coldata is not None:
    367         table_columns[colname] = coldata

File ~/Documents/casa-formats-io/casa_formats_io/casa_low_level_io/data_managers/standard.py:201, in StandardStMan.read_column(self, filename, seqnr, column, coldesc, colindex_in_dm)
    199 for irow in range(rows_in_bucket[bucket_id]):
    200     offset = read_int64(f)
--> 201     fi.seek(offset)
    202     ndim = read_int32(fi)
    203     subshape = []

File ~/Documents/casa-formats-io/casa_formats_io/casa_low_level_io/core.py:47, in EndianAwareFileHandle.seek(self, n)
     46 def seek(self, n):
---> 47     return self.file_handle.seek(n)

OSError: [Errno 22] Invalid argument

keflavich commented 2 years ago

I can reproduce this:

import casa_formats_io
from astropy.table import Table
ms_name = 'residuals.ms'
# The error is raised inside Table.read, so no further call is reached
rslt = Table.read(ms_name)

yields

Traceback (most recent call last):
  File "<ipython-input-5-e56083fa4cb2>", line 1, in <module>
    rslt = Table.read(ms_name)
  File "/home/adam/anaconda3/lib/python3.8/site-packages/astropy/table/connect.py", line 62, in __call__
    out = self.registry.read(cls, *args, **kwargs)
  File "/home/adam/anaconda3/lib/python3.8/site-packages/astropy/io/registry/core.py", line 199, in read
    data = reader(*args, **kwargs)
  File "/home/adam/repos/casa-formats-io/casa_formats_io/table_reader.py", line 18, in read_casa_table
    return table.as_astropy_table(data_desc_id=data_desc_id)
  File "/home/adam/repos/casa-formats-io/casa_formats_io/casa_low_level_io/table.py", line 365, in as_astropy_table
    coldata = dm.read_column(self._filename, seqnr, self.column_set.columns[colindex], coldesc[colindex], colindex_in_dm)
  File "/home/adam/repos/casa-formats-io/casa_formats_io/casa_low_level_io/data_managers/standard.py", line 201, in read_column
    fi.seek(offset)
  File "/home/adam/repos/casa-formats-io/casa_formats_io/casa_low_level_io/core.py", line 47, in seek
    return self.file_handle.seek(n)
OSError: [Errno 22] Invalid argument

n is -1:

ipdb> print(n)
-1
ipdb> self.file_handle.tell()
12
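
The `[Errno 22]` itself can be reproduced outside casa-formats-io: an absolute seek to a negative position on an ordinary binary file handle fails the same way. A minimal standard-library sketch (the scratch file and the `seek_error_for` helper are illustrative only, not part of casa-formats-io):

```python
import errno
import os
import tempfile


def seek_error_for(offset):
    """Seek to `offset` in a scratch file; return the OSError errno, or None on success."""
    fd, path = tempfile.mkstemp()
    try:
        # Write a few bytes so that small positive offsets are valid positions
        with os.fdopen(fd, "wb") as f:
            f.write(b"\x00" * 16)
        with open(path, "rb") as f:
            try:
                f.seek(offset)
            except OSError as exc:
                return exc.errno
            return None
    finally:
        os.remove(path)


print(seek_error_for(12))                   # a valid position
print(seek_error_for(-1) == errno.EINVAL)   # the failing case from the traceback
```

Since `offset` comes from `read_int64(f)` one line above the failing `fi.seek(offset)`, the -1 is a value stored in the file. It may be a sentinel rather than a real position, but that is a guess at this point.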

@astrofrog any chance you can help figure this out?

keflavich commented 2 years ago

@miguelcarcamov can you say anything more about how this MS was made? I can't tell yet whether this is a bug on our side or a file type we don't support yet, but it would be helpful to know where it comes from.

miguelcarcamov commented 2 years ago

Yes, @keflavich. The measurement set contains the residuals from gpuvmem (image reconstruction software). At this point, the only thing gpuvmem does is change the data column and add a model_data column. The original dataset comes from the ALMA 2019 Long Baseline data of PDS70 (paper here). I can confirm that the data can be read with CASA and also with dask-ms.

miguelcarcamov commented 2 years ago

Also note that in the POLARIZATION table there are two rows. One of the rows has only 1 correlation and is never used in the DATA_DESCRIPTION table.
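
A quick way to spot such unused rows, sketched in plain Python. `POLARIZATION_ID` is the standard MeasurementSet column in DATA_DESCRIPTION; the helper name and the values below are made up for illustration and are not read from this dataset:

```python
def unused_polarization_rows(n_polarization_rows, polarization_ids):
    """Return POLARIZATION row indices that no DATA_DESCRIPTION row points to."""
    referenced = set(polarization_ids)
    return [row for row in range(n_polarization_rows) if row not in referenced]


# Hypothetical values: two POLARIZATION rows, every DATA_DESCRIPTION row uses row 0
print(unused_polarization_rows(2, [0, 0, 0, 0]))
```

Whether the unused row is related to the failed seek is not established; it is only one way in which this MS differs from a typical one.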

astrofrog commented 2 years ago

I can try and look at this soon