ratt-ru / dask-ms

Implementation of a dask/xarray dataset backed by a CASA MS
https://dask-ms.readthedocs.io

xds_from_table #153

Open hrkloeck opened 3 years ago

hrkloeck commented 3 years ago

Hi there, I'm just starting to play with daskms and I run into an error if I do the following (if I use e.g. FIELD that works). I'm currently working on a multi-source MS dataset.

xds_from_table(MSFN+'::SOURCE',group_cols='row')

ValueError: No known conversion from CASA Table type 'record' to python/numpy type. Perhaps it needs to be added to _TABLE_TO_PY?: {'BOOL': 'bool', 'BOOLEAN': 'bool', 'BYTE': 'uint8', 'COMPLEX': 'complex64', 'DCOMPLEX': 'complex128', 'DOUBLE': 'float64', 'FCOMPLEX': 'complex64', 'FLOAT': 'float32', 'INT': 'int32', 'INTEGER': 'int32', 'SHORT': 'int16', 'SMALLINT': 'int16', 'STRING': 'object', 'UCHAR': 'uint8', 'UINT': 'uint32', 'UINTEGER': 'uint32', 'USHORT': 'uint16', 'USMALLINT': 'uint16'}

sjperkins commented 3 years ago

Thanks for the report @hrkloeck. I note that in the definition of the SOURCE table (https://casa.nrao.edu/Memos/229.html#SECTION000615000000000000000), the optional SOURCE_MODEL column has a TableRecord type. I've never encountered an MS with this type of data in it, but it is still a valid case. dask-ms attempts to reproduce CASA columns as dask arrays, whereas TableRecords are dictionaries, so some thought would be required to support this edge case.

Questions:

  1. Does your SOURCE table have a SOURCE_MODEL column in it?
  2. Do you need to access this column?
  3. If not, could you specify only the columns you are interested in via the columns keyword argument?

Does (3) solve your problem?
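
For question 1, one way to check is to list the SOURCE columns with python-casacore and drop any whose value type is 'record' before passing the names to the columns keyword. This is only a sketch: non_record_columns and source_column_descs are hypothetical helper names, and it assumes that getcoldesc reports such columns with a valueType of 'record' (the type named in the error above) and that table() accepts the MS::SOURCE path syntax.

```python
def non_record_columns(coldescs):
    """Return the names of columns whose value type is not 'record'.

    coldescs maps column name -> column description dict, in the form
    assumed to be returned by python-casacore's table.getcoldesc().
    """
    return [
        name
        for name, desc in coldescs.items()
        if str(desc.get("valueType", "")).lower() != "record"
    ]


def source_column_descs(ms_path):
    """Collect column descriptions for the SOURCE subtable (hypothetical helper)."""
    # Imported lazily so non_record_columns works without python-casacore.
    from casacore.tables import table

    # Assumption: the "::SOURCE" suffix opens the subtable directly.
    t = table(f"{ms_path}::SOURCE", ack=False)
    try:
        return {name: t.getcoldesc(name) for name in t.colnames()}
    finally:
        t.close()


# Possible usage (not executed here; requires an MS on disk):
#   from daskms import xds_from_table
#   descs = source_column_descs("BLAH.MS")
#   print("SOURCE_MODEL" in descs)            # answers question 1
#   cols = non_record_columns(descs)
#   datasets = xds_from_table("BLAH.MS::SOURCE", columns=cols)
```

Filtering on the reported value type rather than hard-coding SOURCE_MODEL should also skip any other record-typed column, should one exist.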

hrkloeck commented 3 years ago

Hi Simon,

thanks for getting back to me.

Q1: I don’t know; I need to check (via CASA, I guess).

Q2: Not necessarily yet, but maybe in the future. I have just started the dask-ms and CASA MS endeavour, coming from AIPS and ParselTongue.

Q3: I have tried the following, but I get the same error:

ddids,cckey = xds_from_table(MSFN+'::SOURCE',column_keywords=True)

So I guess there is no urgency to this. Currently, my first approach is to extract the information from the data itself, to get a first understanding of how it all works. However, since the MS definition https://casa.nrao.edu/casadocs/casa-5.1.1/reference-material/measurement-set mentions the SOURCE table, I thought it might be useful to raise this for consistency.

Just for your info: to learn dask-ms, I would like to write something that plots DATA and DATA_MODEL versus time for a given baseline, in order to understand the modelling (I come from VLBI). I know of Oleg’s shadeMS (which I really like), but this is slightly different and I want to learn how to do it myself.

Cheers, Hans


Hans-Rainer Klöckner

Max-Planck-Institut für Radioastronomie Auf dem Hügel 69 53121 Bonn Germany +49 (0)228 525-338 @.***

sjperkins commented 3 years ago

> Q3: ddids,cckey = xds_from_table(MSFN+'::SOURCE',column_keywords=True) have tried this but get the same error.

Ah, I should have pointed to the docs. I meant something like:

xds_from_table("BLAH.MS::SOURCE", columns=["TIME", "INTERVAL", "NAME", "CODE"])

> So I guess there is no urgency to this, currently my first approach is to extract the information out of the data to get a first understanding and how that all works. However since the MS definition https://casa.nrao.edu/casadocs/casa-5.1.1/reference-material/measurement-set mentioned the SOURCE table I thought to be consistent it might be useful to raise it.

Yes, this is definitely something to keep track of, so I'll keep this issue open for now. I would bump the priority if the SOURCE_MODEL column was important to a number of people.