Adds a new method read_col_conversion_dask that allows larger-than-memory columns to be converted. Other changes:
xarray Dataset encoding has been cleaned up and now skips DataArrays that are backed by dask arrays
lofar and lofar_read_size arguments added to convert_msv2_to_processing_set
A TableManager class has been added so that multi-thread/multi-process conversion can happen without having to serialize casacore table objects. It replaces open_table_ro and open_query in convert_and_write_partition
read_col_conversion_dask uses dask's map_blocks to create one task per chunk of a DataArray; each task reads its data from an MSv2 column and reshapes it
This has been used to convert 9 TB of LOFAR data in ~4.5 hours, which was previously impossible without a compute node with more than 9 TB of memory
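
For reviewers unfamiliar with the pattern, here is a minimal sketch of the map_blocks idea described above. It is not the code in this PR. LazyColumnReader, read_time_block and lazy_column are hypothetical names, a .npy file stands in for an MSv2 column, and rows are assumed to be time-major with every baseline present at every time step. The reader object holds only a path and opens the data inside each task, which is roughly the role TableManager plays for casacore tables.

```python
import os
import tempfile

import dask.array as da
import numpy as np


class LazyColumnReader:
    """Hypothetical stand-in for a table manager.

    It stores only a path and opens the data inside each task.
    A real version would open a casacore table instead of a .npy file.
    """

    def __init__(self, path):
        self.path = path

    def read_rows(self, start_row, n_rows):
        # Stand-in for a column read restricted to a row range.
        column = np.load(self.path, mmap_mode="r")
        return np.array(column[start_row:start_row + n_rows])


def read_time_block(reader, n_baseline, block_info=None):
    # dask fills in block_info; "array-location" holds this block's
    # (start, stop) range along each output dimension.
    t0, t1 = block_info[None]["array-location"][0]
    rows = reader.read_rows(t0 * n_baseline, (t1 - t0) * n_baseline)
    # Flat (row, chan) -> (time, baseline, chan).
    return rows.reshape(t1 - t0, n_baseline, rows.shape[-1])


def lazy_column(reader, n_time, n_baseline, n_chan, time_chunk, dtype):
    # Explicit chunk sizes along time; the last chunk may be shorter.
    full, rest = divmod(n_time, time_chunk)
    time_chunks = (time_chunk,) * full + ((rest,) if rest else ())
    return da.map_blocks(
        read_time_block,
        reader=reader,
        n_baseline=n_baseline,
        chunks=(time_chunks, (n_baseline,), (n_chan,)),
        dtype=dtype,
        # Passing meta stops dask from calling the function to infer it.
        meta=np.empty((0, 0, 0), dtype=dtype),
    )


# Tiny fake column: 5 time steps x 3 baselines stored as 15 flat rows.
n_time, n_baseline, n_chan = 5, 3, 4
flat = np.arange(n_time * n_baseline * n_chan, dtype=np.float64).reshape(
    n_time * n_baseline, n_chan
)
path = os.path.join(tempfile.mkdtemp(), "column.npy")
np.save(path, flat)

lazy = lazy_column(
    LazyColumnReader(path), n_time, n_baseline, n_chan,
    time_chunk=2, dtype=flat.dtype,
)
result = lazy.compute()  # rows are read one time chunk at a time
```

Nothing is read until compute() runs (or the array is written out chunk by chunk), so peak memory is bounded by the chunk size rather than by the size of the column.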