JLBLine / WODEN

CUDA code designed to simulate interferometric data using point, Gaussian, and shapelet sky models
https://woden.readthedocs.io/en/latest/
Mozilla Public License 2.0
5 stars 0 forks source link

2. Read polarisation info from FITS into `wodenpy` #57

Closed JLBLine closed 1 month ago

JLBLine commented 1 month ago

Sigh. Unless stated, all references to Python code in this section are in wodenpy.skymodel. Makes things easier to read.

The major order of sky model things that happens in run_woden.py is as follows:

  1. Use read_skymodel.read_radec_count_components to count how many components of each type we have in the sky model, as well as the RA/Dec positions, by creating and filling a woden_skymodel.Component_Type_Counter object
  2. Use woden_skymodel.crop_below_horizon to crop out stuff below the horizon; results are fed into the Component_Type_Counter
  3. Feed the Component_Type_Counter into chunk_sky_model.create_skymodel_chunk_map, which returns a list of Components_Map objects. Each of these maps contains all the indexes of components from the sky model to make a chunk that will fit into GPU memory
  4. This list of Components_Map objects is then fed into the run_woden.read_skymodel_thread thread, which reads in a number of chunks of the sky model, sends it to the GPU (which will iterate over however many chunks were sent across), and then reads the next set of chunks while the GPU is running the last set.
  5. run_woden.read_skymodel_thread calls read_skymodel.read_skymodel_chunks to read in the sky model chunks. This in turn calls read_fits_skymodel.read_fits_skymodel_chunks, which sets up a wodenpy.use_libwoden.skymodel_struct.Source_Catalogue_Double/Float class. This is filled with a "Source" for every "chunk", and all the "components" associated with that chunk are populated into that "Source". This Source_Catalogue_Double/Float is a ctypes object that is passed directly to the GPU code.

The difficult part here is the "chunking" of the sky model, which not only splits the sky model up into smaller chunks for the GPU to handle, but also reorders by POINT/GAUSSIAN/SHAPELET, as well as by Stokes I power-law/curved power-law/list-type. We'll need to setup ways to track linear and circular polarisation along with the power-law/curved power-law/list-type. That means editing the things listed below.

At the end result that goes into the GPU code, we have the wodenpy.use_libwoden.skymodel_struct.Components_Double/Float class, which are attributes of the wodenpy.use_libwoden.skymodel_struct.Source_Double/Float class (three of them, one each for POINT, GAUSSIAN, and SHAPELET components). Components_Double/Float class contains power_comp_inds curve_comp_inds list_comp_inds arrays, which index the flux model info to the extrap_stokesI (and RA/DEC/Beam) information. I think we need to add the following arrays to the Components_Double/Float class:

Maybe rename the assumed Stokes I arrays to be clear:

At the same time, we should delete:

as these are wrong and should never have existed. I imagine we'll need to add something like

as we'll need those to tell the GPU which kernels to launch.

To be able to populate these new Components_Double arrays, we need to edit the following functions/classes, for each step listed above.

read_fits_sky_model.read_fits_radec_count_components counts how much of everything we have in the sky model. Bear minimum this needs to check for how many linear and/or circular polarisation components we have. We'll need to add some attributes to the wodenpy_skymodel.Component_Type_Counter class to track these. Now, Component_Type_Counter is setup to still be counting things iteratively, as if it was reading things in from a yaml or text file. I could take this opportunity to strip out this code and stream line, as I convert yaml into FITS internally at the moment (which is hella inefficient, should maybe change that also).

Will need to add something like the below, as we have to separate out the POINT,GAUSSIAN,SHAPELET types:

and a number of associated edits to get read_fits_radec_count_components to populate these new arrays. Once again, we should probably rename the assumed Stokes I arrays to be clear:

The when we pass that to create_skymodel_chunk_map, we need to add the new arrays to the Components_Map class, which is populated from the Component_Type_Counter class. We'll need something like:

and should probably rename the assumed Stokes I arrays to be clear:

Right, so create_skymodel_chunk_map calls map_chunk_pointgauss and map_chunk_shapelets. The former calls in turn call increment_flux_type_counters, which does some insane logic to split things by flux type. map_chunk_shapelets is different, as it splits things by shapelet coefficient rather than flux type.

This is where it gets tricky. Every polarisation component is associated with some Stokes I component, so we don't need to add any extra logic to split things by polarisation model type. What we do need is some logic in chunk_sky_model.fill_chunk_component, that matches any polarisation information we have with the relevant component. Gonna need some serious logic and tested there, with many new variables and bits of np.where methinks. But this is where we populate the new arrays in the Components_Map class.

Once that crucial bit of mapping logic is correct, we then need to edit read_fits_skymodel.add_fits_info_to_source_catalogue to use the new arrays in the Components_Map to correctly populate the new arrays in Components_Double/Float, the things that get passed to the GPU code.

Testing need to update cmake_testing/wodenpy/skymodel/test_read_FITS_skymodel_chunk.py to check all these news values are read in, and update cmake_testing/wodenpy/skymodel/common_skymodel_test.py and cmake_testing/wodenpy/skymodel/read_skymodel_common.py with new polarisation options. Should probably write a new test test_read_yaml_radec_count_components.py as we don't explicitly test that at the moment (it's tested within test_read_FITS_skymodel_chunk.py). Need to make sure a variety of different

JLBLine commented 1 month ago

The part above that says _"Now, Component_TypeCounter is setup to still be counting things iteratively, as if it was reading things in from a yaml or text file. I could take this opportunity to strip out this code and stream line." is wrong. I still use those text file style things in read_yaml_radec_count_components, so leave that code in there for now

JLBLine commented 1 month ago

In ddf2b32ec903a0a4b395096534fe6b2fa9619c42 I have modified read_fits_radec_count_components to count and index polarisation info from the FITS table

JLBLine commented 1 month ago

In 2cb43ec059aa9078ab5a82a6500f855bf13461f9 enabled create_skymodel_chunk_map to handle polarisation information. Tested by updated and adding to tests in cmake_testing/wodenpy/skymodel

JLBLine commented 1 month ago

OK, read_skymodel_chunk can read polarisation info from a FITS file as of c8688f8f30b01c75ba08e8548b6f4f457e3d35fd. Tests have been embellished/updated in cmake_testing/wodenpy/skymodel