Build a Catalog class to handle everything related to catalog handling. The catalog will be handled through dask, which requires writing high-level methods so that it is easy to use and transparent for the user.
Using dask makes it possible to work with large catalogs transparently. dask also does more than catalog handling: computations on the catalog run either on a LocalCluster or on a user-defined cluster. Some computations are parallelized automatically. For more specific operations, it will be necessary to write wrappers around existing functions to take full advantage of dask.
The catalogs are read with the vaex library, which handles .fits files. vaex evaluates expressions lazily and out-of-core, and its columns can be exported as dask arrays, which makes this choice straightforward. vaex can also open multiple catalogs at once as long as they share the same format, and we will make use of this feature.
Note: with dask, all computations are "lazy". Nothing is actually computed until you call the .compute() method; until then, only the task graph of the computation is built.
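To illustrate the lazy behaviour described in the note, here is a minimal sketch using dask.array. The column `mag` and its values are made up for the example and do not come from any real catalog.

```python
import dask.array as da

# Hypothetical catalog column: 1000 values split into 4 chunks of 250.
mag = da.arange(1000, chunks=250)

# This only builds the task graph; no value is computed here.
total = (mag ** 2).sum()

# The computation actually runs when .compute() is called.
result = total.compute()
```

Until `.compute()` is called, `total` is a dask array object describing the work to do, not a number.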
[x] Read catalog using vaex
[x] Allow conversion to .hdf5 format to enable memory mapping (not possible with the .fits format)
[x] Store the converted catalog in the workspace directory
[ ] Make it possible to remove the converted file to save space on disk
[x] Keep track of the catalog of origin for each object
[x] Read catalog from config file
[ ] Make it possible to instantiate a catalog from a path or a list of paths
[x] Make a parent class
[x] Build a child class for galaxy catalogs
[x] Build a child class for star catalogs
[ ] Handle star catalogs from MCCD (-> convert global positions to local, maybe this should be done in MCCD)
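The class layout in the list above could be sketched roughly as follows. This is only an illustration in plain Python: the names `GalaxyCatalog`, `StarCatalog`, `origin_of` and `converted_path` are hypothetical, and the actual reading with vaex and computation with dask are left out.

```python
from pathlib import Path
from typing import Iterable, List, Union

PathLike = Union[str, Path]


class Catalog:
    """Hypothetical parent class holding what all catalog types share."""

    kind = "generic"

    def __init__(self, paths: Union[PathLike, Iterable[PathLike]]):
        # Accept either a single path or a list of paths.
        if isinstance(paths, (str, Path)):
            paths = [paths]
        self.paths: List[Path] = [Path(p) for p in paths]
        if not self.paths:
            raise ValueError("at least one catalog path is required")

    def origin_of(self, file_index: int) -> str:
        """Return the name of the input file with the given index."""
        return self.paths[file_index].name

    def converted_path(self, workspace: PathLike, file_index: int) -> Path:
        """Return where the .hdf5 copy would live in the workspace directory."""
        hdf5_name = self.paths[file_index].with_suffix(".hdf5").name
        return Path(workspace) / hdf5_name


class GalaxyCatalog(Catalog):
    """Hypothetical child class for galaxy catalogs."""

    kind = "galaxy"


class StarCatalog(Catalog):
    """Hypothetical child class for star catalogs."""

    kind = "star"


# Instantiate from a list of paths or from a single path.
stars = StarCatalog(["tile_1.fits", "tile_2.fits"])
galaxies = GalaxyCatalog("field.fits")
```

Keeping the list of input paths on the parent class is one simple way to record which file each object came from, since a per-object file index can be mapped back to a file name.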