Major updates on the image deconvolution modules

hiyoneda commented 3 weeks ago

I have updated the image deconvolution modules. It changes the overall structure of the image deconvolution, which is a major update.

The basic idea is to separate deconvolution algorithms from the actual data format and the definition of a model to be reconstructed. Now, the deconvolution algorithm, data, and model have their corresponding base classes (DeconvolutionAlgorithmBase, DataLoaderBase, and ModelBase, respectively), and actual implementation should be done by inheriting a corresponding base class. Then, the ImageDeconvoltuion class uses these subclasses and performs image deconvolution.

The advantage of these updates is that collaborators can focus on a key part of computations without considering the other classes. For example, if one wants to improve the computational speed by nicely handling the response function, one should only create a new class of the data loader and override methods related to the response handling, like expected count calculation. Each base class has methods that must be overridden in subclasses. Using only these methods, ImageDeconvoltuion can perform image deconvolution without knowing the details of the data itself.

This update also includes the following changes:

The image deconvolution can be performed using different datasets. For example, if the response function is changed for some reason, e.g., a part of the Ge detector becomes out of work during satellite operation, we need to use a different response function before/after such a change. Now, the code can handle multiple datasets and perform the image deconvoltion using them simultaneously.
The optimization of normalizations of multiple background components. The background models should be set to the dataloader as a Python dictionary. The RL algorithm can optimize their normalization by referring to keys in the background dictionary, even using multiple datasets.

Screenshot 2024-06-11 at 16 46 19 Screenshot 2024-06-11 at 16 47 01

hiyoneda commented 3 weeks ago

I am going to update the notebooks related to the image deconvolution.

hiyoneda commented 3 weeks ago

I've updated the notebooks. Also, in this PR, I replaced print with logger.info regarding the issue #48. Now, this PR is ready to be reviewed.

codecov[bot] commented 3 weeks ago

Codecov Report

Attention: Patch coverage is 25.69444% with 428 lines in your changes missing coverage. Please review.

Project coverage is 37.26%. Comparing base (65706f7) to head (e0571bb).

:exclamation: Current head e0571bb differs from pull request most recent head 13e13c8

Please upload reports for the commit 13e13c8 to get more accurate results.

Files	Coverage Δ
cosipy/image_deconvolution/__init__.py	`100.00% <100.00%> (ø)`
.../image_deconvolution/coordsys_conversion_matrix.py	`18.47% <ø> (+0.19%)`	:arrow_up:
cosipy/threeml/COSILike.py	`20.12% <50.00%> (ø)`
cosipy/image_deconvolution/model_base.py	`61.90% <61.90%> (ø)`
cosipy/image_deconvolution/data_loader_base.py	`57.50% <57.50%> (ø)`
cosipy/image_deconvolution/allskyimage.py	`36.73% <36.73%> (ø)`
cosipy/image_deconvolution/RichardsonLucySimple.py	`32.07% <32.07%> (ø)`
...mage_deconvolution/deconvolution_algorithm_base.py	`36.14% <31.57%> (+16.33%)`	:arrow_up:
cosipy/image_deconvolution/image_deconvolution.py	`28.57% <20.22%> (+0.40%)`	:arrow_up:
cosipy/image_deconvolution/RichardsonLucy.py	`16.12% <10.57%> (-2.48%)`	:arrow_down:
... and 1 more

israelmcmc commented 1 week ago

Thanks for working on this, @hiyoneda! I haven't gone through the code yet, just the description. Here are some initial thoughts and questions:

I like the idea of separating the different parts of the code. It will be very helpful both to optimize it and to add new features.
Instead of the exposure map, can we have the SpacecraftFile as an attribute of the base DataLoader class? It is more general, and the exposure map can be computed on the fly from there if needed. Some algorithms won't have exposure as a function of the model space --i.e. as a function of RA/Dec.
Note that in order for the codecov check to pass, you need to have at least 50% of the new code tested. It says "target 100%", but I currently set a 50% tolerance while we get used to it.

hiyoneda commented 1 week ago

Thanks! @israelmcmc

For the second point, I intended to use exposure_map as an intermediate matrix required by the RL algorithm. When we update the image, we need to calculate $\sum{i} R{ij}$ in the M-step of the RL algorithm. The exposure_map is that. I want to pre-calculate it because $\sum{i} R{ij}$ will be used every iteration, and calculating it is often computationally heavy. I agree that exposure_map can be calculated from SpacecraftFile and Response. So they might be attributes of a subclass, but I want exposure_map as the base class's attribute because it is mandatory in any case.

And I will also make unit_test to achieve the 50% goal. 50%, not 100%, is great (as for now).

hiyoneda commented 1 week ago

I attached figures to explain about the correspondence between the RL algorithm and the new functions.

Screenshot 2024-06-24 at 14 47 07

israelmcmc commented 1 week ago

Thanks for the figure, @hirokiyoneda. It's clearer now. On second thought, I agree it's fine to have the exposure_map in the base class. Most of the time we'll need it, and if for some reason you don't, we can replace it with a property that constructs and returns a vector on the fly.

hiyoneda commented 2 days ago

I renamed DataLoader to DataDeconvolutionAlgorithmInterface. The subclass will be named DataIF_AAA. For instance, one corresponding to DC2 is named as DataIF_COSI_DC2.

hiyoneda commented 2 days ago

I changed the class name to ImageDeconvolutionDataInterfaceBase. Sorry if it is confusing...

hiyoneda commented 1 day ago

@israelmcmc @ckarwin I am preparing unit_test for these new classes. I wanted to use data_pl.h5/bkg_pl.h5, but their bins differ from the response test_full_detector_response.h5. For example, their energy axes are: event : [ 100. 200. 500. 1000. 2000. 5000.] keV response : [ 150. 220. 325. 480. 520. 765. 1120. 1650. 2350. 3450. 5000.]

Can I replace data_pl.h5/bkg_pl.h5 with new ones with the same axes as the response?

israelmcmc commented 1 day ago

@hiyoneda That seems fine. As far as I can tell, none of the current tests or tutorials use that file.

cositools / cosipy

Major updates on the image deconvolution modules #188

Codecov Report