pmelchior / scarlet

hyperspectral galaxy modeling and deblending
MIT License
50 stars 22 forks source link

Lite #257

Closed fred3m closed 2 years ago

fred3m commented 2 years ago

Creates an autogradless version of scarlet along with a choice of optimizer, new initialization schemes, and other utility functions for simple use cases when the models are simple and the gradients are known.

esheldon commented 2 years ago

Sounds cool! Are there going to be unit tests for "lite" to assess the performance compared to "heavy" scarlet?

fred3m commented 2 years ago

Yep, I ran it on all of the usual scarlet tests to compare the results. I can post some plots tomorrow.

esheldon commented 2 years ago

Sounds good. However, am I right that the unit tests currently do not cover lite?

fred3m commented 2 years ago

That's correct.

esheldon commented 2 years ago

Couple more errors in displaying

  File "/home/esheldon/git/scarlet/scarlet/display.py", line 241, in show_observation
    center_ = observation.get_pixel(center)
AttributeError: 'LiteObservation' object has no attribute 'get_pixel'

Also the LiteBlend is not iterable, so this example from the documentation no longer works

    for k, src in enumerate(blend):
        if hasattr(src, "center"):
            y, x = src.center
            ax[0].text(x, y, k, color="w")
            ax[1].text(x, y, k, color="w")
            ax[2].text(x, y, k, color="w")
esheldon commented 2 years ago

I noticed that "skipped" is no longer returned, are sources no longer skipped?

esheldon commented 2 years ago

Here is a comparison with and without scarlet.lite.weight_sources. It looks like the overall normalization is off

Note these images are produced by metacalibration, so there is significant correlated noise.

e_rel = 1.0e-2 No weight_sources tol2 With weight_sources tol2-reweight

Same but for tolerance e_rel=1.0e-4 No weight sources tol4 With weight_sources tol4-reweight

fred3m commented 2 years ago

I noticed that "skipped" is no longer returned, are sources no longer skipped?

Lite sources have an is_null property, so skipped can easily be obtained by [src for src in sources if src.is_null], so it didn't seem necessary to me to keep track of it like we did before. Note: the skipped return value was included because we didn't always have a scarlet.NullSource, so it was necessary to re-insert sources after deblending to keep source indices the same before and after running scarlet. We really could/should have removed it in scarlet main once NullSource was created.

esheldon commented 2 years ago

Should I be ignoring sources that are is_null ?

fred3m commented 2 years ago

Couple more errors in displaying

Thanks for pointing these out. As Peter said, this isn't meant as a drop in replacement and the more I think about it the more I feel like scarlet.lite.display should be used instead of trying to mangle the data types to work with display in main scarlet. So I'll probably re-introduce similar functions in scarlet.lite.display so that the easy solution will be to use from scarlet.lite import display, which should make code calling the display functions remain valid.

fred3m commented 2 years ago

@esheldon I'm curious about your residuals when weighting sources. Is this because of the correlated noise that is added? The weighted templates have zero residuals with the input (by design), so any bias in those "models" is due to improper normalization (zero point?) of the input data, right?

esheldon commented 2 years ago

Thanks for pointing these out. As Peter said, this isn't meant as a drop in replacement and the more I think about it the more I feel like scarlet.lite.display should be used instead of trying to mangle the data types to work with display in main scarlet. So I'll probably re-introduce similar functions in scarlet.lite.display so that the easy solution will be to use from scarlet.lite import display, which should make code calling the display functions remain valid.

I see, thanks. In that case, I suppose some of the compatibility changes could be backed out if you like.

esheldon commented 2 years ago

@esheldon I'm curious about your residuals when weighting sources. Is this because of the correlated noise that is added? The weighted templates have zero residuals with the input (by design), so any bias in those "models" is due to improper normalization (zero point?) of the input data, right?

the input data have correlated noise, but zero background. How would the correlated noise cause an overall change in normalization?

fred3m commented 2 years ago

Should I be ignoring sources that are is_null ?

In the sense that any source that is_null doesn't have any components (that's the definition of the is_null property) so it will return a zero model.

fred3m commented 2 years ago

@esheldon I'm curious about your residuals when weighting sources. Is this because of the correlated noise that is added? The weighted templates have zero residuals with the input (by design), so any bias in those "models" is due to improper normalization (zero point?) of the input data, right?

the input data have correlated noise, but zero background. How would the correlated noise cause an overall change in normalization?

I'm not sure, I'm just trying to understand where your "residual" value comes from. When I calculate residuals I get all zeros, since the input data and output model with all of the sources made by redistributing the flux are the same.

esheldon commented 2 years ago

Here is an example without correlated noise that still shows the big residuals

No reweighting, e_rel=1.0e-4 tol4

With reweighting, e_rel=1.0e-4 tol4-reweight

esheldon commented 2 years ago

It could be a bug in how I'm running the code. Here is what I'm doing for the measurement and visualization

https://gist.github.com/esheldon/2161c619f4a86b797408bada6222f320

esheldon commented 2 years ago

Notice how the model has some low frequency "noise" in it for the reweighted case but not the regular case.

esheldon commented 2 years ago

Views of the models with and without reweighting.

No reweighting

tol4-model

With reweighting

tol4-model-reweight

esheldon commented 2 years ago

Note it looks like the show_sources code is not passing on use_flux when show_rendered=True. You can see this in the "Model Source 1 Rendered" panel above which does not show the low frequency noise

line 84 model_ = src.get_model(bbox=bbox)

fred3m commented 2 years ago

Ohhhhhh, I see the problem. When you re-weight the images they are already convolved, so the extra convolution that you do (observation.render(model)) performs an extra convolution, hence the change in sign. You should only render the model in the observed frame if flux is not used.

esheldon commented 2 years ago

How should I be rendering the models?

esheldon commented 2 years ago

Oh, I didn't realize that get_model returned a plain array. I guess you are saying I should not then call render.

And I see now why you prefer the name "convolve" to "render". Based on the name I assumed that the model had not been realized in an array before calling render.

fred3m commented 2 years ago

source.flux is already rendered. To do the reweighting it convolves the model for each source and uses the ratio of source.get_model templates for each source to redistribute the flux in the full model, storing that in source.flux. To do it all in one step you can change line model = blend.get_model(use_flux=reweight, convolve=not reweight), which will perform a convolution if not re-distributing the flux and just return the re-weighted models if you are. With this replacement you can just remove line 8 entirely.

fred3m commented 2 years ago

And I see now why you prefer the name "convolve" to "render". Based on the name I assumed that the model had not been realized in an array before calling render.

Yes, this is one of the big differences in scarlet lite, where rendering only involves convolving the model. In main scarlet, because the model can be much more complicated, involving different resolution, image orientations, and spectral components, rendering was designed to be a much more complicated feature.

fred3m commented 2 years ago

Ok, I think that I addressed everything.

herjy commented 2 years ago

Here are my comments on the multiscale_deblending notebook:

fred3m commented 2 years ago
herjy commented 2 years ago
fred3m commented 2 years ago

Ok. I think I'll move the first example into the scarlet lite notebook and just remove the this notebook from the scarlet docs. @pmelchior, is there anything else that you'd like to see me do before merging?

pmelchior commented 2 years ago

Let' me go through your responses in code. I like many things in the PR, but I have two bigger conceptual problems with lite, which I like to address:

  1. The relation between lite and main is not clear. The way it's done in this PR will merge a quite distinct branch into main, so that one could think of using lite classes in main methods (although they at least have names like Lite...), or even worse main classes in lite methods (because they don't carry lite in their names). The tutorial notebook that demonstrates the use of lite as part of the main docs makes this confusion even worse. Some detection methods and or constraints could/should be used in main, but I can't tell which ones and how exactly.
  2. From your earlier comment:

    I wanted to justify the architecture of scarlet lite by demonstrating how easy it is to change parts of it, like the blend or source class, to quickly test new algorithms and ideas, which is something that takes a lot more work in standard scarlet.

That, to me, is a very different argument for lite than the one we started from, namely getting autograd out of the code and streamline it for bulk processing. There's a lot of scope creep because of this decision. Moreover, the users who care about speed in bulk processing are not usually the ones who test new algorithms. In fact, I'd recommend that main remains more experimental (as autodiff makes it more capable in general) compared to lite.

fred3m commented 2 years ago

Just to address your comments from Friday:

  1. There were very few changes to scarlet main in this PR. The lite notebook does use some methods (namely detection methods) that we don't use anywhere else, but those are from a completely separate effort than the lite effort, just improvements needed for other things that I'm working on. Including them in the lite notebook allows us to make sure that they don't bitrot.
  2. That was always part of my desire to move to lite. In addition to the runtime and memory issues with autograd, the code in the main branch uses a very different architecture that has made testing and developing new features cumbersome. The main problem for me is that the class hierarchy is basically all designed for initialization, since once the models are initialized nearly all of the source classes perform exactly the same. So while stripping out autograd I kept in mind all of the things that I have on my future plate to test: different optimization algorithms, wavelet updates, hierarchical cataloging, etc. Because of the way that we have organized the code, each one of those would require a completely new "branch" of scarlet, with custom classes from the top to bottom in order to take advantage of our current initialization.

Ideally I would like to see the architecture of scarlet match the lite branch, but I don't have the time to put in the effort required to do that. But the architecture needed to change anyway, since a lot of the design that went into scarlet main was to handle things like rendering models and matching observations in both spatial and spectral dimensions, which are both trivial in the lite module. I think that it's clear now that the main branch doesn't need anything from the lite branch, however lite does make use of classes and functions from scarlet main. I understand your argument about liking autograd to calculate analytic gradients but TBH I'm not sure that it's necessary. We don't use any methods that we couldn't write gradients for ourselves, but if we really wanted to be lazy we could implement a grad method for a LiteComponent that imports autograd and automatically calculates the gradients for a class (I have a notebook somewhere that does that and it's fairly trivial to implement, and also allows users to use pytorch or jax).

To me the difference between lite and main is that main does not make the assumption that the input observations have been spectrally and spatially aligned prior to running scarlet while lite does. In that sense we could rename scarlet.lite branch to scarlet.aligned or scarlet.reprojected (to use the nomenclature common in astronomy) to make it clear where one branch should be used vs the other.

pmelchior commented 2 years ago

I'm OK with the changes and this PR overall. But the style of lite, and its restriction to the aligned case (that surveys generally have at least for coadds) suggests to me that lite becomes its own package (scarlet-lite or so). Thinking this through, there are two options:

  1. scarlet-lite imports, and thus depends on, scarlet for e.g. its initialization and constraint implementation
  2. scarlet-lite forks from the merged code after this PR and strips out everything it doesn't need to form a standalone implementation of the factorized model for aligned data.

From a maintenance perspective I'd prefer option 2, even if there's some code duplication. It also has the advantage that scarlet can carry out the planned transition to pytorch to streamline and speed up the multi-resolution operations. At this point, we'd have to maintain CPU and GPU code in one repo (e.g. for FFTs, padding, monotonicity, etc), which doesn't sound like fun to me. A CPU-only scarlet-lite would be cleaner. And lite is likely to have a more stable API, which makes it easier for Rubin to work with it.