Closed. fred3m closed this 2 years ago.
Sounds cool! Are there going to be unit tests for "lite" to assess the performance compared to "heavy" scarlet?
Yep, I ran it on all of the usual scarlet tests to compare the results. I can post some plots tomorrow.
Sounds good. However, am I right that the unit tests currently do not cover lite?
That's correct.
Couple more errors in displaying
    File "/home/esheldon/git/scarlet/scarlet/display.py", line 241, in show_observation
      center_ = observation.get_pixel(center)
    AttributeError: 'LiteObservation' object has no attribute 'get_pixel'
Also the `LiteBlend` is not iterable, so this example from the documentation no longer works:

    for k, src in enumerate(blend):
        if hasattr(src, "center"):
            y, x = src.center
            ax[0].text(x, y, k, color="w")
            ax[1].text(x, y, k, color="w")
            ax[2].text(x, y, k, color="w")
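One possible workaround (a sketch, not scarlet code: it uses `SimpleNamespace` stand-ins and assumes `LiteBlend` keeps its sources in a `sources` attribute, so the loop iterates that list instead of the blend itself; the `labels` list stands in for the `ax[i].text` calls):

```python
from types import SimpleNamespace

# Stand-ins for a LiteBlend and its sources (hypothetical shapes; the
# real classes live in scarlet.lite). The assumption is that the blend
# stores its sources in a `sources` attribute, so the old loop becomes
# `enumerate(blend.sources)` instead of `enumerate(blend)`.
blend = SimpleNamespace(
    sources=[SimpleNamespace(center=(12.0, 7.0)), SimpleNamespace()]
)

labels = []
for k, src in enumerate(blend.sources):
    if hasattr(src, "center"):
        y, x = src.center
        labels.append((x, y, k))  # would be ax[i].text(x, y, k, color="w")

print(labels)  # [(7.0, 12.0, 0)]
```

Sources without a `center` (like the second stand-in above) are simply skipped, matching the `hasattr` guard in the documented loop.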
I noticed that "skipped" is no longer returned, are sources no longer skipped?
Here is a comparison with and without `scarlet.lite.weight_sources`. It looks like the overall normalization is off.
Note these images are produced by metacalibration, so there is significant correlated noise.
e_rel = 1.0e-2:
No weight_sources [image]
With weight_sources [image]

Same but for tolerance e_rel = 1.0e-4:
No weight_sources [image]
With weight_sources [image]
> I noticed that "skipped" is no longer returned, are sources no longer skipped?
Lite sources have an `is_null` property, so `skipped` can easily be obtained by `[src for src in sources if src.is_null]`, so it didn't seem necessary to me to keep track of it like we did before. Note: the `skipped` return value was included because we didn't always have a `scarlet.NullSource`, so it was necessary to re-insert sources after deblending to keep source indices the same before and after running scarlet. We really could/should have removed it in scarlet main once `NullSource` was created.
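For example (with a minimal stand-in class, since only the `is_null` property matters here), the old `skipped` index list can be rebuilt like this:

```python
class FakeSource:
    """Minimal stand-in for a lite source: it is null when it has no
    components, which is what the `is_null` property reports."""
    def __init__(self, components=()):
        self.components = list(components)

    @property
    def is_null(self):
        return len(self.components) == 0

sources = [FakeSource(["c0"]), FakeSource(), FakeSource(["c1"])]

# The null sources themselves, as suggested above
null_sources = [src for src in sources if src.is_null]
# Or, to reproduce the old `skipped` return value (a list of indices)
skipped = [k for k, src in enumerate(sources) if src.is_null]
print(skipped)  # [1]
```

Because null sources stay in the source list, indices are preserved without any re-insertion step.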
Should I be ignoring sources that are `is_null`?
> Couple more errors in displaying
Thanks for pointing these out. As Peter said, this isn't meant as a drop-in replacement, and the more I think about it the more I feel like `scarlet.lite.display` should be used instead of trying to mangle the data types to work with display in main scarlet. So I'll probably re-introduce similar functions in `scarlet.lite.display` so that the easy solution will be to use `from scarlet.lite import display`, which should keep code calling the display functions valid.
@esheldon I'm curious about your residuals when weighting sources. Is this because of the correlated noise that is added? The weighted templates have zero residuals with the input (by design), so any bias in those "models" is due to improper normalization (zero point?) of the input data, right?
> Thanks for pointing these out. As Peter said, this isn't meant as a drop-in replacement and the more I think about it the more I feel like `scarlet.lite.display` should be used instead of trying to mangle the data types to work with display in main scarlet. So I'll probably re-introduce similar functions in `scarlet.lite.display` so that the easy solution will be to use `from scarlet.lite import display`, which should make code calling the display functions remain valid.
I see, thanks. In that case, I suppose some of the compatibility changes could be backed out if you like.
> @esheldon I'm curious about your residuals when weighting sources. Is this because of the correlated noise that is added? The weighted templates have zero residuals with the input (by design), so any bias in those "models" is due to improper normalization (zero point?) of the input data, right?
the input data have correlated noise, but zero background. How would the correlated noise cause an overall change in normalization?
> Should I be ignoring sources that are `is_null`?
In the sense that any source that `is_null` doesn't have any components (that's the definition of the `is_null` property), so it will return a zero model.
> @esheldon I'm curious about your residuals when weighting sources. Is this because of the correlated noise that is added? The weighted templates have zero residuals with the input (by design), so any bias in those "models" is due to improper normalization (zero point?) of the input data, right?
>
> the input data have correlated noise, but zero background. How would the correlated noise cause an overall change in normalization?
I'm not sure; I'm just trying to understand where your "residual" value comes from. When I calculate residuals I get all zeros, since the input data and the output model made by redistributing the flux of all of the sources are the same.
Here is an example without correlated noise that still shows the big residuals
No reweighting, e_rel = 1.0e-4 [image]
With reweighting, e_rel = 1.0e-4 [image]
It could be a bug in how I'm running the code. Here is what I'm doing for the measurement and visualization
https://gist.github.com/esheldon/2161c619f4a86b797408bada6222f320
Notice how the model has some low frequency "noise" in it for the reweighted case but not the regular case.
Views of the models with and without reweighting.
No reweighting [image]
With reweighting [image]
Note it looks like the `show_sources` code is not passing on `use_flux` when `show_rendered=True`. You can see this in the "Model Source 1 Rendered" panel above, which does not show the low-frequency noise.
Line 84:

    model_ = src.get_model(bbox=bbox)
Ohhhhhh, I see the problem. When you re-weight the images they are already convolved, so calling `observation.render(model)` performs an extra convolution, hence the change in sign. You should only render the model in the observed frame if flux is not used.
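To illustrate the pitfall (a toy sketch, not scarlet code: the model, kernel, and sizes here are made up), convolving an already-convolved image a second time broadens it, which is exactly the kind of low-frequency mismatch that shows up in the residuals:

```python
import numpy as np
from scipy.signal import fftconvolve

# Toy deconvolved "model" (a single point source) and a smoothing
# kernel standing in for the PSF that render would convolve with.
model = np.zeros((21, 21))
model[10, 10] = 1.0
h = np.hanning(7)
kernel = np.outer(h, h)
kernel /= kernel.sum()

rendered = fftconvolve(model, kernel, mode="same")  # correct: one convolution
double = fftconvolve(rendered, kernel, mode="same")  # bug: convolved twice

# Flux is conserved, but the doubly-convolved image is broader and has
# a lower peak, so model-minus-data residuals gain low-frequency structure.
print(rendered.max() > double.max())  # True
print(np.allclose(rendered.sum(), double.sum()))  # True
```

This is why the flux-weighted models, which already live in the observed frame, must not be passed through `observation.render` again.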
How should I be rendering the models?
Oh, I didn't realize that `get_model` returned a plain array. I guess you are saying I should not then call render.
And I see now why you prefer the name "convolve" to "render". Based on the name I assumed that the model had not been realized in an array before calling render.
`source.flux` is already rendered. To do the reweighting it convolves the model for each source and uses the ratio of the `source.get_model` templates to redistribute the flux in the full model, storing the result in `source.flux`. To do it all in one step you can change that line to `model = blend.get_model(use_flux=reweight, convolve=not reweight)`, which will perform a convolution if you are not re-distributing the flux and just return the re-weighted models if you are. With this replacement you can remove line 8 entirely.
> And I see now why you prefer the name "convolve" to "render". Based on the name I assumed that the model had not been realized in an array before calling render.
Yes, this is one of the big differences in scarlet lite, where rendering only involves convolving the model. In main scarlet, because the model can be much more complicated, involving different resolution, image orientations, and spectral components, rendering was designed to be a much more complicated feature.
Ok, I think that I addressed everything.
Here are my comments on the `multiscale_deblending` notebook:
`StarletConvolution`: I don't understand why you do this, but I recognise that it works really well. It seems to me that you convolve by an extra spline to smooth things out, but where does that come from and what am I missing?
Ok. I think I'll move the first example into the scarlet lite notebook and just remove this notebook from the scarlet docs. @pmelchior, is there anything else that you'd like to see me do before merging?
Let me go through your responses in code. I like many things in the PR, but I have two bigger conceptual problems with `lite`, which I'd like to address:

The separation between `lite` and `main` is not clear. The way it's done in this PR will merge a quite distinct branch into `main`, so that one could think of using `lite` classes in `main` methods (although they at least have names like `Lite...`), or, even worse, `main` classes in `lite` methods (because they don't carry `lite` in their names). The tutorial notebook that demonstrates the use of `lite` as part of the `main` docs makes this confusion even worse. Some detection methods and/or constraints could/should be used in `main`, but I can't tell which ones and how exactly.

> I wanted to justify the architecture of scarlet lite by demonstrating how easy it is to change parts of it, like the blend or source class, to quickly test new algorithms and ideas, which is something that takes a lot more work in standard scarlet.

That, to me, is a very different argument for `lite` than the one we started from, namely getting `autograd` out of the code and streamlining it for bulk processing. There's a lot of scope creep because of this decision. Moreover, the users who care about speed in bulk processing are not usually the ones who test new algorithms. In fact, I'd recommend that `main` remain more experimental (as autodiff makes it more capable in general) compared to `lite`.
Just to address your comments from Friday:

The `lite` notebook does use some methods (namely detection methods) that we don't use anywhere else, but those are from a completely separate effort than the lite effort, just improvements needed for other things that I'm working on. Including them in the lite notebook allows us to make sure that they don't bitrot.

Ideally I would like to see the architecture of scarlet match the lite branch, but I don't have the time to put in the effort required to do that. But the architecture needed to change anyway, since a lot of the design that went into scarlet main was to handle things like rendering models and matching observations in both spatial and spectral dimensions, which are both trivial in the lite module. I think that it's clear now that the main branch doesn't need anything from the lite branch; however, `lite` does make use of classes and functions from scarlet main.

I understand your argument about liking autograd to calculate analytic gradients, but TBH I'm not sure that it's necessary. We don't use any methods that we couldn't write gradients for ourselves, but if we really wanted to be lazy we could implement a `grad` method for a `LiteComponent` that imports autograd and automatically calculates the gradients for a class (I have a notebook somewhere that does that and it's fairly trivial to implement, and it also allows users to use pytorch or jax).
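As a sketch of that point (a toy check, not scarlet code: the loss and names here are illustrative), the weighted chi-squared gradient this kind of fitting needs is simple enough to write by hand and verify against finite differences, with no autodiff library at all:

```python
import numpy as np

def loss(model, data, weights):
    # Weighted chi^2, the kind of loss these factorized models minimize
    return 0.5 * np.sum(weights * (data - model) ** 2)

def grad_loss(model, data, weights):
    # Hand-written analytic gradient: no autograd required
    return -weights * (data - model)

rng = np.random.default_rng(1)
data = rng.normal(size=(8, 8))
model = rng.normal(size=(8, 8))
weights = rng.uniform(0.5, 2.0, size=(8, 8))

# Central finite-difference check at one pixel; the loss is quadratic,
# so the central difference should agree essentially exactly.
i, j, eps = 2, 3, 1e-6
bump = np.zeros_like(model)
bump[i, j] = eps
numeric = (loss(model + bump, data, weights)
           - loss(model - bump, data, weights)) / (2 * eps)

print(np.isclose(grad_loss(model, data, weights)[i, j], numeric))  # True
```

Swapping in autograd (or pytorch/jax) would only replace `grad_loss` with a machine-generated function of the same shape.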
To me the difference between `lite` and `main` is that `main` does not make the assumption that the input observations have been spectrally and spatially aligned prior to running scarlet, while `lite` does. In that sense we could rename the `scarlet.lite` branch to `scarlet.aligned` or `scarlet.reprojected` (to use the nomenclature common in astronomy) to make it clear where one branch should be used vs the other.
I'm OK with the changes and this PR overall. But the style of `lite`, and its restriction to the aligned case (that surveys generally have, at least for coadds), suggests to me that `lite` becomes its own package (`scarlet-lite` or so). Thinking this through, there are two options:

1. `scarlet-lite` imports, and thus depends on, `scarlet` for e.g. its initialization and constraint implementation.
2. `scarlet-lite` forks from the merged code after this PR and strips out everything it doesn't need to form a standalone implementation of the factorized model for aligned data.

From a maintenance perspective I'd prefer option 2, even if there's some code duplication. It also has the advantage that `scarlet` can carry out the planned transition to `pytorch` to streamline and speed up the multi-resolution operations. At this point, we'd have to maintain CPU and GPU code in one repo (e.g. for FFTs, padding, monotonicity, etc.), which doesn't sound like fun to me. A CPU-only `scarlet-lite` would be cleaner. And `lite` is likely to have a more stable API, which makes it easier for Rubin to work with it.
Creates an autograd-less version of scarlet, along with a choice of optimizer, new initialization schemes, and other utility functions, for use cases where the models are simple and the gradients are known.