Autostronomy / AstroPhot

A fast, flexible, automated, and differentiable astronomical image 2D forward modelling tool for precise parallel multi-wavelength photometry
https://astrophot.readthedocs.io
GNU General Public License v3.0
79 stars 9 forks source link

Potential for confusion in node['flux'] #197

Open entaylor opened 1 month ago

entaylor commented 1 month ago

Writing this at the same time as Issue #196

Please can i check that what is extracted as node['flux'].value is in fact log(flux), and not flux? I think this is conveyed by the units, but not the name.

My suggestion is either to parameterise things in terms of flux and not logflux, or to rename 'flux' as 'logflux' to align with its true meaning.

For my own use cases, i think parameterising in terms of flux makes more sense than parameterising as logflux. While a 'true' value for flux is strictly positive, in the presence of errors a negative flux is entirely reasonable, and carries real information: a -1 or -2 or -3 sigma 'measurement' each has different implications for, say, SED modelling, in terms of what models they permit/exclude. I do see why logflux makes some sense in view of Poisson noise and the connection to astronomical magnitudes, but magnitudes are stupid anyway. : ) More than anything, i see the potential for this being a very large and painful change to the structure of the code.

Similarly, i see the potential for high costs to refactor the name of so critical a variable, but ... well, i hope you see the danger that i am trying to point to!

Thanks again!

ConnorStoneAstro commented 1 month ago

This is a fair point, and something that I have personally gone back and forth on a few times while developing the code. Ultimately, I went with log flux for two main reasons. One, I'm a Bayesian and my prior says that stars/galaxies can't have negative flux, so this is an extra constraint I can impose which is physically motivated. When working with very faint objects, working in log flux means that I can get a very small flux without worrying about it accidentally going negative. I think if your model needs a negative flux star, then probably there is something else wrong in the model, for example the sky level is too high and that needs to be fixed. Two, it's more numerically stable for astro objects which have such a large dynamic range, in the same image there can be objects with many orders of magnitude different total flux, but in log scale that's a relatively easy to represent.

I like your idea of renaming it logflux since that better represents what the parameter is. In fact the names of flux-related parameters has been a pain point for a lot of people. For example the sersic Ie parameter, which is actually log10(flux/arcsec^2) so one has to do pixelscale^2 * 10**(Ie) to get the real Ie that would go into a sersic profile function. I actually made a function model.total_flux() that can be called on any model which was intended to help bridge that annoying gap. But I don't think I've done a very good job in the tutorials of explaining it, so maybe I need a page on the website to explain this.

Renaming variables is relatively easy in the code. The main drawback is that people have already written their scripts using the current name, so I need to be careful about changing things unexpectedly. Still, I like the logflux idea, so perhaps I'll change that and a few other parameter names. I'd appreciate any other name confusions you've run into, maybe I'll collect a bunch of these and put them into a single backwards comparability breaking update.

entaylor commented 1 month ago

This actually connects weakly to the issue that posted today, but ...

... is there a way that you could specify the choice of prior?

I get your Bayesian point about the flux, but if you are interested in, say, stacking or SED modelling, then it is advantageous and Bayesian to allow for negative values for the observed flux.

Let me think about stacking first. Say i want to interrogate the FIR or radio properties of a sample of galaxies. Maybe from the optical or infrared i have a good prior for the shape of the underlying distribution. Knowing that most of my targets are weakly or undetected in the band of interest, i want to use the position and shape priors from a deeper band to get a(n biased) flux estimate for each object in the ensemble. Because the noise can be positive or negative, i know the observed value can be positive or negative, even if the true underlying value must be strictly positive. If i enforce a strictly positive prior then i know that i am much more likely to overestimate the flux for an object that has an undectectable amount of flux, and so i know that i will get a biased estimate of the mean when averaging over my larger sample.

The issue is similar but slightly different if i am SED modelling. If i am SED modelling, then the object of interest is the difference between a predicted and an observed SED. It is convenient to assume that those differences (let me use the word residuals) are normally, or at least symmetrically, distributed. If the PDF on the measured fluxes is bounded, then the statistics are much more expensive, if not impossible, to compute.

If we are already talking about a backwards incompatible refactor, then maybe its a good opportunity to allow various choices of prior on different parameters. This actually makes very good sense for size and shape parameters, where the Jeffreys prior is definitely non-linear. E.g. a log prior on sigma or a linear prior on ln(sigma) is actually 'the right' prior to use, according to Jeffreys. And i think the same for size?

Thinking further, arguably the 'right' prior for flux is something like arcsinh, which is ~linear near zero (where pixel noise dominates) and ~logarithmic far from zero (where shot noise and/or catastrophic errors dominate).

So maybe the proposal would be something like, when defining a model, to specify priors for parameters in a form like:

model = ap.models.AstroPhot_Model(
    name="constrained parameters",
    model_type="sersic galaxy model",
    parameters={
        "q": {"value": 0.8, "limits": [0.4, 0.6], prior = "cos"},
        "n": {"value":4, "limits": [0.5, 8], prior = 'log'},
        "Rd":{"value":5, "limits":[0, 10], prior = 'normal},
        "flux":{"value":100, prior="arcsinh"},
    },

Though i have no idea how easily one can incorporate priors into the pytorch framework!

ConnorStoneAstro commented 1 month ago

@entaylor oh I really like the idea of specifying the way that the parameters are handled (log, normal, arcsinh, etc) in the parameter specification! I'll have to think very carefully about how it is implemented and how realistic it is, but I think something at least generally along those lines is very doable :)

In the meantime, if you are stuck on a particular problem, we could just make a new class which treats the flux parameter linearly. If you make a class and have it inherit from one of the AstroPhot models then it will automatically get picked up by the system and you can use it just like any other.