b-remy opened this issue 3 years ago
This is super cool! Hmm, maybe to start we can fix a few things, like the intensity, but let the morphology (i.e. size and Sersic index) vary?
For the PSF model, we can use a fixed COSMOS PSF for instance. You can extract it from GalSim by doing:
```python
cat = galsim.COSMOSCatalog()
psf = cat.makeGalaxy(2, gal_type='real', noise_pad_size=0).original_psf
```
This gives you a GalSim object representing the PSF, which you can then draw on a stamp. For the noise, you can use a constant noise level for now, with sigma=0.003; I think this is more or less the level of noise in COSMOS images... if I recall correctly...
Ah also, for reference, the COSMOS pixel size is 0.03 arcsec.
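Putting those pieces together, a minimal sketch of the recipe above (the galaxy index and stamp size are arbitrary placeholders):

```python
import galsim

# Fixed COSMOS PSF, 0.03 arcsec pixels, constant Gaussian noise sigma=0.003
cat = galsim.COSMOSCatalog()
psf = cat.makeGalaxy(2, gal_type='real', noise_pad_size=0).original_psf
gal = cat.makeGalaxy(2, gal_type='parametric')
image = galsim.Convolve(gal, psf).drawImage(nx=64, ny=64, scale=0.03)
image.addNoise(galsim.GaussianNoise(sigma=0.003))
```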
Did you know that you can easily check how GalSim draws a given parametric profile?
```python
import galsim
import matplotlib.pyplot as plt

cat = galsim.COSMOSCatalog()
# Getting the galaxies
gal_param = cat.makeGalaxy(2, gal_type='parametric')
gal_real = cat.makeGalaxy(2, gal_type='real')
# Convolving them by the PSF
g_param = galsim.Convolve(gal_param, gal_real.original_psf)
g_real = galsim.Convolve(gal_real, gal_real.original_psf)
# Plotting
plt.subplot(121)
plt.imshow(g_param.drawImage(nx=64, ny=64, scale=0.03).array)
plt.subplot(122)
plt.imshow(g_real.drawImage(nx=64, ny=64, scale=0.03).array)
```
Thanks @EiffL for the advice!
Just to make sure about the level of noise in COSMOS images, here are the mean and std pixel values in images that do not contain galaxies (from the real_galaxy_catalog_25.2.fits catalog): the mean standard deviation is 0.0029.
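Something along these lines (a sketch; here I use the stamp borders as a proxy for galaxy-free pixels, which differs slightly from selecting fully empty images, and the catalog files must be available locally):

```python
import numpy as np
import galsim

# Estimate the COSMOS noise level from border pixels of real stamps,
# which are mostly empty sky.
rgc = galsim.RealGalaxyCatalog('real_galaxy_catalog_25.2.fits')
stds = []
for i in range(100):
    im = rgc.getGalImage(i).array
    border = np.concatenate([im[0], im[-1], im[:, 0], im[:, -1]])
    stds.append(border.std())
print('mean std:', np.mean(stds))  # ~0.003 expected
```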
Ah, I remembered correctly ^^' Right, so this is a bit hand-wavy, but probably good enough for now: we can use sigma=0.003. In real life, to determine the noise level you need to know the sky brightness, the instrument size, and the exposure time.
I started to build a toy model according to the description above in toymodel1.py.
However, I am wondering if there is something wrong when I look at the simulations:
The galaxies are not very visible, as if the noise was too strong...
The Edward2 settings are (assuming `_log10` is ln(10), so that `exp(log_l * _log10) = 10 ** log_l`):

```python
import edward2 as ed
import tensorflow as tf

_log10 = tf.math.log(10.)  # priors were fitted in log10 space

# Constant Gaussian pixel noise (nx, ny: stamp dimensions)
sigma_e = 0.003
noise = ed.Normal(loc=tf.zeros((nx, ny)), scale=sigma_e)
# Prior on the Sersic index n
log_l_n = ed.Normal(loc=.1, scale=.39)
n = tf.math.exp(log_l_n * _log10)
# Prior on the Sersic half-light radius
log_l_hlr = ed.Normal(loc=-.68, scale=.3)
hlr = tf.math.exp(log_l_hlr * _log10)
```
Very, very cool! But OK, I think that makes sense, and it's probably my bad: at fixed total flux, if the size of the galaxy increases, the surface brightness drops and the image becomes dimmer.
No, my bad actually ^^', I forgot to specify the flux (which is 1. by default)... the correct plot is:
Since I used a constant flux, we can still observe what you mentioned, i.e. increasing the size reduces the intensity.
With shear, PSF and noise, it gives:

I used `arcsinh(x/sigma) * sigma` to compress the dynamic range and help the visualisation, and a fixed COSMOS PSF as you proposed.
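For reference, the stretch is simply:

```python
import numpy as np

# arcsinh stretch: linear for |x| << sigma, logarithmic for |x| >> sigma,
# so bright cores saturate while faint outskirts stay visible.
def stretch(x, sigma=0.003):
    return np.arcsinh(x / sigma) * sigma
```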
So far, the model applies the following steps in order:
oh wow, you're killing it! It's looking great :-) When you say shear here, is it gravitational shear or intrinsic galaxy ellipticity?
I actually wanted to apply gravitational shear, but I didn't think a lot about it. Please tell me what you think about this:
After creating the profile, I can add a shearing step to simulate intrinsic galaxy ellipticity, sampling e1, e2 from N(0, 0.28) (as estimated in the notebook above).
Then, for the actual shear, should I apply the same value to all the galaxies, assuming that they all belong to the same bin? Since I have images of 5x5 stamps, I can estimate the shear as the average of 25 ellipticities; from the COSMOS catalog, such averages follow N(0., 0.05).
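Something like this (a GalSim sketch of what I mean; the profile parameters are placeholders):

```python
import numpy as np
import galsim

rng = np.random.default_rng(0)
g1, g2 = rng.normal(0., 0.05, size=2)          # one constant shear per field
gals = []
for _ in range(25):                            # 5x5 grid of stamps
    # Intrinsic ellipticity (in practice, guard against |e| >= 1)
    e1, e2 = rng.normal(0., 0.28, size=2)
    gal = galsim.Sersic(n=1.5, half_light_radius=0.5)  # placeholder profile
    gal = gal.shear(e1=e1, e2=e2)              # shape noise first
    gals.append(gal.shear(g1=g1, g2=g2))       # then gravitational shear
```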
yep, trying to infer a constant shear is not a bad first step! Maybe let's see if we can run some inference in this case to get started?
Where things become really fun, though, is when you try to estimate a spatially varying shear. We could for instance start with a simple model where we model the convergence as a Gaussian process; I think that's what Michael was doing. And then we could upgrade it with our score matching model :-)
Just sharing a small trick for reshaping a batch of images into a 2d image ;-)
```python
import matplotlib.pyplot as plt

# 16 stamps of 64x64 -> one 256x256 mosaic (4x4 grid)
ims = model(16, 64)
res = ims.numpy().reshape(4, 4, 64, 64).transpose([0, 2, 1, 3]).reshape([4 * 64, 4 * 64])
plt.imshow(res, cmap='gray_r')
```
Another trick :-) You can tweak your model function so that it outputs the likelihood, which you can then directly use to compute the joint likelihood. Here is what I mean:
```python
def model(batch_size, stamp_size):
    [....]
    # Convolve the image with the PSF
    profile = galflow.convolve(ims, kpsf,
                               zero_padding_factor=padding_factor,
                               interp_factor=interp_factor)[..., 0]
    # Returns the likelihood
    return ed.Normal(loc=profile, scale=sigma_e, name="obs")
```
This way you can for instance do things like `model().distribution.log_prob(data)`; even better, you can condition the model on some specific parameter values.
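For instance (with `data` standing in for a batch of observed stamps):

```python
lik = model(batch_size=16, stamp_size=64)
log_p = lik.distribution.log_prob(data)  # pixel-wise log-likelihood
```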
For computing the joint likelihood with a PPL, you would typically do it in two steps:
In Edward2, check out the API here: https://github.com/google/edward2/blob/e75d4db9f9a7dfedbbd0746ca64fdfdac2d0f399/edward2/tracers.py#L31
I think we can do something like:
```python
with ed.condition(obs=ims):
    log_prob = ed.make_log_joint_fn(model)
```
Hummmm, this conditioning thing isn't working as I expected... but anyway, you can do this instead:
```python
log_prob = ed.make_log_joint_fn(model)
log_prob(n=true_params['n'],
         hlr=true_params['hlr'],
         gamma=true_params['gamma'],
         obs=ims)
```
assuming you have named all variables in your forward model.
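To illustrate the naming requirement, here is a toy model (completely hypothetical, just to show the mechanics):

```python
import edward2 as ed

def toy_model():
    n = ed.Normal(loc=1., scale=.5, name="n")      # named latent variable
    hlr = ed.Normal(loc=.5, scale=.1, name="hlr")  # named latent variable
    return ed.Normal(loc=n * hlr, scale=.003, name="obs")

# The kwargs of the log-joint function match the names given above.
log_joint = ed.make_log_joint_fn(toy_model)
print(log_joint(n=1., hlr=.5, obs=.5))
```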
Sooooo cool :-D Here is a little example of running a chain, just leaving the ellipticity as a free parameter, and starting the chain at 0 ellipticity for all the objects:
Full notebook here
But there seems to be an issue with either hlr or n. If I sample many images, some of them end up with NaNs, and then the HMC chain blows up. It's usually the Sersic index that causes issues in this sort of thing, but I haven't figured out exactly where these NaNs are coming from.
I noticed the same thing later today after opening https://github.com/DifferentiableUniverseInitiative/GalFlow/pull/15... I have to run more tests to figure it out, because I thought I had handled all the NaNs... I also have to enable `@tf.function` by using `tf.while_loop` inside the `calculate_b(n)` function.
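For reference, here is roughly what I mean (a sketch, not the actual GalFlow implementation): `b(n)` solves `Gamma(2n) = 2 * gamma(2n, b)`, which can be done with Newton iterations inside a `tf.while_loop` so it stays traceable under `@tf.function`:

```python
import tensorflow as tf

def calculate_b(n, n_iter=20):
    # Solve P(2n, b) = 0.5, with P the regularized lower incomplete gamma,
    # i.e. Gamma(2n) = 2 * gamma(2n, b): the Sersic normalization condition.
    a = 2. * n
    b0 = 2. * n - 1. / 3.  # classic asymptotic starting point

    def body(i, b):
        f = tf.math.igamma(a, b) - 0.5
        # dP/db = b^(a-1) * exp(-b) / Gamma(a)
        df = tf.exp((a - 1.) * tf.math.log(b) - b - tf.math.lgamma(a))
        return i + 1, b - f / df

    _, b = tf.while_loop(lambda i, b: i < n_iter, body, [0, b0])
    return b
```

The fixed iteration count keeps the loop traceable, which is what `@tf.function` needs.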
But this is a very cool demonstration !!
Yep, just updated it to include a constant shear + intrinsic galaxy ellipticity. It's a bit slow to sample because it's not wrapped in tf.function, but "the idea is there" :-)
And now with chain and posterior on shear: https://colab.research.google.com/drive/1hyZBlG5KhNHF1rCx3U7BFUVOkZ75e0CR?usp=sharing
Hi there, very nice results! I am wondering if you have investigated why the estimate of g2 is much noisier than that of g1? Do you think this is just a coincidence?
Also, by checking your example I noticed that you are using a Sersic profile, right? If so, you might consider using a "simpler" profile just for testing, given that the implementation of this profile in GalSim is slow if you change the Sersic index a lot (see this issue for more info: https://github.com/GalSim-developers/GalSim/issues/566). Using a de Vaucouleurs or an Exponential profile might speed up your tests.
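e.g. something like (the size is a placeholder):

```python
import galsim

# Fixed-index profiles avoid the per-n setup cost discussed in the issue above
gal = galsim.Exponential(half_light_radius=0.5)     # Sersic n = 1
# gal = galsim.DeVaucouleurs(half_light_radius=0.5) # Sersic n = 4
```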
This issue is to document the choice of prior parameters in our model.
So far, we assume the following PGM:
Thus, we need to set priors on the galaxy size, the shape noise and the shear field.
We can use the COSMOS catalog from GalSim to estimate such parameters. In the following notebook, I fitted distributions to the COSMOS catalog, with the following outcomes:
https://gist.github.com/b-remy/bd7acd3255b223004dc517462a10489d
To build a Sersic light profile with GalFlow, we need to specify the flux alongside the size and Sersic index. In the notebook, I show how the flux can be computed from the hlr, the ellipticity and the intensity.
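For reference, the standard Sersic relation between total flux and these parameters (presumably what the notebook evaluates) is:

```latex
F = 2\pi n \,\frac{\Gamma(2n)\, e^{b_n}}{b_n^{2n}}\; I_e\, r_e^2\, q,
\qquad \text{with } \Gamma(2n) = 2\,\gamma(2n, b_n),
```

where `I_e` is the intensity at the half-light radius `r_e`, and `q` is the axis ratio derived from the ellipticity.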
To compute the shear distribution, I averaged ellipticities in bins of 10 and fitted the histogram of the binned shapes.
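i.e., something like this (a sketch, with `e` standing in for the COSMOS ellipticity array):

```python
import numpy as np

# Average ellipticities in bins of 10, then fit a Gaussian to the bin means
e_binned = e[:len(e) // 10 * 10].reshape(-1, 10).mean(axis=1)
mu, sigma = e_binned.mean(), e_binned.std()
```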
It still remains to create a PSF model (which is fixed, not a random variable).
According to the PGM above and the notebook fits, should we split `lambda` into three parameters `hlr`, `n` and `I`? Or should we only keep `hlr` varying?