An explanation of the next steps in producing the training images.
To train a convolutional neural network we need a set of “inputs”. Each “input” is a 3-dimensional array of shape (w, h, c) containing images of size (w, h), in each of c “channels”. In a normal colour image the channels might be red, green and blue. In our case, we have more channels available: each of the filters in which the galaxies are observed.
The Sérsic profile code you have gives you a way of generating a variety of vaguely realistic galaxy images. Each one is normalised in some arbitrary manner (in the gist example the normalisation is such that the brightest pixel has unit value). The code generates n galaxies, so produces an array of shape (n, w, h). The full set of “inputs” to our network will be an array of shape (g, w, h, c). For simplicity, in the following I'll assume g = n. To keep things straightforward, we'll leave the non-SED redshift effects to Elizabeth's code.
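In case it helps, here is a minimal stand-in for that generation step, assuming a simple circular Sérsic profile with peak normalisation as in the gist (the parameter ranges are illustrative, not the ones in your code):
import numpy as np
def sersic_image(w, h, r_eff, n_sersic):
    # circular Sersic profile, using the common approximation b_n ~ 2*n - 1/3
    x, y = np.indices((w, h))
    r = np.hypot(x - w / 2, y - h / 2)
    b_n = 2 * n_sersic - 1 / 3
    img = np.exp(-b_n * ((r / r_eff) ** (1 / n_sersic) - 1))
    return img / img.max()  # normalise so the brightest pixel has unit value
rng = np.random.default_rng(0)
n, w, h = 100, 64, 64
gal_images = np.stack([
    sersic_image(w, h, r_eff=rng.uniform(2, 8), n_sersic=rng.uniform(0.5, 4))
    for _ in range(n)
])  # shape (n, w, h)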
In addition, you have simulated galaxy SEDs. The variable called flux0 in your code is a (r, c, m) array containing a set of m SEDs (normalised in some other arbitrary manner), observed in c filters, at r different redshifts.
Let’s start with some simple assumptions:
(1) each galaxy has a constant stellar population,
(2) all galaxies have the same intrinsic brightness (or peak brightness),
(3) we will input galaxies that are all at the same redshift, z_in, and
(4) we want our network to produce images of the galaxies as they would appear at a fixed higher redshift, z_out.
Given (1) and (2), we can just multiply a galaxy image by the SED flux in each filter to get a set of c “channel” images. (From our perspective the normalisation is arbitrary. In principle, one could produce correctly scaled images with correspondingly accurate noise. However, that would require additional technical work and specific assumptions about the observations. Here, we are only interested in proof of concept. Furthermore, in machine learning it is common to normalise inputs, so the overall scaling is unimportant. To get the final images to a reasonable level we can just apply an overall scaling factor at the end.)
We need to select an SED for each of the n galaxies. A straightforward way to do this (randomly) is:
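sed_idx = np.random.choice(m, size=n)  # one SED index per galaxy, chosen with replacement
gal_seds = flux0[:, :, sed_idx]        # shape (r, c, n)
so that gal_seds[:, :, i] holds galaxy i’s SED, in every filter and at every redshift.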
From (3) we can just take SEDs at a single low redshift (avoid exactly z=0 as this has potential complications), so for your current setup, let’s say z=0.1 (in future you could generate more redshifts), i.e. z_in_idx = 1. So, gal_seds_in = gal_seds[z_in_idx], which is of shape (c, n).
To accomplish the multiplication of each galaxy by its corresponding SED we can make use of array “broadcasting” (see https://numpy.org/doc/stable/user/basics.broadcasting.html). To make this work we need to do the following. We have an (n, w, h) array of images, let’s call this gal_images. If we do
gal_images = gal_images[..., None]
it adds an extra axis to the array, so gal_images now has shape (n, w, h, 1).
We now need to make the shape of gal_seds_in compatible:
gal_seds_in = gal_seds_in.T[:, None, None, :]
which turns it into an array of shape (n, 1, 1, c).
Now we can multiply them:
inputs = gal_images * gal_seds_in
producing the (n, w, h, c) array we are after.
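As a quick sanity check of the broadcasting, here is the whole multiplication with dummy arrays (toy sizes, just to verify the shapes):
import numpy as np
n, w, h, c, r = 4, 8, 8, 5, 10
gal_images = np.random.rand(n, w, h)  # stand-in for the Sersic images
gal_seds = np.random.rand(r, c, n)    # stand-in for the per-galaxy SEDs
gal_seds_in = gal_seds[1]             # z_in_idx = 1, shape (c, n)
inputs = gal_images[..., None] * gal_seds_in.T[:, None, None, :]
assert inputs.shape == (n, w, h, c)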
To train our network we also need a set of n “targets”. These take the same form as the inputs, but should be what we want the network to output for each given input. Assuming (4), we can just do exactly the same thing as for the inputs, but with a different redshift index, say z_out_idx = 5 for an output redshift of z=0.5.
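That is, reusing the broadcasting trick from above (gal_images still has its trailing unit axis):
gal_seds_out = gal_seds[5]  # z_out_idx = 5, shape (c, n)
targets = gal_images * gal_seds_out.T[:, None, None, :]  # (n, w, h, 1) * (n, 1, 1, c) -> (n, w, h, c)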
With these assumptions our trained network will be rather limited, so the next steps will be to first relax (4), to have a variety of output redshifts. We can then relax (3), to have a variety of input redshifts (where z_out > z_in).
To do this, just select the redshifts randomly, e.g. z_in_idx = np.random.randint(1, r - 1, size=n) (again avoiding index 0, i.e. z=0) and z_out_idx = np.random.randint(z_in_idx + 1, r), which guarantees z_out > z_in for every galaxy. (Note that np.random.choice doesn’t accept a per-galaxy upper limit, whereas np.random.randint broadcasts over an array of bounds.)
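One wrinkle: once z_in_idx is an array with one entry per galaxy, gal_seds[z_in_idx] no longer does what we want (it has shape (n, c, n)). Each galaxy’s redshift index needs pairing with that galaxy’s own SED, which integer (“fancy”) indexing handles:
gal_seds_in = gal_seds[z_in_idx, :, np.arange(n)]    # shape (n, c): galaxy i at z_in_idx[i]
gal_seds_out = gal_seds[z_out_idx, :, np.arange(n)]  # shape (n, c)
inputs = gal_images * gal_seds_in[:, None, None, :]  # gal_images is (n, w, h, 1)
targets = gal_images * gal_seds_out[:, None, None, :]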
You will need to save z_in and z_out as these will be required as additional inputs to the network.
To put your final input and target arrays on roughly the correct scale, I'd scale them such that the maximum pixel value is, say, 10000, i.e.
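scale = 10000 / inputs.max()
inputs *= scale
targets *= scale
Here I'd deliberately use a single factor for both arrays, rather than normalising each to its own maximum, so that the relative brightness between input and target (e.g. the dimming with redshift) is preserved.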
You'd then save these arrays to disk (e.g. np.save) ready for processing with Elizabeth's code.
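For example (the filenames are just placeholders, and z_in here means the redshift values themselves, e.g. z_in = redshifts[z_in_idx] if redshifts is your length-r redshift grid):
np.save('inputs.npy', inputs)
np.save('targets.npy', targets)
np.save('z_in.npy', z_in)
np.save('z_out.npy', z_out)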
Note that, at this stage, the images include bandpass redshifting effects, as well as surface brightness dimming, as these are included in the SEDs. They do not include angular scale redshifting effects. This would be applied by Elizabeth's code, prior to PSF convolution and addition of noise. (An alternative I previously mentioned is that you could create the images with sizes that include the angular scaling. However, that would make the above a bit more complicated, so I suggest keeping this as a separate step in Elizabeth's code.)
(Note that I haven’t actually run any of the code snippets above, so I may have an occasional bug, but the ideas should be sound.)
Just to give a bit of workload context, writing the above (including thinking about the various issues, looking at smpy code to check things, and editing for clarity) took 2-3 hours.