aidenrolfe / ARG

Artificially Redshifting Galaxies with a neural network - MSc Physics Research Project

Inputting galaxies into VAE #12

Closed llpalethorpe closed 3 years ago

llpalethorpe commented 3 years ago

Just wanted to clarify one more thing: z_in (0.849) and z_out (1.2) are each a single number, and it won't let me concatenate them because they're zero-dimensional arrays. Do I need to duplicate them, so that there are len(gal_input) copies of z_in and len(gal_target) copies of z_out, then concatenate these into one array before performing the train_test_split procedure? I attempted this but got the error ValueError: Found input variables with inconsistent numbers of samples: [100, 100, 200]. z_input and z_target have 100 values each, so concatenating them gives 200, while gal_input and gal_target still have only 100 each. As a workaround I halved them so that z_input and z_target are 50 each, but I know that's probably not right, even though the code does run and stopped training after 40 epochs. However, the reconstructions didn't work and neither did the latent space plot, so I'm currently looking into those, but I'd be grateful if you could help with the first issue. The code is on devLarissa, in GalaxiesVAE.py.

bamford commented 3 years ago

(1) The redshift arrays that @aidenrolfe's galaxies.py is currently producing are not correct. They need to contain a value for each galaxy, even if the value for every galaxy is currently the same (as it will soon not be). However, for now it is ok for you to work around this by duplicating the values.
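As a rough sketch of the workaround in (1), the scalar redshifts can be expanded to one value per galaxy with `np.full`. The stand-in arrays and the 60x60 image size here are assumptions for illustration only; the real `gal_input` and `gal_target` come from galaxies.py.

```python
import numpy as np

# Stand-in data (assumption): 100 galaxies, hypothetical 60x60 images.
gal_input = np.zeros((100, 60, 60))
gal_target = np.zeros((100, 60, 60))

# The two scalar redshifts quoted in the thread.
z_in, z_out = 0.849, 1.2

# One redshift per galaxy, even though every value is currently identical.
z_input = np.full(len(gal_input), z_in)
z_target = np.full(len(gal_target), z_out)

print(z_input.shape, z_target.shape)  # (100,) (100,)
```

Each array then has the same leading dimension as the galaxy arrays, which is what later steps rely on.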

(2) You need to concatenate correctly, which is why in my email I said "look at the shape of y for guidance". The inputs to tensorflow always have the example index first, so in MNIST y_train has the shape (60000, 10). You now have 2 conditions instead of 10, and 100 examples*, so your redshifts_train needs to have shape (100, 2). The quickest way to get that is redshifts = np.transpose([z_inputs, z_targets]).
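A small sketch of the shape difference described in (2), assuming the per-galaxy arrays `z_inputs` and `z_targets` each hold 100 values. It contrasts end-to-end concatenation, which produces the 200-sample array behind the ValueError, with the transpose suggested above.

```python
import numpy as np

n_gal = 100  # number of galaxies used in the thread
z_inputs = np.full(n_gal, 0.849)
z_targets = np.full(n_gal, 1.2)

# Joining end to end doubles the number of samples: this is the 200.
wrong = np.concatenate([z_inputs, z_targets])

# Stacking as two rows, then transposing, keeps the example index first.
redshifts = np.transpose([z_inputs, z_targets])

print(wrong.shape)      # (200,)
print(redshifts.shape)  # (100, 2)
```

With the shapes aligned, a call such as `train_test_split(gal_input, gal_target, redshifts)` should accept all three arrays, since each has 100 samples along its first axis. `np.column_stack([z_inputs, z_targets])` gives the same (100, 2) result.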

llpalethorpe commented 3 years ago

The code now runs and I think I've managed to fix the reconstructions. They're not great, but I'm only testing with 100 epochs and 100 galaxies, which I assume is nowhere near long enough, with nowhere near a large enough dataset, to train properly.

(Image: Figure_42)

I'm just having trouble with the latent plot: the c argument is inconsistent with the size of x and y, so I wanted to check that I'm assigning the correct variable to it. I've been using redshifts_test; is that right?
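One possible cause, offered as a guess since the plotting code is not shown here: if `redshifts_test` has shape (n, 2), passing it whole as `c` gives matplotlib 2n values for n points. A minimal sketch of selecting a single column instead follows; the helper name `colour_values`, the two-dimensional latent space, and the test-set size of 20 are all assumptions.

```python
import numpy as np

def colour_values(z_mean, redshifts, column=1):
    """Return one colour value per latent point (hypothetical helper).

    column=0 picks the input redshift, column=1 the target redshift.
    """
    colours = np.asarray(redshifts)[:, column]
    if colours.shape[0] != np.asarray(z_mean).shape[0]:
        raise ValueError("need exactly one redshift per latent point")
    return colours

# Stand-in data (assumption): 20 test galaxies, 2 latent dimensions.
z_mean = np.zeros((20, 2))
redshifts_test = np.transpose([np.full(20, 0.849), np.full(20, 1.2)])

c = colour_values(z_mean, redshifts_test)
print(c.shape)  # (20,)
```

The result can then be passed as `plt.scatter(z_mean[:, 0], z_mean[:, 1], c=c)`. While every galaxy shares the same redshift, all points will come out the same colour, so the plot only becomes informative once the redshifts vary.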