VMBoehm / MnuLFI

throwing LFI on massiveNus sims
MIT License

cloud versus skewers #3

Open · changhoonhahn opened this issue 5 years ago

changhoonhahn commented 5 years ago

@viajani pointed out that the data I used in the skewer vs cloud calculations are means over 100 realizations. Although this is unlikely to change the results significantly, the calculation should be updated to use the means over the full set of realizations.

@viajani also pointed out that some of my emulated peak counts are negative, which is unphysical. Non-negativity should be imposed on the emulator output.
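A minimal way to do this (a sketch, assuming the emulator output is a plain numpy array; `emulated_peaks` is a hypothetical name):

```python
import numpy as np

# hypothetical emulator output with unphysical negative peak counts
emulated_peaks = np.array([3.2, -0.4, 1.1, -0.02])

# clip negative predictions to zero before using them downstream
emulated_peaks = np.clip(emulated_peaks, 0.0, None)
print(emulated_peaks)  # [3.2 0.  1.1 0. ]
```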

changhoonhahn commented 5 years ago

- mean peak counts of the full sims at 101 parameter points: `data_scaled.full.npy`, shape (101, 50)
- corresponding parameter values: `params_conc.full.npy`, shape (101, 3)

@viajani these are the correct files, right?
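For a quick sanity check of the shapes (a minimal sketch, using the file names listed above):

```python
import numpy as np

data = np.load('data_scaled.full.npy')    # expected shape (101, 50)
params = np.load('params_conc.full.npy')  # expected shape (101, 3)
print(data.shape, params.shape)
```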

viajani commented 5 years ago

@changhoonhahn actually no, I just checked and `data_scaled.full.npy`, which we computed at the workshop, is the same as `data_scaled_100subsamples_means.npy`.

I opened a pull request with the correct file `data_scaled_full_set_means.npy`, shape (101, 50), which contains the mean peak counts of the full sims at the 101 parameter points, and updated the data description folder with a description of this new file.

Here's how I verified that, if you want to double-check:

```python
import numpy as np

# fn: list of simulation file-name prefixes, params: (101, 3) parameter array
# (both defined earlier in the notebook)
data_sim = np.array([np.load('%s_Peaks_KN_s2.00_z1.00_ng40.00_b050.npy' % ifn)[2:] for ifn in fn])

# LSST scaling
LSST_scale = np.sqrt(12.25 / 20000)

y_full_set_coll = []
params_coll = []
for i in range(len(params)):
    mu_peaks = np.mean(data_sim[i, :, :], axis=0)
    y_full_set = (data_sim[i, :, :] - mu_peaks) * LSST_scale + mu_peaks
    y_full_set_coll.append(y_full_set[:, :])
    params_i = params[i, :]
    params_coll.append(params_i)

data_scaled_full_set = np.array(y_full_set_coll)  # shape (101, 9999, 50)
params_conc = np.array(params_coll)

np.save('data_scaled_full_set.npy', data_scaled_full_set)  # shape (101, 9999, 50)
np.save('params_conc.npy', params_conc)

# here are the correct means over the full set of realisations,
# i.e. mean peak counts of the full sims at 101 parameter points
data_scaled_full_set_means = np.mean(data_scaled_full_set, axis=1)  # shape (101, 50)  <---- correct file
np.save('data_scaled_full_set_means.npy', data_scaled_full_set_means)  # shape (101, 50)

# here I load data_scaled.full.npy and data_scaled_100subsample_means.npy to check them
data_scaled_full_check = np.load('data_scaled.full.npy')  # <---- wrong file
data_scaled_means_over_100subsample = np.load('data_scaled_100subsample_means.npy')

# from here we can see that data_scaled_full_check == data_scaled_means_over_100subsample
print(data_scaled_full_check == data_scaled_means_over_100subsample)

# from here we can see that data_scaled_full_check is NOT the means
# over the full set of realisations for each cosmology
print(data_scaled_full_check == data_scaled_full_set_means)
```
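Side note: for floating-point arrays, `np.allclose` is a sturdier check than element-wise `==`. A small addition to the script above (variable names as defined there):

```python
# single robust booleans instead of element-wise comparison arrays
print(np.allclose(data_scaled_full_check, data_scaled_means_over_100subsample))  # True: same contents
print(np.allclose(data_scaled_full_check, data_scaled_full_set_means))           # False: not the full-set means
```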

viajani commented 5 years ago

Quick update:

I ran @changhoonhahn's skewer_cloud.py using the peak-count means over the full set of realisations. I attach the posteriors I get.

Problem: I tried to use Vanessa's updated version of pydelfi to change the learning rate in skewer_cloud.py, installing it with `!pip install git+https://github.com/VMBoehm/pydelfi.git` (in Colab) and following Vanessa's notebook, but the Delfi object does not recognise the learning-rate argument. Has anyone managed to use the updated version?

Posteriors, skewers: [image: posterior skewers]

Posteriors, cloud: [image: posterior cloud-2]

viajani commented 5 years ago

Hi all, I ran skewer_cloud.py using Vanessa's updated version of pydelfi (in Colab the correct command is `!pip install -q https://github.com/VMBoehm/pydelfi/archive/stable.zip`, i.e. pulling from the correct branch). Now we can change the learning rate; some examples:

- Learning rate = 5e-4: [image]
- Learning rate = 0.0001: [image]
- Learning rate = 0.001: [image]

I tested a few values, varying the order of magnitude around the TensorFlow default, but in general, what's the criterion for choosing the 'best' learning rate?
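One common criterion is to train with each candidate learning rate for the same budget and keep the one with the lowest validation loss. A self-contained toy illustration of that selection loop (a noisy linear regression stands in for the NDE training; everything here is for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)
X_tr, X_val, y_tr, y_val = X[:150], X[150:], y[:150], y[150:]

def val_loss_after_training(lr, steps=100):
    # plain gradient descent on the training MSE for a fixed step budget
    w = np.zeros(3)
    for _ in range(steps):
        grad = 2.0 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)
        w -= lr * grad
    return np.mean((X_val @ w - y_val) ** 2)

# pick the learning rate with the lowest validation loss
losses = {lr: val_loss_after_training(lr) for lr in [1e-4, 1e-3, 1e-2, 1e-1]}
print(losses, 'best:', min(losses, key=losses.get))
```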

viajani commented 5 years ago

Hi all, I don't remember if this is still useful, but here's the true posterior (from MCMC) using a GP for each bin:

[image: True_posterior_GP_each_bin]
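This isn't the exact notebook code, but a minimal sketch of what "a GP for each bin" can look like with scikit-learn's GaussianProcessRegressor (the kernel choice and `normalize_y` are assumptions; file names follow this thread):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

theta = np.load('params_conc.full.npy')          # (101, 3) parameter points
mu = np.load('data_scaled_full_set_means.npy')   # (101, 50) mean peak counts

# one independent GP emulator per peak-count bin, trained on the 101 points
kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(3))
gps = [GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(theta, mu[:, b])
       for b in range(mu.shape[1])]

def emulate_peaks(theta_new):
    """Predict all 50 bins at new parameter values, shape (n, 3) -> (n, 50)."""
    theta_new = np.atleast_2d(theta_new)
    return np.column_stack([gp.predict(theta_new) for gp in gps])
```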

liuxx479 commented 5 years ago

Hi Virginia - do you have the code that generated this plot? Thanks!

> Hi all, I don't remember if this is still useful, but here's the true posterior (from MCMC) using a GP for each bin:
>
> [image: True_posterior_GP_each_bin]