changhoonhahn opened 5 years ago
@viajani pointed out that the data I used in the skewer vs cloud calculations are means over 100 realizations. Although it's unlikely to change the results significantly, the calculation should be updated to use the means over the full set of realizations.
@viajani also pointed out that my emulated peak counts are negative, which is unphysical. Non-negativity should be imposed on the emulator output.
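A minimal sketch of one way to impose non-negativity, assuming the emulator returns a plain NumPy array of peak counts (the array here is made-up illustration data, not output from the actual emulator):

```python
import numpy as np

# Hypothetical emulator output: predicted peak counts, some of which dip below zero.
emulated_peaks = np.array([3.2, -0.7, 12.1, -0.05, 8.4])

# Clip negative predictions to zero before using them downstream.
emulated_peaks_nonneg = np.clip(emulated_peaks, 0.0, None)
print(emulated_peaks_nonneg)  # all entries now >= 0
```

Clipping is the simplest fix; alternatively the emulator could be trained on log counts so predictions are positive by construction.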
mean peak counts of full sims at 101 parameter points: `data_scaled.full.npy`, shape=(101,50); corresponding parameter values: `params_conc.full.npy`, shape=(101,3)
@viajani these are the correct files, right?
@changhoonhahn actually no, I just checked and the `data_scaled.full.npy` that we computed at the workshop is the same as `data_scaled_100subsamples_means.npy`.
I opened a pull request with the correct file, `data_scaled_full_set_means.npy` (shape=(101,50)), which contains the mean peak counts of the full sims at 101 parameter points, and updated the data description folder with a description of this new file.
Here's how I verified that, if you want to double check:
```python
import numpy as np

# fn (list of simulation file prefixes) and params (the (101, 3) array of
# parameter values) are defined earlier in the pipeline.
data_sim = np.array([np.load('%s_Peaks_KN_s2.00_z1.00_ng40.00_b050.npy' % ifn)[2:] for ifn in fn])

# LSST scaling
LSST_scale = np.sqrt(12.25 / 20000)

y_full_set_coll = []
params_coll = []
for i in range(len(params)):
    mu_peaks = np.mean(data_sim[i, :, :], axis=0)
    y_full_set = (data_sim[i, :, :] - mu_peaks) * LSST_scale + mu_peaks
    y_full_set_coll.append(y_full_set[:, :])
    params_coll.append(params[i, :])

data_scaled_full_set = np.array(y_full_set_coll)  # shape (101, 9999, 50)
params_conc = np.array(params_coll)
np.save('data_scaled_full_set.npy', data_scaled_full_set)  # shape (101, 9999, 50)
np.save('params_conc.npy', params_conc)

# Here are the correct means over the full set of realisations,
# i.e. mean peak counts of full sims at 101 parameter points.
data_scaled_full_set_means = np.mean(data_scaled_full_set, axis=1)  # shape (101, 50)  <---- correct file
np.save('data_scaled_full_set_means.npy', data_scaled_full_set_means)  # shape (101, 50)

# Here I load data_scaled.full.npy and data_scaled_100subsamples_means.npy to check them.
data_scaled_full_check = np.load('data_scaled.full.npy')  # <---- wrong file
data_scaled_means_over_100subsamples = np.load('data_scaled_100subsamples_means.npy')

# From here we can see that data_scaled_full_check == data_scaled_means_over_100subsamples:
print(data_scaled_full_check == data_scaled_means_over_100subsamples)

# From here we can see that data_scaled_full_check is NOT the means over the
# full set of realisations for each cosmology:
print(data_scaled_full_check == data_scaled_full_set_means)
```
I ran @changhoonhahn's skewer_cloud.py using the peak-count means over the full set of realisations. I attach the posteriors I get.
Problem: I tried to use Vanessa's updated version of pydelfi to change the learning rate in skewer_cloud.py, installing it with `!pip install git+https://github.com/VMBoehm/pydelfi.git` (in Colab) and following Vanessa's notebook, but the Delfi object does not recognise the learning-rate argument. Has anyone managed to use the updated version?
posteriors skewer:
posteriors cloud:
Hi all, I ran skewer_cloud.py using Vanessa's updated version of pydelfi (in Colab the correct command is `!pip install -q https://github.com/VMBoehm/pydelfi/archive/stable.zip`, i.e. installing from the stable branch).
Now we can change the learning rate, some examples:
Learning rate= 5e-4
Learning rate= 0.0001
Learning rate= 0.001
I tested some values, changing the order of magnitude from the default used in TensorFlow, but, in general, what's the criterion for choosing the 'best' learning rate?
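A common criterion is to sweep a log-spaced grid of learning rates and keep the one with the lowest validation loss. Here is a minimal, self-contained sketch of that idea on a toy regression problem standing in for the emulator training (plain NumPy gradient descent, not the actual pydelfi training loop):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: linear model plus noise, split into train and validation sets.
X = rng.normal(size=(200, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=200)
X_tr, y_tr = X[:150], y[:150]
X_val, y_val = X[150:], y[150:]

def train(lr, n_steps=500):
    """Plain gradient descent on MSE; returns final validation loss."""
    w = np.zeros(3)
    for _ in range(n_steps):
        grad = 2.0 / len(y_tr) * X_tr.T @ (X_tr @ w - y_tr)
        w -= lr * grad
    return np.mean((X_val @ w - y_val) ** 2)

# Sweep a log-spaced grid and pick the rate with the lowest validation loss.
rates = [1e-4, 1e-3, 1e-2, 1e-1]
losses = {lr: train(lr) for lr in rates}
best = min(losses, key=losses.get)
print(best, losses[best])
```

The same logic applies with pydelfi: too large a rate makes the loss diverge or oscillate, too small a rate trains slowly and may not converge within the step budget, so comparing validation losses across a grid is a reasonable default.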
Hi all, I don't remember if this is still useful but here's the true posterior (with MCMC) implementing the GP for each bin:
Hi Virginia - do you have the code that generated this plot? Thanks!