Closed HinLeung622 closed 4 years ago
Hey @HinLeung622 ,
I had a look through. I can see anything obviously wrong but I do have some questions.
logg = pm.Deterministic('logg', T.log10(1006.67408e-11M/radius*2(1.98e30/6.96e8**2)))
Are you sure this line is correct (I haven't check but my assumption is that complicated code is wrong code :))
Av_list = pm.Deterministic('Av', T.ones(N)*Av)
Did you really want this to be a list of all ones? If yes, there is a much easier way to do this and there is no need for this to be deterministic.
Excellent work though - I suspect you are are only a few bug squishes away from having this work! :)
Given that the simulated data works and the real does not my guess is there is something different in the sim when compared to the real data (for example Av = 1?). Can you make plots of the simulated data and compare to the real data - does anything stand out?
@grd349
The
logg = pm.Deterministic('logg', T.log10(100*6.67408e-11*M/radius**2*(1.98e30/6.96e8**2)))
line is the equation to calculate log g, which is defined as log10(GM/R^2), where GM/R^2 is measured in cm/s^2, so what I did there, is take the mass of the star in solar masses, divide that by the square of the radius of the star in solar radii, multiply that by the factor (100GM_sun)/(R_sun^2) where M_sun is in kg and R_sun is in meters (the 100 factor is to convert the value from m/s^2 to cm/s^2). I know using raw numbers is not a good practise, but it should work in theory.
I made the Av_list all ones since I am currently assuming all stars to have the same extinction.
Here is a HR comparison between the simulated cluster and the data for M67: members = simulated data, with the parameters:
N=20
[Av, dist_mod] = [0.160758, 9.660732908839677]
Mbol = 4.75
Tmass = np.random.rand(N)*(1.44-0.84)+0.84
Tage = np.random.randn(N)*0.1+3.52
Tfeh = np.random.randn(N)*0.05+0.1
TY = np.random.randn(N)*0.01+0.255
TMLT = np.ones(N)*2.1
basically the same as those for M67
M67 = M67 data directly collected from gaia old = calculated Bp-Rp and apparent G mags from our previous sample of stars, that have literature Teff values.
As you can see, all three groups of data overlap pretty much in the same region. So in theory if the code is able to do the simulated cluster, it should be able to do M67.
Could you run the same again but instead of
Age_sigma = pm.Lognormal('spread_age',T.log(0.15),0.4)
do
Age_sigma = 0.05
?
Also while we are here and you are doing the dev work, it might make sense to set (do you have diffusion?)
feh_sigma = 0.02 Y_sigma = 0.001
In fact, if you were to max pool the age, feh (diffusion?), and Y_init I think that might be an easier model to debug.
@grd349 you are suggesting to set the sigma values into fixed values?
No, this model does not have diffusion, we do not have a diffusion grid yet (I think Tanda's grid is made but not ready), nor a NN trained on it
@grd349 before receiving your suggestions, i made a HBM run on just 20 of the M67 stars with not a lot of tuning and sampling: Even though it finished, the posteriors look horrible, this might suggest there is something inherently wrong with the M67 data
Yes - I'm suggesting fixing some of the sigmas to make the sampling easier. Just for now to try and debug what is going wrong in the real data model.
I am also suggesting that taking this one step further would be to remove the spreads in various parameters as we are unlikely to be able to resolve them (maybe, maybe not) but whie you are developing the model it will be easier without the spreads.
We will need a grid with diffusion but Tanda has this already! :). Maybe ask Tanda about the grid :)
@grd349 Ok, I am running one for M67 without spreads now, and I also sent an email to Tanda
@grd349 I did two HBM runs, one with the original 100 stars but without spread on age, feh, Y or MLT: Although there were 0 divergences, I don't think that looks like a particularly promising result.
The other HBM run, is with 20 sampled stars from the 100, with the original spread values, but with 100 times larger observational uncertainties. I did this since I noticed the only major difference between the simulated stars and M67 data, is that my assigned observational errors for the simulation is much higher than recorded in M67 data, here is the result: Many divergences but fairly normal looking posteriors, what do you think about this?
We should discuss this on Friday (or earlier). I'll see if I can find some time to look at your model in more detail and see if anything stands out.
Are you giving sensible starting guess to the sampler?
@grd349 Sure, Friday is fine by me, or anytime earlier as well.
This is the starting guesses I used on the 100 star model without any sigmas:
start = {'mean_age':3.5, 'mean_feh':0.1, 'mean_Y':0.26, ', 'mean_MLT':2.0}
I think that should be quite sensible starting points.
You could think about setting a mass starting values (1.0 for everything?)
@grd349 Ok i can try that
@grd349 Results with the 1.0 mass starting value: Not good either
Can you try it with only one star? Something is wrong somewhere.
@grd349 using only one star, with this data: After 500 tuning steps and 500 sample steps: I am not sure if this says anything about the code, this star might be too low mass to be able to test it properly, I will rerun it with a different star in the mean time
Yes - I think a different star would be better.
this star gives this: seems to be doing better than before
Probably needs to run for a bit longer.
@grd349 yeah that is only 500 tuning steps and 500 samples, it is much less than the number I used previously for results in the report. But I think that shows that for one star, it is possible to get sensible results. Maybe expand to 5 stars?
5 stars or run for longer to make sure? Either ... :)
5 stars between gaia_lum = 0.8 to 1.2, 500 tuning steps and 500 samples:
More tuning required but looking good!
@grd349 More tuning, increased from 500 steps to 2000 steps: I am not sure if this is an improvement or not, but I suspect there might be something going on with me restricting all stars to the same age, causing the rather unpleasant age posterior. For the next run, I will reintroduce mu age and spread age, and increase the luminosity range to 0.5-1.5 Lsun.
@grd349 Maybe I was correct, this is 5 stars with a normal age distribution with a free age_mu but fixed age_sigma=0.05, and it seems to converge better in the same number of tuning steps, despite haveing more varying masses: Next I will increase to 20 stars and reintroduce distributions for feh and Y, but still fixing their sigmas.
Sampling in age still looks a bit 'squiffy'. Could you paste the code of the pymc3 model into the issue and I'll see if there is anything obvious we could improve. (could you post the max pooled version, not the partially polled with fixed sigma).
Having said that, things do look a lot better than before!
Max pooled code:
model = pm.Model()
with model:
Age_mu = pm.Deterministic('mean_age',pm.Beta('a',1.1,1.1)*2+2.5)
feh_mu = pm.Deterministic('mean_feh',pm.Beta('e',1.1,1.1)*0.4-0.2)
Y_mu = pm.Deterministic('mean_Y',pm.Beta('f',1.1,1.1)*0.04+0.24)
MLT_mu = pm.Deterministic('mean_MLT',pm.Beta('g',1.1,1.1)*0.6+1.7)
M = pm.Deterministic('mass', pm.Beta('d',1.1,1.1,shape=N)*(1.5-0.8)+0.8)
Age = pm.Deterministic('age',T.ones(N)*Age_mu)
feh = pm.Deterministic('feh',T.ones(N)*feh_mu)
Y = pm.Deterministic('Y',T.ones(N)*Y_mu)
MLT = pm.Deterministic('MLT',T.ones(N)*MLT_mu)
obs = pm.Deterministic('obs',m1.manualPredict(T.log10([M, Age, 10**feh, Y, MLT])))
Teff = pm.Deterministic('Teff', 10**obs[1,:]*5000)
radius = pm.Deterministic('radius', 10**obs[0,:])
L = pm.Deterministic('L', radius**2*(Teff/5776.02970722)**4)
logg = pm.Deterministic('logg', T.log10(100*6.67408e-11*M/radius**2*(1.98e30/6.96e8**2)))
Av_list = pm.Deterministic('Av', T.ones(N)*Av)
BCs = pm.Deterministic('BCs', t1.manualPredict(T.as_tensor_variable([T.log10(Teff), logg, feh, Av_list])))
true_mG = pm.Deterministic('true_mG', -2.5*T.log10(L)-BCs[5,:]+Mbol+dist_mod)
true_Bp_Rp = pm.Deterministic('true_Bp_Rp', BCs[8,:]-BCs[7,:])
obs_mG = pm.Normal('obs_mG',true_mG,sigma=M67['g_mag_err'], observed=M67['g_mag'])
obs_Bp_Rp = pm.Normal('obs_Bp_Rp',true_Bp_Rp,sigma=M67['Bp_Rp_err'], observed=M67['Bp_Rp'])
Also, since the last update, I have reintroduced distributions in age, feh and Y into the model, but with this, the model either takes much more time to sample (at least 4x more time with the same number of stars and tuning steps), or errors with the ravel is equal to zero
error. All of my runs attempted have either errored or timed out on bluebear.
Thanks @HinLeung622
I suggest you change ...
Age_mu = pm.Deterministic('mean_age',pm.Beta('a',1.1,1.1)*2+2.5)
to
Age_mu = pm.Normal('mean_age', 4.0, 1.0)
That may help with the sampling. Basically, the beta dist as you have it has a boundary and the sampler will not enjoy that. The other instances of the beta dist are probably better constrained by the data and hence do not have a boundary in the posterior. Try it and see if things get better ... :)
@grd349
@grd349 I got this error from bluebear about the line we just added, I have no idea what that means
Using TensorFlow backend.
2020-07-16 15:57:31.154009: I tensorflow/core/common_runtime/process_util.cc:115] Creating new thread pool with default inter op setting: 7. Tune using inter_op_parallelism_threads for best performance.
You can find the C code in this temporary file: /tmp/theano_compilation_error_ggm2qr56
Traceback (most recent call last):
File "HBM_post_M67.py", line 60, in <module>
Teff_obs = pm.Normal('Teff_obs', Teff, sigma=500, observed=5500)
File "/rds/homes/h/hhl622/.local/lib/python3.7/site-packages/pymc3/distributions/distribution.py", line 47, in __new__
return model.Var(name, dist, data, total_size)
File "/rds/homes/h/hhl622/.local/lib/python3.7/site-packages/pymc3/model.py", line 952, in Var
total_size=total_size, model=self)
File "/rds/homes/h/hhl622/.local/lib/python3.7/site-packages/pymc3/model.py", line 1488, in __init__
self.logp_elemwiset = distribution.logp(data)
File "/rds/homes/h/hhl622/.local/lib/python3.7/site-packages/pymc3/distributions/continuous.py", line 519, in logp
sigma > 0)
File "/rds/homes/h/hhl622/.local/lib/python3.7/site-packages/pymc3/distributions/dist_math.py", line 50, in bound
return tt.switch(alltrue(conditions), logp, -np.inf)
File "/rds/bear-apps/2019b/EL7-haswell/software/Theano/1.0.4-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/theano/gof/op.py", line 670, in __call__
no_recycling=[])
File "/rds/bear-apps/2019b/EL7-haswell/software/Theano/1.0.4-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/theano/gof/op.py", line 955, in make_thunk
no_recycling)
File "/rds/bear-apps/2019b/EL7-haswell/software/Theano/1.0.4-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/theano/gof/op.py", line 858, in make_c_thunk
output_storage=node_output_storage)
File "/rds/bear-apps/2019b/EL7-haswell/software/Theano/1.0.4-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/theano/gof/cc.py", line 1217, in make_thunk
keep_lock=keep_lock)
File "/rds/bear-apps/2019b/EL7-haswell/software/Theano/1.0.4-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/theano/gof/cc.py", line 1157, in __compile__
keep_lock=keep_lock)
Do we need to specify the shape of the distribution for it to work?
Using TensorFlow backend. 2020-07-16 16:28:53.439874: I tensorflow/core/common_runtime/process_util.cc:115] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
You can find the C code in this temporary file: /tmp/theano_compilation_error_r6cth8yt
Traceback (most recent call last):
File "HBM_post_M67.py", line 32, in
@grd349 Results on giving Teff a Teff_obs, 5 stars, max pool, 2000 tuning steps: Seems to be on the right track, I will try partial pool now
Rhat is too high. Is this with the beta prior on age or normal?
@grd349 it is with beta prior
@grd349 2 reruns of the model but with beta(2,2) on age_mu both ended on ravel equals zero error, I think the Teff_obs did not solve the entirety of the problem
@grd349 I tried the followings:
Teff_obs = pm.Normal(Teff, 250, 5500)
(sigma was 500 before)
They all ended in the ravel is zero error and chain breaking.@grd349
I tried using a smaller group of stars of M67, that are only dwarfs:
with a max pool model with 20 stars, age_mu = beta(1.1,1.1), Teff_obs = pm.Normal(Teff, 500, 5500)
, the ravel is zero error still happens for all 3 times I tried it.
@grd349 simulated data, with estimated parameters from our report of M67 as the cluster-wide fundamentals, for 50 stars, with the regular partial pool model (except alpha MLT) with 1000 tuning steps and 500 samples, this is the result: I would say it ran rather successfully and gave good results.
So the problem really is between the natures of simulated data and real data somehow.
@grd349 I think I am missing one thing in using fully analytical methods in the conversion from Teff and luminosity into apparent G mag and Gbp-rp colour, that is a continuous equation/fit of BCbp and/or BCrp as a function of Teff or more.
The paper you suggested, Casagrande 2018, includes a calculated table of BCs of G, Bp and Rp bands of Gaia given increments of a star's Teff, logg, [Fe/H], [alpha/Fe] and E(B-V), and also a fitted equation for the calculation of R_lambda, the extinction coefficient for all three bands, dependent on the star's Teff and Fe/H. The R_lambda fit solves our problem with accounting for reddenning, but an analytical conversion from Teff, feh etc to BCs is still lacking.
We used the polynomial fit that is only dependent on Teff (assuming fixed feh and logg) for BCg in Andrae 2018 (https://www.aanda.org/articles/aa/abs/2018/08/aa32516-17/aa32516-17.html) previously in our report, but this treatment (of a polynomial fit for BC) was only done on G band, as they used a machine learning approach to calculate their Teffs directly from reddenning-corrected Gbp-Grp colours, hence skipping the BC calculation section.
So any ideas?
Thanks @HinLeung622
Yes - I forgot it was a table. Is it easy to plot the data from the table (i.e., seaborn pairplot)? If yes can we look at that plot first and then make a plan.
@grd349
I downloaded the BC tables and looked at the data structure:
Under the path BCtables2\al2fe_p00\r00\GaiaDR2\GaiaG.data
, it looks like this:
Assuming "al2fe" means [alpha/Fe] and "r00" means E(B-V) = 0.00, there is no clear indication which row of data refers to what Teff, logg or [Fe/H] values.
The Gaia Casagrande 2018 paper referred to one of their earlier papers: Casagrande 2018a (https://arxiv.org/pdf/1801.05508.pdf)'s appendix A for descriptions of the tables and codes, but as far as I can see, there is more emphasis on their FORTRAN code included alongside the tables to query for BCs of specific passbands, rather than stating how the tables are structured. And FORTRAN is not something I know of.
OK - thanks. I'm a bit lost now. I looked at the link and appendix A is too long for me to follow right now. Perhaps the easiest way to do this is to email Luca Cassagrande? He might know of a python implementation :)
@grd349 Yeah I read through appendix A but didn't find any direct information about the data structure or what are the columns. You sure it is a good idea for me to contact Dr. Cassagrande directly?
Yes - a very good idea! You could even engage him by asking if you could send him your draft when you are ready :)
luca.casagrande@anu.edu.au
Luca is a great person and is very easy to work with. He is also extremely clever and a real expert in all this.
@grd349 You mean the draft of the paper in the future?
Ok I will email him then
Yes - draft of the paper in the future :)
G
Dr Guy R. Davies PI - ERC CartographY Project Senior Lecturer in Astrophysics School of Physics and Astronomy The University of Birmingham Edgbaston Birmingham B15 2TT
Tel +44 (0) 121 414 4597 G.R.Davies@bham.ac.uk grd349@gmail.com davies@bison.ph.bham.ac.u davies@bison.ph.bham.ac.ukk davies@bison.ph.bham.ac.uk
The contents of this e-mail may be privileged and are confidential. It may not be disclosed, used, or copied in any way by anyone other than the addressee. If received in error please notify the sender then delete it from your system. Should you communicate with the sender by e-mail, you consent to The University of Birmingham monitoring and reading any such correspondence.
On Mon, 20 Jul 2020 at 11:45, HinLeung622 notifications@github.com wrote:
@grd349 https://github.com/grd349 You mean the draft of the paper in the future?
Ok I will email him then
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/issues/44#issuecomment-660951901, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADQZOCMX3NV2TXCD3LYKNSDR4QN47ANCNFSM4OWUUXDA .
@grd349 So with the reply from Casagrande, I have looked a little more into his BC table. It seems that the main functionality of the tables involves two files: input.samples to specify the list of stellar inputs (star ID, logg, feh, Teff and E(B-V)) and selectbc.data to specify the [alpha/Fe] variation relation mode with [Fe/H], and the list of bands for the BCs to be calculated for. Then a shell script can be ran to execute all required fortran codes to return an output table of BCs interpolated from their BC tables. I am using a windows device so looking at Casagrande's reply, I won't be able to run the codes locally, it might also be very complicated to do it on bluebear. But in any case, their BC tables provide BCs with some level of interpolation from a large grid of values, which is still some steps away from a continuous conversion equation that we are trying to aim for to be put in the HBM currently.
Side note: I reran the NN version of the HBM yesterday but with a smaller range of gaia-luminosities for the stars. I lowered the range from 0.5-1.5 into 0.8-1.2, and it managed to sample all posteriors without running into the ravel is zero error. The results are no where near good results but at least the error did not happen. I wonder what is so different between the slightly non-solar-like stars with higher and lower luminosities and the more solar-like stars with more similar luminosities that cause the HBM to break.
OK - running on BlueBEAR should be easy enough in an interactive session (I think you use these already no?). Failing that we could see if someone can help out on this (maybe me). What's the plan? Do you need someone to run the code and then we will build a NN from there?
@grd349 Here is a link to the python notebook that I used for fundamentals estimation of a M67-like pseudo cluster of 20 stars: https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/blob/master/Hin's_files/NNcluster_post.ipynb
And this is my code for the same process for 100 stars from M67:
The NN pseudo cluster works but not the M67 HBM, it errored with the following error from bluebear: