Harry-Westwood / Y4-Project-InterNeuralStellar

Bayesian Hierarchical Modelling and Machine Learning of Stellar Populations

Helping HBM sample better (cont. cont.) #50

Open HinLeung622 opened 3 years ago

HinLeung622 commented 3 years ago

This is a continuation of issue #47. A new thread was made since the last one was getting ridiculously long and took ages to load all the images.

State of the HBM: data = M67 low-mass dwarfs with binaries/double stars included (total = 354 stars) + the cluster-average parallax from Gaia https://arxiv.org/pdf/1804.09378.pdf + the surface metallicity of 10 MS stars from https://iopscience.iop.org/article/10.3847/1538-4357/aab612

Full model: two working versions, one with max-pool MLT and one with partial-pool MLT. The max-pool MLT version:

# Assumed imports for this snippet; m1 (stellar NN), t1 (bolometric-correction NN),
# N, M67, Teff_sun, Mbol, dist_mod, M67_parallax and M67_zero_point are defined elsewhere.
import numpy as np
import pymc3 as pm
import theano.tensor as T
from astropy import constants

model = pm.Model()
with model:

    #cluster-wide fundamentals
    Age_mu = pm.Deterministic('mean_age',pm.Beta('age_beta',2,2)*2+3)
    feh_mu = pm.Deterministic('mean_feh',pm.Beta('feh_beta',2,2)*0.3-0.15)
    Y_mu = pm.Deterministic('mean_Y',pm.Beta('Y_beta',2,2)*0.0334+0.2466) # 0.24668 ± 0.00007 is the Big Bang nucleosynthesis fraction from the Planck mission
    MLT_mu = pm.Deterministic('mean_MLT',pm.Beta('MLT_beta',10,10)*0.6+1.7)

    #per star fundamentals
    M = pm.Deterministic('mass', pm.Beta('mass_beta',1.1,1.1,shape=N)*(1.2-0.8)+0.8)
    Age = pm.Deterministic('age',T.ones(N)*Age_mu)
    feh = pm.Deterministic('feh',T.ones(N)*feh_mu)
    Y = pm.Deterministic('Y',T.ones(N)*Y_mu)
    MLT = pm.Deterministic('MLT',T.ones(N)*MLT_mu)

    #NN calculation
    obs = pm.Deterministic('obs',m1.manualPredict(T.log10([M, Age, 10**feh, Y, MLT])))

    #intermediate observables
    radius = pm.Deterministic('radius', 10**obs[0])
    Teff = pm.Deterministic('Teff', (10**obs[1])*5000)
    L = pm.Deterministic('L', (radius**2)*((Teff/Teff_sun)**4))
    logg = pm.Deterministic('logg', T.log10(100*constants.G.value*(M/radius**2)*(constants.M_sun.value/constants.R_sun.value**2)))
    true_s_feh = pm.Deterministic('true_s_feh', obs[3])
    obs_s_feh = pm.Normal('obs_s_feh', true_s_feh, M67['[Fe/H]_err'], observed=M67['[Fe/H]'])

    #extinction prior
    A_V_mu = pm.Deterministic('mean_A_V', pm.Beta('Av_beta',1.1,1.1)*0.2) # mean extinction, restricted to the 'sensible' range 0.0 to 0.2
    A_V_std = pm.Lognormal('spread_A_V', T.log(0.01), 1.0) # per-star extinction spread, centred on 0.01 mag
    A_V_ = pm.Normal('A_V_', 0, 1, shape=N) # an N(0, 1) dist to be scaled
    A_V = pm.Deterministic('A_V', A_V_ * A_V_std + A_V_mu)

    #second NN calculation
    BCs = pm.Deterministic('BCs', t1.manualPredict(T.as_tensor_variable([T.log10(Teff), logg, feh, A_V])))

    #BCs
    BCg = pm.Deterministic('BCg', BCs[5,:])
    BCbp = pm.Deterministic('BCbp', BCs[7,:])
    BCrp = pm.Deterministic('BCrp', BCs[8,:])

    #distance and parallax
    dist_mod_ = pm.Normal('dist_mod', dist_mod, 1.0)
    true_parallax = pm.Deterministic('true_parallax', 10**(-dist_mod_/5-1)*1000)
    zero_pt = pm.Normal('zero_pt', M67_zero_point[0]/1000, M67_zero_point[1]/1000)
    obs_parallax = pm.Normal('obs_parallax', true_parallax+zero_pt, M67_parallax[1], observed=M67_parallax[0])

    #true observables
    true_mG = pm.Deterministic('true_mG', -2.5*T.log10(L)+Mbol-BCg+dist_mod_)
    true_Bp_Rp = pm.Deterministic('true_Bp_Rp', BCrp - BCbp)

    #dealing with multiples
    q = pm.Beta('q', 1.1,1.1, testval=0.8)
    delta_mG = pm.Lognormal('delta_mG', T.log(0.3), 0.4)
    f = pm.Lognormal('f', T.log(10),0.7)
    sigma_multiple = pm.Lognormal('sigma_multiple', T.log(0.3), 0.3)
    dist_singular = pm.Normal.dist(0, np.mean(M67['g_mag_err'])*f)
    dist_multiple = pm.Normal.dist(-delta_mG, sigma_multiple)

    #obs observables
    obs_mG = pm.Mixture('obs_mG', w=[q, 1-q], comp_dists = [dist_singular, dist_multiple], \
                    observed=M67['g_mag'].values - true_mG)
    obs_Bp_Rp = pm.Normal('obs_Bp_Rp', mu=true_Bp_Rp, sigma=M67['Bp_Rp_err'], observed=M67['Bp_Rp'])
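
For reference, the `true_parallax` line above is just the standard distance-modulus relation rewritten for a parallax in milliarcseconds:

$$\mu = 5\log_{10}\!\left(\frac{d}{\mathrm{pc}}\right) - 5 \;\Rightarrow\; d = 10^{\mu/5+1}\,\mathrm{pc}, \qquad \varpi\,[\mathrm{mas}] = \frac{1000}{d\,[\mathrm{pc}]} = 1000\times 10^{-\mu/5-1}.$$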

with priors:

    mean age ~ Beta(2,2)*2+3
    mean feh ~ Beta(2,2)*0.3-0.15
    mean Y ~ Beta(2,2)*0.0334+0.2466  # 0.24668 ± 0.00007 is the Big Bang nucleosynthesis fraction from the Planck mission
    mean MLT ~ Beta(10,10)*0.6+1.7  # whole range of MLT in the training grid
    mass ~ Beta(1.1,1.1, shape=N)*0.4+0.8  # 0.8 to 1.2
    mean Av ~ Beta(1.1,1.1)*0.2  # 0.0 to 0.2
    spread Av ~ Lognormal(log(0.01), 1.0)
    dist_mod ~ N(9.726, 1.0)  # 9.726 is taken from the Gaia DR2 cluster paper
    zero_pt ~ N(-82, 33)  # in micro-arcseconds, from https://ui.adsabs.harvard.edu/abs/2018ApJ...862...61S/abstract
    q ~ Beta(1.1,1.1)
    delta_mG ~ Lognormal(log(0.3), 0.4)
    f ~ Lognormal(log(10), 0.7)
    sigma_multiple ~ Lognormal(log(0.3), 0.3)

The partial-pool MLT case assumes an MLT dependency on stellar mass and has these additional elements:

    #cluster-wide fundamentals
    Age_mu = pm.Deterministic('mean_age',pm.Beta('age_beta',2,2)*2+3)
    feh_mu = pm.Deterministic('mean_feh',pm.Beta('feh_beta',2,2)*0.3-0.15)
    Y_mu = pm.Deterministic('mean_Y',pm.Beta('Y_beta',2,2)*0.0334+0.2466) # 0.24668 ± 0.00007 is the Big Bang nucleosynthesis fraction from the Planck mission
    MLT_mu = pm.Deterministic('mean_MLT',pm.Beta('MLT_beta',10,10)*0.6+1.7)
    MLT_f = pm.Normal('MLT_f', 0, 0.3)

    #per star fundamentals
    M = pm.Deterministic('mass', pm.Beta('mass_beta',1.1,1.1,shape=N)*(1.2-0.8)+0.8)
    Age = pm.Deterministic('age',T.ones(N)*Age_mu)
    feh = pm.Deterministic('feh',T.ones(N)*feh_mu)
    Y = pm.Deterministic('Y',T.ones(N)*Y_mu)
    MLT = pm.Deterministic('MLT', MLT_mu+MLT_f*(M-1.0))

Current best run results, with 20000 tuning steps and 1000 sampling steps (and the tighter mean Av and spread Av priors): [images] The remaining problem is the high Rhats and low effective sample sizes for dist_mod and zero_pt (see #49). Partial-pool MLT works except for, again, the problems with dist_mod and zero_pt. The estimated slope of MLT against mass is very slightly negative, suggesting no need for partial pooling of MLT.
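
A quick way to see which parameters are failing to converge (a minimal sketch, assuming `trace` is the MultiTrace returned by pm.sample for the model above):

    import arviz as az

    # r_hat close to 1 and large ess_bulk/ess_tail indicate good mixing;
    # dist_mod and zero_pt are the suspects here.
    summary = az.summary(trace, var_names=['dist_mod', 'zero_pt', 'mean_age', 'mean_A_V'])
    print(summary[['mean', 'sd', 'r_hat', 'ess_bulk', 'ess_tail']])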

HinLeung622 commented 3 years ago

@grd349 Results for relaxing the priors on mean Av and spread Av came out. The current priors are: mean Av ~ Beta(1.1,1.1)*0.2 (0.0 to 0.2), spread Av ~ Lognormal(log(0.01), 1.0). For 5000 tuning and 1000 sampling steps: [images] Immediately it is clear that the less informative priors on Av have given rise to unstable Rhats. The mean Av estimate increased from roughly 0.13 to 0.16, going from being in line with the Gaia OC paper (0.13) and DBossini (0.11) to being in line with Green's 2019 dustmap (0.16). Secondly, the spread Av estimate widened from roughly 0.01 to 0.03, which widened the main sequence of red dots in the CMD. Though it is reassuring to see that the increased width of the Av distribution did not overlap with the binary track, meaning the red dots are still localized to the MS track and the binary stars are still modelled by the mixture model. Thirdly, the width of the estimated observational G mag error went down from 160 to 105, most likely due to the larger spread Av. This means the singular distribution in the mixture model (last plot) is thinner.

If you think the estimates of mean and spread Av are physical, I think this is again just a convergence problem, which just means running more tuning steps.
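
For reference, a longer run would look something like this (a minimal sketch; chains and target_accept are illustrative assumptions, not the settings actually used):

    with model:
        trace = pm.sample(draws=1000, tune=20000, chains=4,
                          target_accept=0.95)  # higher target_accept takes smaller steps, which can help difficult posteriors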

HinLeung622 commented 3 years ago

@grd349 Something interesting came out: the first run for NGC188, which has 501 stars (more than M67's 364). Note: I did a huge derp by forgetting to change the data values of dist_mod and the observed parallax (and its error) from the M67 values to NGC188's. This means that the dist_mod estimate is slightly off in the following results, and the zero_pt offset is plain impossible. But other than that, things look good: [images] Exciting results:

Worrying results:

Overall exciting results

grd349 commented 3 years ago

@HinLeung622 That is an exciting result.

Do you have a third older (> 2.5 Gyr) cluster that we could consider like this? Three results that all estimate the age to be lower than literature would be interesting!

HinLeung622 commented 3 years ago

@grd349 Thanks.

The other candidate we have is NGC6791, but that one is missing most of its MS tail, which will create problems for separating the MS from the subgiants/hook and for binary-star modelling.

Ruprecht 147 is another option, but I don't know how many MS stars it has in Gaia. Literature age values range from 2.5 to 3.25 Gyr. Our report-stage estimate was, however, 2.18 Gyr with a measly 11 MS stars. This could be an option.

grd349 commented 3 years ago

Probably best to stick at just two examples then.

> the CMD looks amazing

Agreed! 👍

> again, small number of effective samples

Agreed!

HinLeung622 commented 3 years ago

@grd349 I looked into the state of Ruprecht 147. After all the data cleaning, there are 101 MS stars left that are roughly within the boundaries of the training grid: [image] The problem is, when I plot what my NNs expect stars from 0.8 to 1.2 Msun to look like, given the Ruprecht 147 literature values for the fundamentals, next to the data dots: [image] Yeah, I think we will have to ditch Ruprecht 147.

grd349 commented 3 years ago

Agreed 👍


HinLeung622 commented 3 years ago

@grd349 This is NGC6791, the one without the tail; do you think it has a chance? Note: there are 766 stars! [image]

grd349 commented 3 years ago

Not sure it does have any chance - no, best leave it!


HinLeung622 commented 3 years ago

@grd349 I fixed the previous problem with NGC188 (I had used the M67 dist_mod, Av and parallax values before) and this is the newest max-pool MLT run: [images] It's great to see relatively lower Rhats, especially in dist_mod and zero_pt. The corner plot is also much better defined. The age stays lower than literature values. I am waiting on runs of M67 with a large number of tuning steps.

grd349 commented 3 years ago

Wow - that looks great! MLT is a bit higher for this cluster compared to M67, no?


HinLeung622 commented 3 years ago

@grd349 Yes, that seems to be the case, though it is not by much. This one is estimated to have MLT = 2.034 while M67's most recent estimate is 2.009, under the same prior-setting principles.

HinLeung622 commented 3 years ago

@grd349 Our current outline is uploaded at https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/blob/master/Paper%20Writing/HL_HW_Paper1_Outline.pdf , please have a look.

grd349 commented 3 years ago

Will do!

G


grd349 commented 3 years ago

That's looking good!

Don't forget about the surface metallicities estimate data you have used (APOGEE?).

There is quite a large section on the Neural Network. This can probably be pared down a little bit, but not too much. We can discuss.

What's the next step to take then? Do we need to finish up the analysis or do you think you could start writing sections?

HinLeung622 commented 3 years ago

@grd349 Thanks. The surface feh data were taken from a paper you suggested some time ago: https://iopscience.iop.org/article/10.3847/1538-4357/aab612, which is from APOGEE.

The details about the paper are also best discussed on a call, so I would recommend arranging a meeting, perhaps later this week in the usual 1-4pm UK time range. We also have a sizable list of items that require more thinking:

Just some things off the top of my head. The analysis will be slow due to the large number of sampling steps needed per run, so starting to write while waiting on those is probably the better choice. What meeting timeslots do you have?

grd349 commented 3 years ago

Sorry both - I am slammed at the moment. I have scheduled a meeting on Friday at 14:30.

In the meantime - Lindsey Carboneau has got pymc3 and theano running on Google Colab using a GPU instance. It is not impossible that this might significantly speed up the HBM sampling and hence save some headaches. It might also not :)
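
If you want to try it, the usual trick is to point Theano at the GPU before it is first imported (a hedged sketch; I don't know Lindsey's exact setup):

    import os
    # must run before theano/pymc3 are first imported in the session
    os.environ['THEANO_FLAGS'] = 'device=cuda,floatX=float32'

    import pymc3 as pm  # importing pymc3 pulls in theano, which reads the flags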

HinLeung622 commented 3 years ago

@grd349 Thanks, that slot works for the both of us.

Interesting discovery by Lindsey; I will ask her about it by email.

HinLeung622 commented 3 years ago

@grd349 Can you remind me why we are not using the NN's output surface feh as the input to the MIST grid NN to calculate BCs? Is it due to not accounting for core overshoot, which is known to be significant for stars of mass >1.15 Msun? But at the same time, the current iteration of the M67 HBM is using that output surface feh as the true surface feh values to be constrained by the APOGEE surface fehs. Should I change the feh input to the MIST NN to the surface feh?

grd349 commented 3 years ago

> Can you remind me why we are not using the NN's output of surface feh as the inputs for the MIST grid NN to calculate BCs?

Yes - I agree - we should probably use the surface feh.

HinLeung622 commented 3 years ago

@grd349 We now have a new problem surfacing: a strong age-extinction degeneracy, which has been the culprit pushing the age estimates down. These are the newest results with the wider age prior (2-6 Gyr) and using surface feh as the input to the MIST BC NN: [images] Ignoring the high Rhats (I only ran 5000 tuning steps to test the coarse features), the age estimate has moved from 3.1 Gyr to 2.1 Gyr, again pushing right up against the lower age prior. A similar situation appears with the mean Av estimate: 0.19 when the prior max is 0.2. This is likely why the previous Av estimates were at 0.16, a value higher than Gaia's estimated 0.11.

An old corner plot of a well-sampled M67 run, from when the age prior was still 3-5 Gyr, shows a little of the covariance between age and Av: [image] Testing with the CMD also tells the same story: [image: 4.0 Gyr + 0.11 Av] [image: 2.0 Gyr + 0.19 Av] Perhaps we need some stars at the turn-off to better constrain this degeneracy, though that runs into the 1.2 Msun overshooting limit. Maybe we could somehow constrain Av with another set of observables (another band?). Thoughts?
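
To put a number on the degeneracy (a minimal sketch, assuming `trace` holds the run behind that corner plot):

    import numpy as np

    # Pearson correlation between the two hyperparameters across posterior samples;
    # a value near ±1 confirms what the corner plot shows.
    rho = np.corrcoef(trace['mean_age'], trace['mean_A_V'])[0, 1]
    print('posterior corr(age, A_V) = %.2f' % rho)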

grd349 commented 3 years ago

> another band?

K-band is (relatively) insensitive to extinction.

Suggestion - make two runs. Set the mean extinction as a fixed parameter in both. In the first run use Green's value and in the second run use the Gaia value. Are you able to hack together something that predicts all of [G, G_bp, G_rp, K, ...]?

This fixing would probably help with the sampling. We can then do a posterior predictive check (probably just graphical) on the K band and see which solution fits best. The other option is trying to deal with multiple bands but that is harder with the noise distributions!

Make sense?
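
The graphical check could be as simple as drawing posterior predictive samples for the K band and overplotting them on the data (a rough sketch; `obs_mK` is an assumed name for the K-band likelihood you would add, following the G-band mixture pattern):

    import matplotlib.pyplot as plt

    with model:
        # replicated K-band residuals (data - true_mK) under the fitted model
        ppc = pm.sample_posterior_predictive(trace, var_names=['obs_mK'])

    # overplot the predictive residual spread against the actual residuals
    resid_obs = M67['K_mag'].values - trace['true_mK'].mean(axis=0)
    plt.hist(ppc['obs_mK'].flatten(), bins=50, density=True, alpha=0.5, label='posterior predictive')
    plt.hist(resid_obs, bins=50, density=True, alpha=0.5, label='observed')
    plt.xlabel('K-band residual (mag)')
    plt.legend()
    plt.show()

Repeating this for the Green's-extinction run and the Gaia-extinction run would show which fixed value the K band prefers.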

HinLeung622 commented 3 years ago

@grd349 Makes sense. I would go for the most aggressive solution for now (i.e. retraining a BC NN that has both Gaia bands and 2MASS K, then building two separate noise models for the G and K bands). I don't quite understand what you mean by "We can then do a posterior predictive check (probably just graphical) on the K band and see which solution fits best." - are you saying to choose between Gaia G and 2MASS K and see which one fits best? I have checked MIST's BC grid; they have 2MASS H, J and K BCs. There are both Kp and Ks; do you know what the difference between these two is? I also checked that, for both the M67 and NGC188 target stars we adopt, >95% of them have 2MASS IDs. Assuming this means they have K band values, we will not lose anything significant by adding the K band into the mix.

HinLeung622 commented 3 years ago

@grd349 Oh never mind, Kp is the Kepler Kp, not from 2MASS. I have re-gathered the data for M67 and NGC188, now with 2MASS Ks band magnitudes and their errors. That reduced our sample size by fewer than 10 stars in both clusters. I will now proceed to train a NN that maps [log10 Teff, logg, surface feh, Av] into ['Gaia_G_MAW','Gaia_BP_MAWb','Gaia_BP_MAWf','Gaia_RP_MAW','2MASS_Ks']. (I have been using the "Gaia_BP_MAWf" option of the BP band BC before; not sure if I ever checked that it was the right choice.)

HinLeung622 commented 3 years ago

@grd349 I retrained a NN that includes the 2MASS Ks band BC as an output. I also took the opportunity to improve some of the original NN errors and biases on the other bands (admittedly they were quite large, on the order of ~0.01 mag, comparable to the observational error; the new errors are on the order of 0.003 mag). But this also means the mapping from fundamentals to magnitudes has been slightly altered.

Next, I set up a HBM with a separate mixture model for each of the G and Ks bands; the only parameter they share is the singles-vs-multiples fraction. This run also includes the zero-pt offset edit we did last week. I can't say the results are promising. We might be at the mouth of another rabbit hole (tuning steps = 5000, sampling steps = 1000, not converged, just a low-tuning trial to test the coarse trends): [images] First of all, the addition of K band data did not fix the low-age problem. It might have broken the age-Av degeneracy, considering the mean Av estimate here is ~0.15, quite a bit lower than the 0.19 I got previously when the age came out to be ~2.1; age might be degenerate with something else now. Here is a corner plot (it is very large, so you might have to download it or view the image in a separate tab to see it properly): [image] Secondly, the dist_mod estimate is very low. Gaia determines M67's dist_mod to be ~9.72, and most previous HBM runs gave estimates within 9.70 to 9.75. I checked that this is not the result of a mistake made when changing the treatment of the zero_pt offset by plotting the predicted M67 MS (the red chain of circles I sometimes show overlapped with the blue data dots) using dist_mod=9.53 vs dist_mod=9.72, and under the set of parameters estimated in this HBM run, dist_mod=9.53 really does fit the data's MS better under visual inspection. I am not sure what caused this problem.

Thirdly, the estimate of Y has gone back down to be very close to the BBN helium fraction again. The previous few runs resulted in estimates of Yinit ~ 0.250, which was at least slightly away from the BBN value.

Fourthly, the estimate for spread Av went up while that for f_G (i.e. the Gaia G band observational error multiple) went down. This caused the mixture model to basically stop functioning: [image] Constraining the prior on spread Av to be closer to 0 would certainly help this situation, but I don't think that is proper practice. I really am not sure.

I am quite lost about what to do with this. Perhaps the fact that improving the MIST NN leads to completely different estimation results and inference just means that I have been using a defective tool all along and I am only starting to work on the real deal now. Don't know. Got any ideas?

HinLeung622 commented 3 years ago

@grd349 For reference, the current model:

model = pm.Model()
with model:

    #cluster-wide fundamentals
    Age_mu = pm.Deterministic('mean_age',pm.Beta('age_beta',2,2)*4+2)
    feh_mu = pm.Deterministic('mean_feh',pm.Beta('feh_beta',2,2)*0.3-0.15)
    Y_mu = pm.Deterministic('mean_Y',pm.Beta('Y_beta',2,2)*0.0334+0.2466)#0.24668 \pm 0.00007 is big bang nucleosynthesis fraction from Plank mission
    MLT_mu = pm.Deterministic('mean_MLT',pm.Beta('MLT_beta',10,10)*0.6+1.7)

    #per star fundamentals
    M = pm.Deterministic('mass', pm.Beta('mass_beta',1.1,1.1,shape=N)*(1.2-0.8)+0.8)
    Age = pm.Deterministic('age',T.ones(N)*Age_mu)
    feh = pm.Deterministic('feh',T.ones(N)*feh_mu)
    Y = pm.Deterministic('Y',T.ones(N)*Y_mu)
    MLT = pm.Deterministic('MLT',T.ones(N)*MLT_mu)

    #NN calculation
    obs = pm.Deterministic('obs',m1.manualPredict(T.log10([M, Age, 10**feh, Y, MLT])))

    #intermediate observables
    radius = pm.Deterministic('radius', 10**obs[0])
    Teff = pm.Deterministic('Teff', (10**obs[1])*5000)
    L = pm.Deterministic('L', (radius**2)*((Teff/Teff_sun)**4))
    logg = pm.Deterministic('logg', T.log10(100*constants.G.value*(M/radius**2)*(constants.M_sun.value/constants.R_sun.value**2)))
    true_s_feh = pm.Deterministic('true_s_feh', obs[3])
    obs_s_feh = pm.Normal('obs_s_feh', true_s_feh, M67['[Fe/H]_err'], observed=M67['[Fe/H]'])

    #extinction prior
    A_V_mu = pm.Deterministic('mean_A_V', pm.Beta('Av_beta',1.1,1.1)*0.2) # X & Y are 'sensible' values
    A_V_std = pm.Lognormal('spread_A_V', T.log(0.01), 1.0) # Different X & Y but still sensible
    A_V_ = pm.Normal('A_V_', 0, 1, shape=N) # an N(0, 1) dist to be scaled
    A_V = pm.Deterministic('A_V', A_V_ * A_V_std + A_V_mu)

    #second NN calculation
    BCs = pm.Deterministic('BCs', t1.manualPredict(T.as_tensor_variable([T.log10(Teff), logg, true_s_feh, A_V])))

    #BCs
    BCg = BCs[0,:]
    BCbp = BCs[2,:]
    BCrp = BCs[3,:]
    BCk = BCs[4,:]

    #distance and parallax
    dist_mod_ = pm.Deterministic('dist_mod', dist_mod+pm.Normal('dist_mod_norm', 0, 1.0))
    true_parallax = pm.Deterministic('true_parallax', 10**(-dist_mod_/5-1)*1000)
    obs_adj_parallax = pm.Normal('obs_adj_parallax', true_parallax, np.sqrt(M67_parallax[1]**2+(M67_zero_point[1]/1000)**2), 
                             observed=M67_parallax[0]-M67_zero_point[0]/1000)

    #true observables
    true_mG = pm.Deterministic('true_mG', -2.5*T.log10(L)+Mbol-BCg+dist_mod_)
    true_mK = pm.Deterministic('true_mK', -2.5*T.log10(L)+Mbol-BCk+dist_mod_)
    true_Bp_Rp = pm.Deterministic('true_Bp_Rp', BCrp - BCbp)

    #dealing with multiples
    q = pm.Beta('q', 1.1,1.1, testval=0.8)
    delta_mG = pm.Lognormal('delta_mG', T.log(0.3), 0.4)
    f_G = pm.Lognormal('f_G', T.log(10),0.7)
    sigma_multiple_G = pm.Lognormal('sigma_multiple_G', T.log(0.3), 0.3)
    dist_singular_G = pm.Normal.dist(0, np.mean(M67['g_mag_err'])*f_G)
    dist_multiple_G = pm.Normal.dist(-delta_mG, sigma_multiple_G)

    delta_mK = pm.Lognormal('delta_mK', T.log(0.3), 0.4)
    f_K = pm.Lognormal('f_K', T.log(10),0.7)
    sigma_multiple_K = pm.Lognormal('sigma_multiple_K', T.log(0.3), 0.3)
    dist_singular_K = pm.Normal.dist(0, np.mean(M67['K_mag_err'])*f_K)
    dist_multiple_K = pm.Normal.dist(-delta_mK, sigma_multiple_K)

    #obs observables
    obs_mG = pm.Mixture('obs_mG', w=[q, 1-q], comp_dists = [dist_singular_G, dist_multiple_G], \
                    observed=M67['g_mag'].values - true_mG)
    obs_mK = pm.Mixture('obs_mK', w=[q, 1-q], comp_dists = [dist_singular_K, dist_multiple_K], \
                    observed=M67['K_mag'].values - true_mK)
    obs_Bp_Rp = pm.Normal('obs_Bp_Rp', mu=true_Bp_Rp, sigma=M67['Bp_Rp_err'], observed=M67['Bp_Rp'])
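
One note on the parallax treatment above: compared to the first model, the zero-point nuisance parameter is now marginalized analytically. With $\varpi_\mathrm{obs} = \varpi_\mathrm{true} + z + \epsilon$, where $z \sim N(z_0, \sigma_z)$ and $\epsilon \sim N(0, \sigma_\varpi)$, integrating out $z$ gives

$$\varpi_\mathrm{obs} - z_0 \sim N\!\left(\varpi_\mathrm{true},\ \sqrt{\sigma_\varpi^2 + \sigma_z^2}\right),$$

which is exactly the obs_adj_parallax likelihood in the code.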
HinLeung622 commented 3 years ago

@grd349 Just made this crazily large PGM for the current state of our model; thought you would like it: [image] With some annotation: [image] Note: the zero-point offset is omitted.
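
If it helps for the paper, PyMC3 can also render the model's DAG automatically (a minimal sketch; it won't be as readable as the hand-annotated version, and it needs the graphviz package and binaries installed):

    # renders the graph of all random variables and deterministics in the model
    graph = pm.model_to_graphviz(model)
    graph.render('m67_hbm_pgm', format='png', cleanup=True)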

grd349 commented 3 years ago

Oh my gosh - do I like it? No

. . . . . . . . . .

I love it! 😊 😍 🥰 Great work @HinLeung622 :)

grd349 commented 3 years ago

Honestly - you don't know how happy that has made me. I really want to share this. As soon as your paper is submitted we are tweeting this! :)

HinLeung622 commented 3 years ago

@grd349 Haha, thanks! Glad you liked it.

Now back to the problems. I have done a range of testing over the weekend. There is a lot to talk about, which would be best discussed in a meeting, but in short: it is not looking good. Here are the main points:

  1. Fixing the observational magnitudes' error multiples on both G and K did nothing to help the age-extinction degeneracy.
  2. Clamping down (with extremely informative priors) on initial Y, feh and MLT did little to the age estimate, other than further increasing mean Av and enlarging spread Av so much that it overpowers the mixture model and the binary fraction is estimated to be 0.
  3. Further clamping mean Av to a close-to-literature value had next to no effect.
  4. Further swapping the log-normal prior on spread Av for a beta distribution whose upper limit roughly matches the width of the MS in the CMD (see the sketch after this list) still yields a rather low age estimate (2.5 Gyr), though the mixture model is functional again. This suggests that our stars simply do not carry enough information to even distinguish between a 2 Gyr and a 4 Gyr cluster, though the still-tiny mean age uncertainty (0.2 Gyr) does not reflect this.
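
The swapped-in spread-Av prior from point 4 looks roughly like this (a sketch; the 0.05 mag cap is an illustrative stand-in for whatever value matches the MS width):

    # a Beta rescaled to [0, 0.05] hard-bounds the extinction spread,
    # unlike the long-tailed Lognormal it replaces
    A_V_std = pm.Deterministic('spread_A_V', pm.Beta('spread_Av_beta', 1.1, 1.1) * 0.05)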

Turning to NGC188:

  1. I tested this scheme on NGC188, the cluster that is supposed to perform better. Relaxing its age prior from 5.6-7.6 Gyr to 4-8 Gyr, and keeping the relatively uninformative priors on the other fundamentals and the Av components, yielded a 4.4 Gyr estimate, with a much-too-wide spread Av and a near-zero binary fraction, i.e. a dysfunctional mixture model.
  2. When I again restricted the spread Av to cover only the width of the MS and used an informative, close-to-literature mean Av prior, it gave a 4.7 Gyr age and a functional mixture model.

Observations:

  1. There seems to be a disagreement between the shape of the MS predicted by our NNs and the data. In both the G and Ks bands, the data would have to rotate clockwise by some significant degree to match any combination of fundamentals' prediction; I have tried to match the predictions to the data by changing the fundamentals' values, but have not succeeded, e.g.: [images]
  2. NGC188 previously performed really well on the older (and more strongly biased) MIST NN, before the introduction of the Ks band and the more streamlined mixture model. Do you think this could be a case of my old MIST NN being miraculously biased in a way that skewed the predictions to better match the real world, and now, with a better-produced NN mapping, we are just facing the model-data mismatch more head-on? Not sure.

Anyways, do you have time for a chat?

HinLeung622 commented 3 years ago

@grd349 Yes, it is basically confirmed that the new iteration of the MIST NN skewed the BC predictions: [image] test6 = the old NN without the Ks band and with higher biases; atmos2BC_withK4_1 = the new NN with the Ks band and lower biases. The new NN did introduce a clockwise rotation to the predicted values, which disagrees with the data.

grd349 commented 3 years ago

Well, that scares me! That's a pretty significant problem, no?

HinLeung622 commented 3 years ago

@grd349 Not sure if it is already a problem per se, since my claim about the model-data mismatch might be due to some other mistake I made. Did you read my previous reply on this issue? Would you have time for a meeting?

grd349 commented 3 years ago

Could we meet Monday next week?

G


HinLeung622 commented 3 years ago

@grd349 Thanks, Monday will work fine for the both of us.

Also, the presentation went well; one listener said that multiple talks have tried to explain HBM, but she thinks she understood the most this time, so that is a plus.

grd349 commented 3 years ago

Great job on the talk then! :)

Monday is now a problem for me as I have kids to look after. Could we do Tuesday instead?

G


HinLeung622 commented 3 years ago

@grd349 Tuesday is fine except for 1-2 pm BST

grd349 commented 3 years ago

Thanks - could you tell Harry Tuesday at 15:00 GMT?


HinLeung622 commented 3 years ago

@grd349 no problem, it is done.

HinLeung622 commented 3 years ago

@grd349 I had the following idea: what if I strip away all the binary stars of M67 as a test? That would allow me to do max pooling on everything without a potentially problematic mixture model, and the test can truly reveal whether our model setup and data combination can actually give a sensible age estimate.

So, now left with 295 stars, I tried a max-pool model for everything including Av (see the sketch below for what max pooling Av looks like), with the same priors as before: [image] Good convergence, but the age is very low (2.06 Gyr) with a small uncertainty (0.046), Av is very low (0.048) and MLT is very low (1.74), yet somehow this combination of fundamentals gives a CMD that agrees with the data: [image] Seeing the strange Av and MLT values, I decided to help the sampler with very informative priors on Y, MLT and Av, meaning only age, dist_mod and feh kept the same uninformative priors as before: [image] Great convergence, but the age is still low (2.34 Gyr) despite fixing a lot of stuff already. The feh went a bit high, so I decided to fix feh and, additionally, dist_mod as well: [image] And so, with super informative priors on everything but age and mass, without the binary stars that require a fancy mixture model, and without an Av spread, the estimated age is 2.9 Gyr with a very small uncertainty (0.032). I have no idea why the sampler keeps preferring a lower age to a higher one when, in theory, both should fit the data equally well (if we assume age has little effect on the MS). My only thought is that the matching set of masses at a lower age gives an MS slope more comparable to the data, while that of a higher age disagrees more, so the sampler favours the low age?
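
For clarity, "max pooling everything including Av" just means every star shares one cluster value with no per-star scatter, e.g. for extinction (a sketch in the notation of the model code above):

    # max-pool extinction: a single shared A_V, replacing the A_V_mu/A_V_std/A_V_ block
    A_V_mu = pm.Deterministic('mean_A_V', pm.Beta('Av_beta', 1.1, 1.1) * 0.2)
    A_V = pm.Deterministic('A_V', T.ones(N) * A_V_mu)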

Hence, the next thing to try is to add a mass-dependent partial-pool MLT on top of this. The same thing is already running on the regular full M67 and NGC188 sets, but results are yet to come out.