Harry-Westwood / Y4-Project-InterNeuralStellar

Bayesian Hierarchical Modelling and Machine Learning of Stellar Populations

Helping HBM sample better #45

Closed HinLeung622 closed 4 years ago

HinLeung622 commented 4 years ago

@grd349 The last issue thread was getting very long, so I opted to start a new one, since we are (kind of) past that ravel-equals-zero error.

Last night I ran 2 HBM runs:

  1. Max pool model with 100 M67 dwarfs taken from just under the hook: [image] The large red dots represent a "max pool" generated cluster with masses from 0.8 to 1.2 Msun; comparing against them shows this dataset includes stars with mass > 1.2 Msun. The prior on mass hence runs from 0.8 to 1.5. g mag obs error = x100, tuning steps = 7000, sampling steps = 2000. [image] [image] The sampler did not converge, and oddly, the mean values from this max pool model are somewhat different from the partial pool ones and the ones we got for the report, especially for Y.

  2. Max pool model with 100 M67 dwarfs taken at considerably lower luminosity under the hook: [image] The red dots are of the exact same generated cluster setup, so this sample should only contain stars with mass 0.8-1.2 Msun. The HBM model for this run is identical to the last except for shrinking the mass prior to 0.8-1.2. g mag obs error = x100, tuning steps = 7000, sampling steps = 2000. [image] [image] Somehow the sampling results are rather worse, and very strangely, the posteriors on mass go above 1.2. Maybe I made some mistake that I have yet to locate, but I am not sure what happened there.

This test was meant to check whether discarding the >1.2 Msun dwarfs would result in a much worse estimate of the cluster-wide fundamentals, but the samplers are still having a hard time, so I don't think the test answers the question yet.

I have started an additional run identical to no.2 except for using beta(10,10) on all priors, and will report when that gives results.
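For reference, a run with the settings quoted above would look something like this (just a sketch; other pm.sample arguments left at their defaults):

    with model:
        # tuning steps = 7000, sampling steps = 2000, as in the runs above
        trace = pm.sample(draws=2000, tune=7000)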

HinLeung622 commented 4 years ago

@grd349 New runs, with exactly the same setup as above except for using beta(10,10) priors in place of the old beta(1.1,1.1) ones, and lowering the number of tuning steps from 7000 to 2000 since it takes much longer to run on Colab.

  1. 100 M67 dwarfs that are estimated to have mass 0.8-1.35 Msun: [image] [image] Three chains agree with each other and one chain does not. However, the age, Y and MLT are estimated to be much lower than previous results in this run, which seems to suggest some problem within the sampling process.

  2. 100 M67 dwarfs that are estimated to have mass 0.8-1.2 Msun: [image] [image] Much better sampling here, and the estimated values are slightly higher, closer to previous estimates.

Since the sampler on the higher mass sample did not technically converge, this still might not be a strong enough test of how including or excluding the 1.2-1.35 Msun group affects the uncertainties on the fundamentals' estimates. However, I think it is safe to say that the sampler has a much easier time working on the lowest mass 0.8-1.2 Msun stars.

(Also, maybe these results suggest that dwarfs-only HBM runs result in lower age estimates than runs that include subgiants (i.e. back during our report stage)? Probably not enough evidence for that yet.)

grd349 commented 4 years ago

@HinLeung622 Have you seen this work? https://arxiv.org/pdf/1705.06761.pdf

HinLeung622 commented 4 years ago

@grd349 A small update: I have been able to get my old bluebear workflow back up myself, so transferring the old documents will not be necessary.

HinLeung622 commented 4 years ago

@grd349 Also, some results of a rerun with the lower mass dwarfs of M67 on beta(10,10) distributions: [image] [image] Interestingly, the posterior has shifted to lower age and helium compared to the last successful run, despite all four chains being consistent with each other. The estimates are still within 2 sigma of the previous values, though, so I guess it still passes as simple variation between runs.

Also, should I keep using the beta(10,10) distribution? I am pretty sure the priors are best kept loosely informative, and beta(10,10) might be a bit too extreme(?)

HinLeung622 commented 4 years ago

@grd349

  1. A run on low mass M67 dwarfs with beta(10,10) distributions, now with all 295 available stars: [image] [image] [image] Again, similar to the results above, much lower age and Y estimates than previous results. A reassuring finding is that the estimation uncertainties on all 4 fundamentals decreased thanks to the larger sample size. So perhaps it really is the case that running only on low mass dwarfs causes lower age and Y estimates?

  2. I went ahead and ran on the dwarfs of NGC188, with beta(10,10) priors, for 445 stars: [image] [image] [image] [image] For reference, this is our previous estimate using subgiants+main sequence: [image] And literature values: [image] First of all, great convergence, but the same situation seems to be happening: the age and Y estimates come out lower than previous estimates and literature values.

  3. I jumped the gun and tried making a binary mixture model myself. Can you check if this is the correct way of implementing it?

    import pymc3 as pm
    import theano.tensor as T
    from astropy import constants
    # m1 (stellar evolution NN), t1 (BC NN), N, Av, Teff_sun, Mbol, dist_mod
    # and the M67 table all come from the surrounding notebook

    model = pm.Model()
    with model:
        # cluster-wide fundamentals, rescaled from beta(10,10) draws
        Age_mu = pm.Deterministic('mean_age',pm.Beta('a',10,10)*2+2.5)
        feh_mu = pm.Deterministic('mean_feh',pm.Beta('e',10,10)*0.4-0.2)
        Y_mu = pm.Deterministic('mean_Y',pm.Beta('f',10,10)*0.04+0.24)
        MLT_mu = pm.Deterministic('mean_MLT',pm.Beta('g',10,10)*0.6+1.7)

        # per-star masses; the fundamentals are broadcast to all N stars (max pool)
        M = pm.Deterministic('mass', pm.Beta('d',10,10,shape=N)*(1.33-0.8)+0.8)
        Age = pm.Deterministic('age',T.ones(N)*Age_mu)
        feh = pm.Deterministic('feh',T.ones(N)*feh_mu)
        Y = pm.Deterministic('Y',T.ones(N)*Y_mu)
        MLT = pm.Deterministic('MLT',T.ones(N)*MLT_mu)

        obs = pm.Deterministic('obs',m1.manualPredict(T.log10([M, Age, 10**feh, Y, MLT])))

        radius = pm.Deterministic('radius', 10**obs[0])
        Teff = pm.Deterministic('Teff', (10**obs[1])*5000)
        # binary treatment: per-star fraction Q mixes single and doubled luminosity
        Q = pm.Uniform('binary_frac', lower=0.0, upper=0.5, shape=N)
        L = pm.Deterministic('L', (1-Q)*(radius**2)*((Teff/Teff_sun)**4)+Q*2*(radius**2)*((Teff/Teff_sun)**4))
        # logg in cgs; the factor 100 converts SI m/s^2 to cm/s^2
        logg = pm.Deterministic('logg', T.log10(100*constants.G.value*(M/radius**2)*(constants.M_sun.value/constants.R_sun.value**2)))
        Av_list = pm.Deterministic('Av', T.ones(N)*Av)

        BCs = pm.Deterministic('BCs', t1.manualPredict(T.as_tensor_variable([T.log10(Teff), logg, feh, Av_list])))

        BCg = pm.Deterministic('BCg', BCs[5,:])
        BCbp = pm.Deterministic('BCbp', BCs[7,:])
        BCrp = pm.Deterministic('BCrp', BCs[8,:])

        true_mG = pm.Deterministic('true_mG', -2.5*T.log10(L)+Mbol-BCg+dist_mod)
        true_Bp_Rp = pm.Deterministic('true_Bp_Rp', BCrp-BCbp)

        obs_mG = pm.Normal('obs_mG', true_mG, M67['g_mag_err']*100, observed=M67['g_mag'])
        obs_Bp_Rp = pm.Normal('obs_Bp_Rp', true_Bp_Rp, M67['Bp_Rp_err'], observed=M67['Bp_Rp'])

The main point is just the Q and L lines: I am setting Q as the binary fraction, with a uniform distribution between 0 and 0.5 and a shape of N stars.

Results: [image] [image] [image] [image] Note: for the last plot, only 100 random stars were picked for the sampling, so not all blue dots were sampled, hence some lack corresponding red dots. So it seems the mixture model is working for binaries. But just as before, the mean age comes out even lower, and MLT higher, than in the previous runs without binaries.

Thoughts?

HinLeung622 commented 4 years ago

@grd349 pymc3 dev replied: https://github.com/pymc-devs/pymc3/issues/4038

grd349 commented 4 years ago

@HinLeung622

Great job - I see the conversation is still ongoing.

I thought that we were working in float64! Rolls eyes. How did you set the precision in the end?

Excellent work - don't forget to thank the pymc3 dev people for their help :)

HinLeung622 commented 4 years ago

@grd349 I set the precision by running this:

import os
# this must run before theano is first imported, or the flag is ignored
os.environ["THEANO_FLAGS"] = "floatX=float64"
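A quick way to confirm it took effect (run after theano is imported):

    import theano
    print(theano.config.floatX)  # should print 'float64' if the flag was picked up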

By the way, are you on vacation this week? If so, does that also mean no meeting tomorrow? Also, could you take a quick look at my implementation of my mixture model for binaries, to see if it makes sense? Thanks

grd349 commented 4 years ago

Super thanks @HinLeung622

I've been unwell (in the spirit of honesty, I finally burnt out after all this lockdown chaos). I think I'm going to say no meeting tomorrow, sorry.

Send me your binary mixture model and I'll look it over!

HinLeung622 commented 4 years ago

@grd349 Oh ok, yeah a break from time to time is always good. Work-life balance, after all. Here is my full model with binaries accounted for as well:

model = pm.Model()
with model:

    Age_mu = pm.Deterministic('mean_age',pm.Beta('a',10,10)*2+2.5)
    feh_mu = pm.Deterministic('mean_feh',pm.Beta('e',10,10)*0.4-0.2)
    Y_mu = pm.Deterministic('mean_Y',pm.Beta('f',10,10)*0.04+0.24)
    MLT_mu = pm.Deterministic('mean_MLT',pm.Beta('g',10,10)*0.6+1.7)

    M = pm.Deterministic('mass', pm.Beta('d',10,10,shape=N)*(1.33-0.8)+0.8)
    Age = pm.Deterministic('age',T.ones(N)*Age_mu)
    feh = pm.Deterministic('feh',T.ones(N)*feh_mu)
    Y = pm.Deterministic('Y',T.ones(N)*Y_mu)
    MLT = pm.Deterministic('MLT',T.ones(N)*MLT_mu)

    obs = pm.Deterministic('obs',m1.manualPredict(T.log10([M, Age, 10**feh, Y, MLT])))

    radius = pm.Deterministic('radius', 10**obs[0])
    Teff = pm.Deterministic('Teff', (10**obs[1])*5000)
    # binary treatment: per-star fraction Q mixes single and doubled luminosity
    Q = pm.Uniform('binary_frac', lower=0.0, upper=0.5, shape=N)
    L = pm.Deterministic('L', (1-Q)*(radius**2)*((Teff/Teff_sun)**4)+Q*2*(radius**2)*((Teff/Teff_sun)**4))
    logg = pm.Deterministic('logg', T.log10(100*constants.G.value*(M/radius**2)*(constants.M_sun.value/constants.R_sun.value**2)))
    Av_list = pm.Deterministic('Av', T.ones(N)*Av)

    BCs = pm.Deterministic('BCs', t1.manualPredict(T.as_tensor_variable([T.log10(Teff), logg, feh, Av_list])))

    BCg = pm.Deterministic('BCg', BCs[5,:])
    BCbp = pm.Deterministic('BCbp', BCs[7,:])
    BCrp = pm.Deterministic('BCrp', BCs[8,:])

    true_mG = pm.Deterministic('true_mG', -2.5*T.log10(L)+Mbol-BCg+dist_mod)
    true_Bp_Rp = pm.Deterministic('true_Bp_Rp', BCrp-BCbp)

    obs_mG = pm.Normal('obs_mG', true_mG, M67['g_mag_err']*100, observed=M67['g_mag'])
    obs_Bp_Rp = pm.Normal('obs_Bp_Rp', true_Bp_Rp, M67['Bp_Rp_err'], observed=M67['Bp_Rp'])
grd349 commented 4 years ago

@HinLeung622

I'm lost - how does this account for binarity? (btw, we should start calling these double stars rather than binaries, as the stars may be foreground/background rather than orbiting each other (but they may be orbiting).)

obs_mG = pm.Normal('obs_mG', true_mG, M67['g_mag_err']*100, observed=M67['g_mag'])

I would say this is where the mixture model part should go.

HinLeung622 commented 4 years ago

@grd349 I did the mixture model part here:

Q = pm.Uniform('binary_frac', lower=0.0, upper=0.5, shape=N)
L = pm.Deterministic('L', (1-Q)*(radius**2)*((Teff/Teff_sun)**4)+Q*2*(radius**2)*((Teff/Teff_sun)**4))

So I did it on luminosity instead of observational g mag.

grd349 commented 4 years ago

Oh - I see. Not quite. My fault - I obviously didn't explain it well.

We want

$P(m_G) = Q \, P(m_{G,\mathrm{single}}) + (1 - Q) \, P(m_{G,\mathrm{double}})$,

where you have

$L = Q \, L + (1 - Q) \cdot 2L$,

We need to operate on the probability distributions rather than the parameters themselves. I think the best thing to do is have you look over the notebook I sent across.
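Schematically, the difference (just a sketch, with q, data and the component distributions as placeholders, not the final model):

    # parameter mixing (what the model above does): every star gets a
    # fractional luminosity boost, which no individual star physically has
    #     L = (1 - Q)*L_single + Q*2*L_single
    # distribution mixing (what we want): each star is either a single or a
    # double, with population weights q and 1 - q
    dist_single = pm.Normal.dist(0.0, 0.01)   # placeholder components
    dist_double = pm.Normal.dist(-0.75, 0.2)
    obs = pm.Mixture('obs', w=[q, 1-q],
                     comp_dists=[dist_single, dist_double],
                     observed=data)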

grd349 commented 4 years ago

I would also suggest working on the apparent magnitude rather than the luminosity. This is more like the real case, as what we really have are incorrect measurements of apparent mag due to binarity. We are of course ignoring the impact of binarity on Bp-Rp, but the way we are setting things up should allow binary stars to be outliers that make little difference to the main results.

HinLeung622 commented 4 years ago

@grd349 Oh ok, sorry I only just saw the emails you sent. I will go through the models and papers and update you when I have something new.

HinLeung622 commented 4 years ago

@grd349 (you can reply another day if you are about to head off right now) I don't think I 100% understood your example, but I took what I understood and made this:

    true_mG = pm.Deterministic('true_mG', -2.5*T.log10(L)+Mbol-BCg+dist_mod)
    true_Bp_Rp = pm.Deterministic('true_Bp_Rp', BCrp-BCbp)

    q = pm.Beta('q', 7, 2, testval=0.7)
    dist_singular = pm.Normal.dist(true_mG, M67['g_mag_err'])
    dist_multiple = pm.Normal.dist(true_mG-0.4, 0.1)

    obs_mG = pm.Mixture('obs_mG', w=[q, 1-q], comp_dists = [dist_singular, dist_multiple], \
                        observed=M67['g_mag'])
    obs_Bp_Rp = pm.Normal('obs_Bp_Rp', true_Bp_Rp, M67['Bp_Rp_err'], observed=M67['Bp_Rp'])

Do you think this makes sense? I am not sure if using true_mG as the mean of the mixture distributions, and the data's observational errors as the sigma of dist_singular, is a good idea. Would this create N mixture distributions (one per star) instead of the single distribution in your example?

Also, I am not quite sure what the sigma value (in this case 0.1) of dist_multiple represents. Is it the variation given to account for the fact that not all double stars have 1:1 brightness? How should I incorporate obs error into those stars?

grd349 commented 4 years ago

Hi @HinLeung622

Assuming this compiles (which would be a little bit of a surprise, but pymc3 always surprises me), then this doesn't look ridiculous :)

I would do the following ...

    true_mG = pm.Deterministic('true_mG', -2.5*T.log10(L)+Mbol-BCg+dist_mod)
    true_Bp_Rp = pm.Deterministic('true_Bp_Rp', BCrp-BCbp)

    q = pm.Beta('q', 7, 2, testval=0.7)
    dist_singular = pm.Normal.dist(0, np.mean(M67['g_mag_err']))
    dist_multiple = pm.Normal.dist(0.75, 0.2) # 0.2 is the width of the outlier pop in mag

    obs_mG = pm.Mixture('obs_mG', w=[q, 1-q], comp_dists = [dist_singular, dist_multiple], \
                        observed=M67['g_mag'] - true_mG)
    obs_Bp_Rp = pm.Normal('obs_Bp_Rp', true_Bp_Rp, M67['Bp_Rp_err'], observed=M67['Bp_Rp'])

0.75 is the correct(ish) value to use here. The width of 0.2 is arbitrary but probably not completely wrong.
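For the record, the 0.75 comes from an equal-brightness double being twice as luminous as a single star:

$\Delta m = 2.5 \log_{10}(2) \approx 0.753\ \mathrm{mag}$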

Does that make sense?

HinLeung622 commented 4 years ago

@grd349 I actually used your current setup first, meaning doing observed=M67['g_mag'] - true_mG and keeping the means of the distributions as plain numbers, but that errored at the compile step with the error length not known: Elemwise{sub,no_inplace} [id A] '', followed by a super long printout of the theano structure of the model. It only compiles when I take the - true_mG out of the observed part of the pm.Mixture line.

grd349 commented 4 years ago

observed=M67['g_mag'].values - true_mG

perhaps - not sure why that doesn't work.

HinLeung622 commented 4 years ago

@grd349 Yeah, I tried googling a bunch, but this seems to be a very specific problem. Ok, I will try that.

HinLeung622 commented 4 years ago

@grd349 Haha, your simple .values fix worked

grd349 commented 4 years ago

:)

HinLeung622 commented 4 years ago

Also, I think the mean value of dist_multiple should be negative, right? Because adding luminosity means reaching lower (brighter) magnitude values.

grd349 commented 4 years ago

Doesn't that depend on how you define M67['g_mag'].values - true_mG?

HinLeung622 commented 4 years ago

@grd349 Yeah, I think it should be negative. The magnitudes of the double stars in M67['g_mag'] are smaller than true_mG, which should sit on the MS. This means M67['g_mag'].values - true_mG will be negative for doubles, so the mean of dist_multiple should be negative, correct?

HinLeung622 commented 4 years ago

@grd349 Results for 100 stars of M67, using the max pool mixture model, with x100 g_mag obs errors: [image] [image] It looks like it is doing rather well. The q value came out suspiciously close to the testval of 0.7. The same situation occurs as in previous dwarfs-only runs, where the age and Y estimates come out much lower than the literature and our report's subgiants+dwarfs values. Here is the CMD of the data and the HBM true_mG + true_Bp_Rp estimates: [image] I guess it looks reasonable? I expected the sequence of HBM true values to sit at the bottom edge of the cluster of data points, but it seems to be in the middle currently. Do you think this is expected behaviour?

Sidenote: since the value of 0.75 g mag difference between the multiples and singles is rather arbitrary (though supported by literature), do you think I could (and should) turn it into a free variable for the HBM to estimate too?

HinLeung622 commented 4 years ago

@grd349 So I did a 100 star max pool mixture model run, with the magnitude difference between the multiples and singles as a free variable 'delta_mG': [image] [image] Not sure if the results make any sense, since delta_mG came out much lower than the number we previously used, which seems to have caused a lower q fraction as well. Maybe the HBM is mistakenly treating some of the more spread-out single stars as multiples? Could this suggest I shouldn't allow delta_mG to be a free variable? [image]

grd349 commented 4 years ago

@HinLeung622

It looks as though we are not quite there yet. I'm gonna suggest something in full expectation that it fails. If it works I'll be stumped as to why.

Let's model the outliers using two distributions (a three dist mixture model). A normal dist centred at 0 mag for all the stars that are inliers (p.s. are the uncertainties you plot using the x100 value?). Another normal at mean ~0.75 and spread ~0.2 mag. A further normal dist centred at 0.2 with spread ~0.1 mag. I cannot justify this other than to suggest that is what the data look like.

The other option is we go full David van Dyke and model each system as a binary, but this will create a lot more parameters. Maybe leave this for now.

Does any of the above make sense?

For the triple mixture, q will need to be drawn from a distribution where all three q values (we used q and 1-q last time) sum to unity.

q = pm.Dirichlet('q', a=np.ones(npop))

where npop is 3 would be the way to do this in pymc3.
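A rough sketch of the wiring (reusing the observed = g_mag - true_mG residual from before, hence the negative means; the exact values are the ones suggested above):

    q = pm.Dirichlet('q', a=np.ones(3))
    dist_inlier = pm.Normal.dist(0, np.mean(M67['g_mag_err']))  # inlier stars
    dist_double = pm.Normal.dist(-0.75, 0.2)                    # equal-ish doubles
    dist_third = pm.Normal.dist(-0.2, 0.1)                      # the extra population
    obs_mG = pm.Mixture('obs_mG', w=q,
                        comp_dists=[dist_inlier, dist_double, dist_third],
                        observed=M67['g_mag'].values - true_mG)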

Wanna have a shot at that?

HinLeung622 commented 4 years ago

@grd349 I will try that. However, why do you say we are not quite there yet? Is it because of the unexpectedly low age and helium estimates? I am trying a full partial pool model right now on the same data with binaries, since I remember the partial pool model giving mean values that agree better with the literature than max pool, back when I was still removing binaries from the data.

grd349 commented 4 years ago

I do worry about the low helium - it's sub-big-bang for the most part, which is concerning.

The second problem is that the lower-mass stars are not well fit by the model. Even for the higher mass stars it looks like the magnitude is being overestimated.

I think both problems can be solved by a better outlier model. (note the unrealistic optimism).

I think the [Fe/H] might be starting to go high as well but this is probably all related.

Don't worry about getting close to the literature age - we've never really trusted this :)

grd349 commented 4 years ago

@HinLeung622 see above

HinLeung622 commented 4 years ago

@grd349 Ah, ok, yeah that makes sense. I forgot to answer your question: yes, the g mag observational uncertainties were already x100 their gaia values.

HinLeung622 commented 4 years ago

@grd349 I think I have got something interesting to show. Remembering that my M67 partial pool models used to give more realistic mean age and Y estimates, I tried putting the full partial pool (with beta(10,10) priors) into the double star mixture model. Here are the results: [image] [image] The mean estimates for age and Y went back up to acceptable regions, and the mean feh went back down to 0.10 levels. The only problem is that q came out super low (~80% of the stars are doubles??) while delta_mG is basically 0. It seems this particular HBM run is unable to handle both the partial pool and the mixture model, so the spreads in the partial pool "accounted for" the more luminous doubles, leaving delta_mG at basically 0. [image] Basically the same thing is seen in this plot. Though curiously, the sequence of red dots (HBM mean guesses) and the MS of the data (blue) don't run parallel to each other. Is this possibly a sign of the stellar evolution code failing at lower masses?

The immediate improvement I can think of is fixing delta_mG to 0.75, forcing the sampler to use the mixture model to "account for" the more luminous doubles. I will also try your 3-distribution mixture suggestion (on a max pool model) next.

HinLeung622 commented 4 years ago

@grd349 An update:

  1. For the partial pool + mixture model setup above, fixing delta_mG to 0.75 did not result in convergence of the chains; I am retrying with lower values.

  2. I tried your max pool + 3 distribution mixture model idea, here are the results: (q[0] = singles, q[1] = -0.75 g mag, q[2] = -0.4 g mag) [image] [image] [image] It was a success in terms of sampling: the three distributions are very distinct. The HBM guesses are also more aligned with the high magnitude edge of the data, which was one of the concerns you raised previously. However, the super low helium and mean age are still present.

Actually, I had a thought: perhaps for some reason (maybe something is wrong with my data reduction process) the current "location" of the M67 single dwarfs in the CMD suits a super low helium fraction OC more than one at the level of ~0.26. Here are the estimates from an older max pool model, where the data fed into it had all the double stars removed: [image] As you can see, the helium and age are still very low with just the main single dwarfs. So perhaps it is a problem with the data/data reduction?

HinLeung622 commented 4 years ago

Oh wait, let me check my Av and extinction values

HinLeung622 commented 4 years ago

@grd349 It turns out that, locally, I have been using the wrong Av value to plot my "expected OC location overlay" plots (the ones below), but on bluebear, where the HBM is actually done, it is correct, so no big errors there. For reference: this is how an OC (with our trained NNs doing both the stellar evolution and Teff -> BCs conversions) looks with fundamentals Age = 3.5 Gyr, M = 0.8-1.2 Msun, feh = 0.1, Y = 0.255, MLT = 2.1 (basically the fundamentals for M67 estimated in our report): [image] And for these fundamentals: Age = 2.5 Gyr, M = 0.8-1.2 Msun, feh = 0.1, Y = 0.24, MLT = 2.1 (basically what our current max pool mixture models are predicting): [image] Both red regions overlap nicely with the blue data dots, meaning it is reasonable in the sampler's logic that both satisfy the data. The only major difference is that the first combination covers a more luminous region than the second, meaning the stars would be estimated to have lower masses than in the second.

grd349 commented 4 years ago

@HinLeung622

This is looking like good progress! Excellent work :)

I'm going to label the two solutions you have, to make this easier. The 3.5 Gyr w/ sensible helium we'll call the standard result. The 2.5 Gyr low helium result we'll call the sub-helium result.

Here are my thoughts on what is causing the sub-helium problem.

When you remove the outliers you get the standard result. When we try to treat the outliers (double stars) we get the sub-helium result. I think the double stars are the problem. So what is the solution?

There are a few possibilities ...

  1. Improve the double star treatment (we could do what van Dyke does, which would take some dev, but it's not difficult).
  2. Continue to treat the double stars as noise but make a better noise model with more free parameters.
  3. Remove the double stars by hand.

I hate the idea of (3). (1) would be the best way to do things, but I worry about the double stars' impact on the colors, not just the magnitudes. (2) is the most straightforward approach. Let's do (2) as a first try.

Here is how I would do it (either you can try this or I can do it and then share it with you, and you can optimise - your choice). We want a better model of the double stars - here is how I would dev that better model.

This would be a fun data-vis explore task so I'm happy to do it and share or do it together (after some prep) or even have you do it with support. Your choice :)

HinLeung622 commented 4 years ago

@grd349 I think the "I do it and you give support" way is possible, but since I have limited knowledge of this, progress will be incredibly slow when I inevitably have to ask a whole bunch of questions. So it seems the most logical choice would be to do it together (possibly in a 1 to 1 session or on a Thursday).

Also, what exactly do you mean by fitting the polynomial function? What are the x and y variables in this case?

grd349 commented 4 years ago

Great - let's do this :)

Can you send me the data for your three clusters in .csv format.

For the polynomial we'll be fitting $y = \sum_{i=0}^{n} m_i x^i$, where x is the color and y is the magnitude.
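As a minimal sketch of that fit with numpy (the 'Bp_Rp' and 'g_mag' column names are assumed to match the .csv, and the degree is a tuning choice):

    import numpy as np
    import pandas as pd

    df = pd.read_csv('NGC_2682_post_dwarfs_binaries.csv')
    x = df['Bp_Rp'].values  # color
    y = df['g_mag'].values  # magnitude

    coeffs = np.polyfit(x, y, deg=3)  # fits y = sum_i m_i * x**i
    ridge = np.polyval(coeffs, x)     # fitted main-sequence ridge line
    resid = y - ridge                 # doubles sit at negative (brighter) residuals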

So Friday sounds like a sensible time to do this (or perhaps tomorrow morning/afternoon)?

HinLeung622 commented 4 years ago

@grd349 Don't you have all your meetings on Friday? I am free all week, so tomorrow morning/afternoon works.

I will upload the .csv data onto the repo when I have prepared the ones for NGC188 and NGC6791 as well. Will ping you again then.

grd349 commented 4 years ago

Could you send me the M67 ones in the meantime?

G


HinLeung622 commented 4 years ago

The data: https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/blob/master/data_collection/NGC_2682/NGC_2682_post_dwarfs_binaries.csv

The stellar evolution NN I am currently using (trial b7 by Harry; inputs = [log10 mass, log10 age, initial_feh, log10 initial_Y, log10 initial_MLT], outputs = [log10 radius, log10 (Teff/5000), log10 delnu, star_feh]): https://drive.google.com/file/d/1dBkKqeOfZsEtS2UvR_1BpdOTOunXv-Ap/view?usp=sharing

The Teff -> BCs NN I am currently using (trained on the MIST grid; inputs = [log10 Teff, logg, feh, Av], outputs = ['Bessell_U','Bessell_B','Bessell_V','Bessell_R','Bessell_I','Gaia_G_MAW','Gaia_BP_MAWb','Gaia_BP_MAWf','Gaia_RP_MAW']): https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/blob/master/Hin's_files/test6.h5

For reference, here is the pymc3 max pool + mixture model:

model = pm.Model()
with model:

    Age_mu = pm.Deterministic('mean_age',pm.Beta('a',10,10)*2+2.5)
    feh_mu = pm.Deterministic('mean_feh',pm.Beta('e',10,10)*0.4-0.2)
    Y_mu = pm.Deterministic('mean_Y',pm.Beta('f',10,10)*0.04+0.24)
    MLT_mu = pm.Deterministic('mean_MLT',pm.Beta('g',10,10)*0.6+1.7)

    M = pm.Deterministic('mass', pm.Beta('d',10,10,shape=N)*(1.33-0.8)+0.8)
    Age = pm.Deterministic('age',T.ones(N)*Age_mu)
    feh = pm.Deterministic('feh',T.ones(N)*feh_mu)
    Y = pm.Deterministic('Y',T.ones(N)*Y_mu)
    MLT = pm.Deterministic('MLT',T.ones(N)*MLT_mu)

    obs = pm.Deterministic('obs',m1.manualPredict(T.log10([M, Age, 10**feh, Y, MLT])))

    radius = pm.Deterministic('radius', 10**obs[0])
    Teff = pm.Deterministic('Teff', (10**obs[1])*5000)
    L = pm.Deterministic('L', (radius**2)*((Teff/Teff_sun)**4))
    logg = pm.Deterministic('logg', T.log10(100*constants.G.value*(M/radius**2)*(constants.M_sun.value/constants.R_sun.value**2)))
    Av_list = pm.Deterministic('Av', T.ones(N)*Av)

    BCs = pm.Deterministic('BCs', t1.manualPredict(T.as_tensor_variable([T.log10(Teff), logg, feh, Av_list])))

    BCg = pm.Deterministic('BCg', BCs[5,:])
    BCbp = pm.Deterministic('BCbp', BCs[7,:])
    BCrp = pm.Deterministic('BCrp', BCs[8,:])

    true_mG = pm.Deterministic('true_mG', -2.5*T.log10(L)+Mbol-BCg+dist_mod)
    true_Bp_Rp = pm.Deterministic('true_Bp_Rp', BCrp-BCbp)

    #q = pm.Beta('q', 7, 2, testval=0.7)
    q = pm.Dirichlet('q', a=np.ones(3))
    #delta_mG = pm.Normal('delta_mG', 0.75, 0.2, testval=0.75)
    dist_singular = pm.Normal.dist(0, M67['g_mag_err']*100)
    dist_multiple = pm.Normal.dist(-0.75, 0.2)
    dist_C = pm.Normal.dist(-0.2, 0.1)

    obs_mG = pm.Mixture('obs_mG', w=q, comp_dists = [dist_singular, dist_multiple, dist_C], \
                        observed=M67['g_mag'].values-true_mG)
    obs_Bp_Rp = pm.Normal('obs_Bp_Rp', true_Bp_Rp, M67['Bp_Rp_err'], observed=M67['Bp_Rp'])

I will include our script of functions used for reading the NN models, and also my current example notebook, later.

HinLeung622 commented 4 years ago

@grd349

grd349 commented 4 years ago

@HinLeung622

Are you using the A_G values? It looks to me like the A_G values for the double stars are a long way from 'correct'.

HinLeung622 commented 4 years ago

@grd349 No, I am not using the A_G values; I am using a uniform Av value for the whole cluster, derived from Green's 2019 dustmap. The distance to the cluster is calculated as a simple weighted mean of all the available distances of the individual member stars (before the sample is trimmed down to only dwarfs). The distances are calculated by Bailer-Jones through a NN process.
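For reference, a minimal sketch of that weighted mean (inverse-variance weighting is my assumption here, and the variable names are hypothetical):

    import numpy as np

    # d = Bailer-Jones distances of the member stars, sigma = their uncertainties
    w = 1.0 / sigma**2                # assumed inverse-variance weights
    dist = np.sum(w * d) / np.sum(w)  # cluster distance estimate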

HinLeung622 commented 4 years ago

@grd349 By the way, NGC6791's dwarfs look like this: [image] There isn't much of a binary band to speak of.

grd349 commented 4 years ago

@HinLeung622

Great! :). That's the way to do it for now. Later we might turn this into a learnt prior but we're good for now! :)

grd349 commented 4 years ago

[image: Hin_data]

[image: bp_double_problem]

Here is the evidence that the Bp is compromised for the double stars. Color is the quoted A_G, which nicely picks out the double stars (which is why the quoted A_G values are wrong). You can see all the double stars lie well off the G_mag vs Bp_mag line, which means the second star is modifying the Bp differently to the G (as I think is expected). We'll carry on treating the double stars as noise rather than formally modelling them like van Dyke.

grd349 commented 4 years ago

That's because the MS is near vertical, so everything is hidden :). There are some double contributions above the SG phase.


HinLeung622 commented 4 years ago

@grd349 Data for NGC188: https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/blob/master/data_collection/NGC_188/NGC_188_post_dwarfs_binaries.csv
Data for NGC6791: https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/blob/master/data_collection/NGC_6791/NGC_6791_post_dwarfs_binaries.csv

Values for the clusters:

M67: Av = 0.160758, dist_mod = 9.660732908839677
NGC188: Av = 0.321516, dist_mod = 11.314471089913617
NGC6791: Av = 0.482274, dist_mod = 12.924371204053639

Here is how I have been doing the HBM calculations: https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/blob/master/Hin's_files/M67HBM_post_testing.ipynb
And the script that holds the functions that handle the NNs: github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/blob/master/neuralStellar2_1.py
Apologies for bad coding practices; do ask if things are too confusing.

So about the G mag vs Bp mag plot above: you are using that to show that binaries not only change the G mags, but also affect Bp and Rp differently? So we are technically not accounting for 100% of the situation when we only apply the mixture model to the g mag? Also, why would the A_G values be much higher for binaries? Due to extinction provided by their companion stars?

About NGC6791: do you mean we still have to use a mixture model to account for the binaries, despite the binaries overlapping completely with the higher mass singles?