Closed HinLeung622 closed 4 years ago
That age range looks a bit narrow now. Should it be beta(X, X) * 3 + 2 ?
@grd349 Latest result (with the implementation of everything above, tuning = 2000, sample = 500): As expected, shrinking the multiple of the obs G mag error would cause the HBM to have a harder sampling time. Samplers have not converged. Should I increase the tuning steps or should we loosen the multiples on the obs errors a bit? (Do note the much larger obs errors (black bars) on Bp-Rp compared to G mag. (the bars are already multiplied by the current multiple:10))
@grd349 You think the age prior should be wider? why?
Oh also, the above result means directly setting the singular distributions to have sigmas of an array works
As best my eyes can tell everything else looks good.
Great work! :)
That hand drawn PGM is epic. I love it! It's virtually art :)
@grd349 Thanks. Could you take a look at the latest result above?
@grd349 I am happy to say that I believe I found a sensible solution to the mess.
The last time I talked to you, we were using the three bands individually to be constrained by gaia data. This in general resulted in a problem: Our important observable Bp-Rp is given much more wiggle room than before due to the enlarged obs errors on Bp and Rp. This problem persisted even after I removed the "multiples of obs error" setting and just gave each band a free-varying obs error value. My solution here, is to use 2 bands individually (G and Bp), plus Bp-Rp, where Bp-Rp is calculated directly from the different BCs and is not affected at all by the noise models. I believe this reintroduces the stronger "focus" on getting the Bp-Rp value close to the observed values for the HBM, while the added Bp band helps with lifting helium and age estimates and pressing down the low mass end of the HBM guess so it aligns with the single star sequence.
Enough talking, here are the results With all stars in M67 sample, max pool for all 4 fundamentals (I dropped MLT and Av partial pool, and fixed mod to simplify the model for debugging), 2000 tuning steps: Estimated Age and initial Y are sensible, Rhats are not terribly high, q and delta_mG values are sensible. Note that mag_err[0]=esimated G mag obs err and mag_err[1]=estimated Bp mag obs err, ie the spread on the singles distribution in the noise model. These numbers will go down when I reintroduce partial pooling in MLT and Av. The estimated populations in the last 2 figures do roughly overlap with the data's, so it looks promising.
I will proceed to reintroduce Av and dist mod, then partial pool MLT.
@grd349 An update on the situation since I have tried everything I have in mind to improve on the sampling results.
So, thoughts?
P.S. I have yet to contact Oli since the current state of the HBM results is probably still not ready for yet another dimension to be introduced into it.
@grd349 Results for the model we discussed yesterday:
The model: fixed age at 4.0 Gyr, fixed initial feh at 0.05, fixed initial Y at 0.255, free max pool MLT, fixed dist mod, partial pool Av, fixed delta_mG values, fixed widths for the two binary tracks, using the results from your noise model. (I did not fix q since I am not sure how to give a Dirichlet fixed list)
@grd349 Results for fixed max pool all fundamentals, partial pool Av with fixed mean, fixed dist_mod and fixed noise model: (meaning only mass and spread Av are allowed to vary) This is quite an absolute disaster. I suspect it might have something to do with the very small observational errors on the observed mags we gave the sampler (0.001 for all three bands). Note: the third plot (CMD with both data and HBM guess) uses 0.001 mags to plot its error bars on the blue dots. They are not visible in this plot.
Another thing to note is that the estimated mean spread Av resulted to be 0.07, which is quite far from the peak of the original prior of pm.Lognormal('spread_A_V', T.log(0.01), 0.1)
.
I will try reruning the HBM but with a higher obs error for the magnitudes, maybe something like 0.01, but I doubt it will help much, if any.
In any case, I will still leave my NNs here and their structures:
my manual NN prediction code(This NN has a batch normalization layer right after the input layer, so I thought I will share how I do that in theano code):
def getWeights(self):
"""
pass self.weights the NN's weights and calculate number of hidden layers
"""
weights = self.model.get_weights()
if 'batch_normalization' in self.model.layers[1].get_config()['name']:
self.bn_weights = weights[:4]
self.dense_weights = weights[4:]
else:
self.bn_weights = None
self.dense_weights = weights
self.no_hidden_dense_layers = len(self.dense_weights)/2-1
def manualPredict(self, inputs):
"""
"Manual calculation" of a NN done with theano tensors, for pymc3 to use
"""
xx = T.transpose(inputs)
if self.bn_weights is not None:
xx=T.nnet.bn.batch_normalization_test(xx,*self.bn_weights,epsilon=0.001)
for i in np.arange(self.no_hidden_dense_layers)*2:
i=int(i)
xx=T.nnet.elu(pm.math.dot(xx,self.dense_weights[i])+self.dense_weights[i+1])
xx=(T.dot(xx,self.dense_weights[-2])+self.dense_weights[-1])
return xx.T
Do ask if there is anything not clear about my notations
Oh boy. Can you try one last run ...
Can you set all the extinction values to 0.11 (fixed and no spread). If it can do that then ... Well who knows.
I'll try and use your Nn's next week to run some checks.
On Sat, 5 Sep 2020, 10:14 HinLeung622, notifications@github.com wrote:
@grd349 https://github.com/grd349 Results for fixed max pool all fundamentals, partial pool Av with fixed mean, fixed dist_mod and fixed noise model: (meaning only mass and spread Av are allowed to vary) [image: image] https://user-images.githubusercontent.com/56083013/92301478-cd7ee600-ef96-11ea-95e7-b765458637a8.png [image: image] https://user-images.githubusercontent.com/56083013/92301480-d079d680-ef96-11ea-983b-894ba9f4e150.png [image: image] https://user-images.githubusercontent.com/56083013/92301532-349c9a80-ef97-11ea-8cb4-2ae91d0e76d0.png [image: image] https://user-images.githubusercontent.com/56083013/92301489-d8d21180-ef96-11ea-97db-5f0207eaaa07.png This is quite an absolute disaster. I suspect it might have something to do with the very small observational errors on the observed mags we gave the sampler (0.001 for all three bands). Note: the third plot (CMD with both data and HBM guess) uses 0.001 mags to plot its error bars on the blue dots. They are not visible in this plot.
Another thing to note is that the estimated mean spread Av resulted to be 0.07, which is quite far from the peak of the original prior of pm.Lognormal('spread_A_V', T.log(0.01), 0.1).
I will try reruning the HBM but with a higher obs error for the magnitudes, maybe something like 0.01, but I doubt it will help much, if any.
In any case, I will still leave my NNs here and their structures:
- The fundamental to Teff/luminosity/delnu NN: https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/blob/master/Hin's_files/b7_best_model.h5 Input cols = [log10 M, log10 star age, initial feh, log10 initial Y, log10 initial alpha_MLT] output cols = [log10 R, log10 (Teff/5000), log10 delnu, star feh]
my manual NN prediction code(This NN has a batch normalization layer right after the input layer, so I thought I will share how I do that in theano code):
def getWeights(self): """ pass self.weights the NN's weights and calculate number of hidden layers """ weights = self.model.get_weights() if 'batch_normalization' in self.model.layers[1].get_config()['name']: self.bn_weights = weights[:4] self.dense_weights = weights[4:] else: self.bn_weights = None self.dense_weights = weights self.no_hidden_dense_layers = len(self.dense_weights)/2-1 def manualPredict(self, inputs): """ "Manual calculation" of a NN done with theano tensors, for pymc3 to use """ xx = T.transpose(inputs) if self.bn_weights is not None: xx=T.nnet.bn.batch_normalization_test(xx,*self.bn_weights,epsilon=0.001) for i in np.arange(self.no_hidden_dense_layers)*2: i=int(i) xx=T.nnet.elu(pm.math.dot(xx,self.dense_weights[i])+self.dense_weights[i+1]) xx=(T.dot(xx,self.dense_weights[-2])+self.dense_weights[-1]) return xx.T
- The Teff/logg/star feh/Av to BCs NN: https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/blob/master/Hin's_files/test6.h5 Input cols = [log10 Teff, logg, star feh, Av] output cols = ['Bessell_U','Bessell_B','Bessell_V','Bessell_R','Bessell_I','Gaia_G_MAW','Gaia_BP_MAWb','Gaia_BP_MAWf','Gaia_RP_MAW'] according to the naming in MIST grid. I have been choosing "Gaia_BP_MAWf" over "Gaia_BP_MAWb" for the BC for BP band so far. Note that since the surface feh values outputted by the first NN are not accurate under this diffusion-turned-on grid, I am using initial feh in place of star feh as the input in for this NN in the HBM. The theano NN prediction code is identical to the previous NN.
Do ask if there is anything not clear about my notations
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/issues/47#issuecomment-687576656, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADQZOCPE6HYSHLOXB7HCAP3SEH6QFANCNFSM4QFXMLIA .
@grd349 Results for fixing Av as well (now the only thing that varies is mass): So the same problem as before occurred: the sampler does not give solutions up to the higher mass regions. I solved this previously by changing one of the bands into Bp-Rp where observed values are given.
Thanks @HinLeung622
It looks like there is a significant problem with the higher masses here (any chance this is the prior? is it still Beta(10, 10)?)
We need to fix the mass issue (or perhaps mass is just where the problem is obvious) before we can go back to unfixing all the parameters that are now hard coded.
To make things easier (perhaps) it might be worth letting fly on two different runs. 1 with a cut BP-RP < 0.85 and one with 0.95 < BP-RP < 1.05. Does that make sense?
@grd349 Yes the prior is still in beta(10,10)
So you are suggesting to cut the data into two sections, one medium mass and one high mass? What are you hoping to see from these runs though?
Forget the cut for a second - we'll try that if the following doesn't work.
Can you set the mass prior to beta(1.1, 1.1) and run again? I think that might be the problem?
I think that might be the problem?
Rather I think that might be a problem.
@grd349 Makes sense, I have not thought about the mass prior affecting the sampler's results before. I will redo it with beta(1.1,1.1) then
@grd349 results with beta(1.1,1.1) prior: So yes, the prior was part of the problem, though the red dots still don't seem to go all the way up to the high mass limit
@grd349 I tried relaxing the fundamentals' max pool inputs into non-fixed values: An absolute trainwreck
@HinLeung622 Yeah there is something wrong. I'm not sure what it is. I think the least tested section of the model is the 2nd neural network. Do you think you are able to code up the Gaia color corrections and extinctions using the functions given in the Bossini paper?
@grd349 Ok I will do that
@grd349 So the current model looks like this: I put 0 into the Av slot so the output BCs should not be affected by any extinction. I then use the coefficients in D Bossini's paper to convert Av into Ag, Abp and Arp, which are then deducted from the magnitude calculation equation:
#second NN calculation
BCs = pm.Deterministic('BCs', t1.manualPredict(T.as_tensor_variable([T.log10(Teff), logg, feh, T.ones(N)*0])))
#BCs
BCg = pm.Deterministic('BCg', BCs[5,:])
BCbp = pm.Deterministic('BCbp', BCs[7,:])
BCrp = pm.Deterministic('BCrp', BCs[8,:])
true_Bp_Rp = BCrp-BCbp
#distance and parallax
dist_mod_ = 9.726
#true observables
true_mG = pm.Deterministic('true_mG', -2.5*T.log10(L)+Mbol-BCg+dist_mod_-getExtinction(A_V, c_G, true_Bp_Rp))
true_Bp = pm.Deterministic('true_Bp', -2.5*T.log10(L)+Mbol-BCbp+dist_mod_-getExtinction(A_V, c_Bp, true_Bp_Rp))
true_Rp = pm.Deterministic('true_Rp', -2.5*T.log10(L)+Mbol-BCrp+dist_mod_-getExtinction(A_V, c_Rp, true_Bp_Rp))
Note that I am using extinction-corrected Bp-Rp as inputs to the extinction conversion. I am uncertain is this correct or should I have to correct it for extinction first as well.
Results: I don't see a major difference between the results of this method and the NN incorporated one. Though the way that the HBM guesses seem to bend counter-clockwise when compared to the singles track does have some similarities to the older comparison plot I made between the two extinction calculation methods in issue #46 (figure 1, the blue track (same extinction calculation as what is implemented in this HBM run) looks to be rotated counter-clockwise compared to the red track (the old way of calculating extinction entirely through the MIST NN)). Not sure what to make about this.
@grd349 This is a pdf of a student's t distribution with sigma = g_mag_err*100, mu=0, nu=1.0: As you can see, even at x100 error and nu=1.0, the distribution is still very narrow. It doesn't look that much different from a normal distribution. (the target width of the tails should be roughly at 0.7 mags since that is high much higher in mags the higher binary track is) I certainly could do a higher multiple, eg. x1000: But x1000 starts getting unrealistic as an observational uncertainty for the Gaia bands. Any thoughts?
Nu = 0.5 or 0.1?
On Wed, 16 Sep 2020, 16:56 HinLeung622, notifications@github.com wrote:
@grd349 https://github.com/grd349 This is a pdf of a student's t distribution with sigma = g_mag_err*100, mu=0, nu=1.0: [image: image] https://user-images.githubusercontent.com/56083013/93361670-b504aa80-f877-11ea-8dae-5d95e18bb1e7.png As you can see, even at x100 error and nu=1.0, the distribution is still very narrow. It doesn't look that much different from a normal distribution. (the target width of the tails should be roughly at 0.7 mags since that is high much higher in mags the higher binary track is) I certainly could do a higher multiple, eg. x1000: [image: image] https://user-images.githubusercontent.com/56083013/93362043-1a589b80-f878-11ea-814b-7015e2f97bdc.png But x1000 starts getting unrealistic as an observational uncertainty for the Gaia bands. Any thoughts?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/issues/47#issuecomment-693500200, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADQZOCIA6KOMI536CWXNULLSGDNZ5ANCNFSM4QFXMLIA .
@grd349 reading about student's T distribution, i believe nu represents the degree of freedom, so in theory it cannot go under 1? Or should I only think about that in a case where there is physical correlation, and this is not such a case.
Nu > 0 is the only constraint! :)
On Wed, 16 Sep 2020, 17:15 HinLeung622, notifications@github.com wrote:
@grd349 https://github.com/grd349 reading about student's T distribution, i believe nu represents the degree of freedom, so in theory it cannot go under 1? Or should I only think about that in a case where there is physical correlation, and this is not such a case.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/issues/47#issuecomment-693511535, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADQZOCMVYVXZDGOOYAEUU33SGDQBHANCNFSM4QFXMLIA .
@grd349 pdfs of student's t distributions with various nu values:
The same pdfs but the y axis is y/max(y) Yeah I still don't think it will help the sampler doing the binaries. Am I doing something wrong here?
@grd349 By the way, results for the binary-removed data, with everything fixed, max pool, and only mass left to vary: Rhats are basically perfect, sampler finished sampling in a mere 37 minutes (compared to the 4+ hours in the runs with noise models) So this basically confirms that it was the noise model that was creating the problem.
Why is the red posterior predictions offset from the data?
On Wed, 16 Sep 2020, 17:55 HinLeung622, notifications@github.com wrote:
@grd349 https://github.com/grd349 By the way, results for the binary-removed data, with everything fixed, max pool, and only mass left to vary: [image: image] https://user-images.githubusercontent.com/56083013/93368208-3c561c00-f880-11ea-8592-732d26cb02c1.png Rhats are basically perfect, sampler finished sampling in a mere 37 minutes (compared to the 4+ hours in the runs with noise models) [image: image] https://user-images.githubusercontent.com/56083013/93368364-79baa980-f880-11ea-937d-9e74bcb19097.png So this basically confirms that it was the noise model that was creating the problem.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/issues/47#issuecomment-693533854, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADQZOCP6F62D4Q4AIN7F3ETSGDUZ3ANCNFSM4QFXMLIA .
@grd349 Probably because of the fixed values I gave it, this is how a constructed cluster with the values i gave it (age = 4.0, feh = 0.05, Yinit = 0.255, MLT = 2.1, dist_mod = 9.72, Av = 0.11)
@grd349 In any case, I went ahead with nu = 1 and x10 observational mag errors, with fixing and max pooling everything except mass: Rhats look good, and the result looks kind of sensible. Though one would expect some of the red dots to appear closer to the binaries under the student's t distribution, as a break away from the MS.
However, something interesting is to be seen in plotting the distributions:
Maybe this is happening since i gave all three bands student's t distributions, and due to the (about 5x) wider gaia observational error given in the data for Bp and Rp bands, the sampler is guided towards modelling the binaries in the Bp-Rp axis. Thoughts?
Also, this run took an astronomical amount of time to sample, about 23 hours, which is 40x the time needed without the binary stars and with a basically identical model.
@HinLeung622 Excellent work! So now we know that the noise model is the problem. There are a number of interesting points that you make!
So let's think about what minimum required performance of the noise model is ...
The noise model should be sensitive to the single stars. It should be insensitive to the binary stars. It should have some way to transitioning from the single stars to the binary stars where Delta_G = 0.4ish. How about the following ...
Let's only consider the binary effect in the g band! What if we student-t the g_mag only. Set nu to be 0.1 or similar. Multiply the Gaia G-band uncertainty by 10 times. If you now let mass and distance (distance mod?) vary and see how you go? Does that sound sensible?
@grd349 Thanks. That makes sense. Currently I am doing Bp and Rp bands separately. Should I revert back to calculating the true_Bp_Rp before using that to be constrainted by the data's Bp-Rp? (So it will be 1 band and 1 colour) Or, should I keep going with the 3 bands, but use normal distributions for Bp and Rp, and not enlarge their obs errors?
Also, if you look at the two plots of different nu values of student t distributions in the reply 2 days ago, it seems that the distribution only gets more "sharp" when nu is further decreased from nu=1 downwards. I am slightly sceptical of the effects of lowering nu to 0.1.
@HinLeung622 - What would be the extra overhead for including more bands in our analysis? I thinking particularly the HJK bands from 2MASS. Did you already train your neural network on these bands?
@grd349 Not sure about the overheads, perhaps it will be linear at worst? I have not trained the NN on the 2MASS bands, but doing so will likely only take one day.
Currently I am doing Bp and Rp bands separately. Should I revert back to calculating the true_Bp_Rp before using that to be constrainted by the data's Bp-Rp?
My feeling is that we should be treating the bands separately and perhaps even adding more bands but maybe this is too much additional work!
OK - don't worry about the additional bands - I'll things through about additional bands. Maybe that's for when we have pymc4 working and the run time drops below an hour :)
@grd349
My feeling is that we should be treating the bands separately and perhaps even adding more bands but maybe this is too much additional work!
I am not concerned about the work per se, but rather the possibility that doing one band one colour would help the sampler to behave and only model for the binaries in g band.
Your the boss @HinLeung622 - I trust you. You choose :)
@grd349 Ok, here is what I will do: max pool and fix everything except mass and dist mod student T's distribution with sigma = data obs err x10 on g band only, and normal distributions on Bp and Rp bands, so it will be 3 bands instead of one band one colour, for now. If I don't get promising results from doing 3 bands separately, I will return to one band one colour since that showed slightly more promise in my memory previously.
As for what nu value to use in the student's T distribution, can you take a look at the two plots of different nu values of student t distributions I sent in the reply 2 days ago? it seems that the distribution only gets more "sharp" (suppressed tails, in the second plot) when nu is further decreased from nu=1 downwards. I am slightly sceptical of the effects of lowering nu to 0.1.
Hope all of this is sensible decisions.
Agree - stick with nu=1as I tried sampling with 0.1 and got very poor results.
On Fri, 18 Sep 2020, 10:31 HinLeung622, notifications@github.com wrote:
@grd349 https://github.com/grd349 Ok, here is what I will do: max pool and fix everything except mass and dist mod student T's distribution with sigma = data obs err x10 on g band only, and normal distributions on Bp and Rp bands, so it will be 3 bands instead of one band one colour, for now. If I don't get promising results from doing 3 bands separately, I will return to one band one colour since that showed slightly more promise in my memory previously.
As for what nu value to use in the student's T distribution, can you take a look at the two plots of different nu values of student t distributions I sent in the reply 2 days ago? it seems that the distribution only gets more "sharp" (suppressed tails, in the second plot) when nu is further decreased from nu=1 downwards. I am slightly sceptical of the effects of lowering nu to 0.1.
Hope all of this is sensible decisions.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/issues/47#issuecomment-694762862, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADQZOCOCA37FZZ2DSP4TL3DSGMSG7ANCNFSM4QFXMLIA .
@grd349 Ok, then this is and update to the next run's plan: -max pool and fix everything except mass and dist mod -on g band, student's t distribution, nu=1, sigma = 10*data_obs_g_mag_err -on bp and rp bands, normal distributions with no multiplication on the data errors -using the last mass-only run as start points for the masses in this run -I have shifted the cluster-wide fundamentals a bit so the OC it predicts better agree with the data: feh 0.05->0.1, Av 0.11->0.15. (Note: increasing helium to solar-level only pushes the predicted track left (lower bp-rp) and further away from the data)
@grd349 Random thought: since we are having troubles to get a student's t distribution to be wide enough to include the binaries' higher g mags, would combining two normal distributions, one with a small sigma and one with a much much wider sigma but with lower weight be a choice? Though now that I think about it, it sounds just like a simplified version of the noise model we had before.
I think the mixture of two Guassians would give you more flexibility - try both? You have to be careful when you setup the mixture of two Guassians though.
Dr Guy R. Davies PI - ERC CartographY Project Senior Lecturer in Astrophysics School of Physics and Astronomy The University of Birmingham Edgbaston Birmingham B15 2TT
Tel +44 (0) 121 414 4597 G.R.Davies@bham.ac.uk grd349@gmail.com davies@bison.ph.bham.ac.u davies@bison.ph.bham.ac.ukk davies@bison.ph.bham.ac.uk
The contents of this e-mail may be privileged and are confidential. It may not be disclosed, used, or copied in any way by anyone other than the addressee. If received in error please notify the sender then delete it from your system. Should you communicate with the sender by e-mail, you consent to The University of Birmingham monitoring and reading any such correspondence.
On Fri, 18 Sep 2020 at 10:48, HinLeung622 notifications@github.com wrote:
@grd349 https://github.com/grd349 Random thought: since we are having troubles to get a student's t distribution to be wide enough to include the binaries' higher g mags, would combining two normal distributions, one with a small sigma and one with a much much wider sigma but with lower weight be a choice? Though now that I think about it, it sounds just like a simplified version of the noise model we had before.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/issues/47#issuecomment-694770883, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADQZOCOKCBVZ5P2JZPHCPYLSGMUGNANCNFSM4QFXMLIA .
@grd349 After a lot of testing during the weekend, I would say this looks pretty promising:
The student's t distribution based model took more than 48 hours to sample for 2000 tuning steps and ran out of time. I don't think that is a viable path to continue on.
However, I have achieved (limited so far) success with a double overlaping gaussian distribution model, one sharp and centred at 0 to account for the MS and one much flatter to account for the binaries.
The model:
#all fundamentals max pool and priors fixed, this only shows after the first NN
#extinction prior
A_V_mu = 0.15
A_V_std = pm.Lognormal('spread_A_V', T.log(0.01), 0.1) # Different X & Y but still sensible
A_V_ = pm.Normal('A_V_', 0, 1, shape=N) # an N(0, 1) dist to be scaled
A_V = pm.Deterministic('A_V', A_V_ * A_V_std + A_V_mu)
#second NN calculation
BCs = pm.Deterministic('BCs', t1.manualPredict(T.as_tensor_variable([T.log10(Teff), logg, feh, A_V])))
#BCs
BCg = pm.Deterministic('BCg', BCs[5,:])
BCbp = pm.Deterministic('BCbp', BCs[7,:])
BCrp = pm.Deterministic('BCrp', BCs[8,:])
#distance and parallax
dist_mod_ = pm.Normal('dist_mod', dist_mod, 1.0)
true_parallax = pm.Deterministic('true_parallax', 10**(-dist_mod_/5-1)*1000)
obs_parallax = pm.Normal('obs_parallax', true_parallax, M67_parallax[1], observed=M67_parallax[0])
#true observables
true_mG = pm.Deterministic('true_mG', -2.5*T.log10(L)+Mbol-BCg+dist_mod_)
true_Bp_Rp = pm.Deterministic('true_Bp_Rp', BCrp - BCbp)
#dealing with multiples
q = pm.Beta('q', 1.1,1.1, testval=0.8)
delta_mG = pm.Lognormal('delta_mG', T.log(0.3), 0.4)
f = pm.Lognormal('f', T.log(10),0.7)
sigma_multiple = pm.Lognormal('sigma_multiple', T.log(0.3), 0.3)
dist_singular = pm.Normal.dist(0, np.mean(M67['g_mag_err'])*f)
dist_multiple = pm.Normal.dist(-delta_mG, sigma_multiple)
#obs observables
obs_mG = pm.Mixture('obs_mG', w=[q, 1-q], comp_dists = [dist_singular, dist_multiple], \
observed=M67['g_mag'].values - true_mG)
obs_Bp_Rp = pm.Normal('obs_Bp_Rp', mu=true_Bp_Rp, sigma=M67['Bp_Rp_err'], observed=M67['Bp_Rp'])
q is the fraction of single stars, delta_mG is how much to shift the G mag of the flat distribution (dist_multiple) from centralized at 0, f is the multiple to observational g mag error and sigma_multiple is the width of the flat distribution. This current edition has a partial pool Av but the mean Av is still fixed.
Next step is to let mean Av vary as well, hope it reaches similarly good results.
@HinLeung622 - Perfect! Great work on the two Gaussians for the noise model. That looks much much better and presumably works much better than the studentsT dist.
I think you have something there - what parameter are you going to free up next? I think age could be next, then metallicity, then helium. Does that sound sensible?
@grd349 I freed up the mean Av first, it is running right now. Also, intuitively, I thought i would do it in an order that is opposite to yours: MLT -> helium -> feh -> age, intuitively, ordered from smallest to largest effect each parameter has on the observables. Are you thinking of the opposite? (ie from large to small)
I was thinking the sampler would prefer large to small but I’m not convinced it matters too much. It’s possible that the A_v (differential) will be confused with the single star noise spread which might make things harder - but I’ve been wrong about these things before :)
G
On Mon, 21 Sep 2020 at 12:12, HinLeung622 notifications@github.com wrote:
@grd349 https://github.com/grd349
I freed up the mean Av first, it is running right now. Also, intuitively, I thought i would do it in an order that is opposite to yours: MLT -> helium -> feh -> age, intuitively, ordered from smallest to largest effect each parameter has on the observables. Are you thinking of the opposite? (ie from large to small)
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/issues/47#issuecomment-696049146, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADQZOCIYANEE2ELQ56XU5IDSG4YJVANCNFSM4QFXMLIA .
-- Dr Guy R. Davies PI - ERC CartographY Project Senior Lecturer in Astrophysics School of Physics and Astronomy The University of Birmingham Edgbaston Birmingham B15 2TT
Tel +44 (0) 121 414 4597 G.R.Davies@bham.ac.uk grd349@gmail.com davies@bison.ph.bham.ac.u davies@bison.ph.bham.ac.ukk davies@bison.ph.bham.ac.uk
The contents of this e-mail may be privileged and are confidential. It may not be disclosed, used, or copied in any way by anyone other than the addressee. If received in error please notify the sender then delete it from your system. Should you communicate with the sender by e-mail, you consent to The University of Birmingham monitoring and reading any such correspondence.
@grd349 If you mean the observational errors in individual stars when you say "single star noise spread" above, then that is exactly what I believe happened with the previous, more complex, noise model.
I agree that the sampler would have a better time sampling large-changing-parameters, but on the other hand it might be hard to tune the smallest-changing-parameter when everything has been allowed to vary already. I don't know, just my intuitive thought.
See what works - you never really know with this stuff till you have tried it.
On Mon, 21 Sep 2020 at 12:20, HinLeung622 notifications@github.com wrote:
@grd349 https://github.com/grd349
If you mean the observational errors in individual stars when you say "single star noise spread" above, then that is exactly what I believe happened with the previous, more complex, noise model.
I agree that the sampler would have a better time sampling large-changing-parameters, but on the other hand it might be hard to tune the smallest-changing-parameter when everything has been allowed to vary already. I don't know, just my intuitive thought.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Harry-Westwood/Y4-Project-InterNeuralStellar/issues/47#issuecomment-696052368, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADQZOCJSVMZZTD5V2A6DS4TSG4ZGRANCNFSM4QFXMLIA .
-- Dr Guy R. Davies PI - ERC CartographY Project Senior Lecturer in Astrophysics School of Physics and Astronomy The University of Birmingham Edgbaston Birmingham B15 2TT
Tel +44 (0) 121 414 4597 G.R.Davies@bham.ac.uk grd349@gmail.com davies@bison.ph.bham.ac.u davies@bison.ph.bham.ac.ukk davies@bison.ph.bham.ac.uk
The contents of this e-mail may be privileged and are confidential. It may not be disclosed, used, or copied in any way by anyone other than the addressee. If received in error please notify the sender then delete it from your system. Should you communicate with the sender by e-mail, you consent to The University of Birmingham monitoring and reading any such correspondence.
@grd349 This is a continuation of issue #45. A new thread is made since the last one is getting ridiculously long and it takes ages to load in all the images.
State of the HBM: data = M67 low mass dwarfs with binaries/double stars included (total = 354 stars)
I tried making a hand-drawn detailed PGM of the current state of the HBM (pencil = NN conversions)
full model:
Currently taking 10x original GAIA magnitude errors as our obs errors