Add prior-like weight to likelihood to fit bright stars?

Gabriel-p commented 9 years ago

Currently, the likelihood equations based on binned statistics will sometimes overlook the presence of bright stars with high membership probabilities (MPs).

This is because low mass stars always outnumber bright stars, and also because MPs are not taken into account during the likelihood minimization process (the Tolstoy likelihood does take MPs into account). This can bias the fit towards adjusting the synthetic cluster to the bulk of low mass stars, disregarding the potentially (almost always) more important bright stars. Example (L72):

screenshot from 2015-08-10 12 52 23

I could add a prior-like equation to multiply the likelihood. It should have the form:

p = 1 / (1 + dist_coef)

where p is the prior, and dist_coef is a coefficient obtained by calculating the average distance from the brightest stars (e.g., the brightest half) to the synthetic cluster's stars (or some other measure?).

This way a large dist_coef means p will be small and thus the p*likelihood term will be diminished.

Not sure how statistically correct this approach is though.
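
A minimal sketch of the prior-like factor above, under stated assumptions: stars are given as (magnitude, color) tuples, the "bright" subset is the brightest half, and dist_coef is taken as the mean distance from each bright observed star to its nearest synthetic star. The function name `bright_star_prior` and the `bright_frac` parameter are hypothetical and not part of ASteCA.

```python
import math


def bright_star_prior(obs_stars, synth_stars, bright_frac=0.5):
    """Prior-like factor p = 1 / (1 + dist_coef).

    obs_stars, synth_stars: lists of (magnitude, color) tuples (assumed layout).
    bright_frac: fraction of observed stars treated as "bright" (assumed 0.5).
    """
    if not obs_stars or not synth_stars:
        raise ValueError("both star lists must be non-empty")
    # Smaller magnitude means brighter, so sort ascending and keep the first part.
    n_bright = max(1, int(len(obs_stars) * bright_frac))
    brightest = sorted(obs_stars)[:n_bright]
    # dist_coef: mean distance from each bright star to its nearest synthetic star.
    dist_coef = sum(
        min(math.dist(star, syn) for syn in synth_stars) for star in brightest
    ) / n_bright
    return 1.0 / (1.0 + dist_coef)
```

With this choice, a synthetic cluster that places stars exactly on the bright observed stars gives dist_coef = 0 and p = 1, while one that leaves the bright end empty gives a larger dist_coef and a smaller p.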

Add:

Perhaps instead of a p coefficient like the one above, use the distance to the isochrone. This avoids the issue of better fits for synthetic clusters where the mass is increased artificially to generate more bright stars.

See: http://stackoverflow.com/a/19869154/1391441

Gabriel-p commented 9 years ago

A simpler approach would be to add a weight to stars based on their (main) brightness. This weight should affect both observed and synthetic stars, so that when multiplied by the likelihood brighter stars will contribute more than low mass stars.

For example, the sequence could be divided into three sections S1, S2, S3 according to the maximum and minimum brightness:

step = (min_mag - max_mag) / 3
S1: [max_mag, max_mag + step]
S2: (max_mag + step, max_mag + 2 * step]
S3: (max_mag + 2 * step, max_mag + 3 * step] = (max_mag + 2 * step, min_mag]

The weight of each section could take one of three options, plus a "don't affect the likelihood" option:

    no |  low |  mid | high
S1  1. | 1.   | 1.   | 1.
S2  1. | 0.9  | 0.75 | 0.5
S3  1. | 0.75 | 0.5  | 0.

The no option makes this function not have any effect, the low option makes it have a small effect on the likelihood and the high makes it have a large effect.
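
The sectioning and the weight table above could be sketched as follows. This is only an illustration: `section_weight` and `WEIGHTS` are hypothetical names, and max_mag is assumed to be the magnitude of the brightest star (the numerically smallest value) while min_mag is that of the faintest.

```python
# Weights per section (S1, S2, S3) for each option, copied from the table above.
WEIGHTS = {
    "no": (1.0, 1.0, 1.0),
    "low": (1.0, 0.9, 0.75),
    "mid": (1.0, 0.75, 0.5),
    "high": (1.0, 0.5, 0.0),
}


def section_weight(mag, max_mag, min_mag, option="mid"):
    """Return the weight of a star of magnitude `mag`.

    max_mag: magnitude of the brightest star (assumed numerically smallest).
    min_mag: magnitude of the faintest star.
    """
    s1, s2, s3 = WEIGHTS[option]
    step = (min_mag - max_mag) / 3.0
    if mag <= max_mag + step:  # S1: brightest third
        return s1
    if mag <= max_mag + 2 * step:  # S2: middle third
        return s2
    return s3  # S3: faintest third
```

For a sequence spanning magnitudes 10 to 16 the step is 2, so a star of magnitude 13 falls in S2 and gets weight 0.75 under the mid option.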

Even simpler, each n-dimensional bin could be weighted by the inverse mean magnitude (using the main magnitude) in the Dolphin likelihood equation:

sum [(1/V_i) * (m_i - n_i + n_i * ln(n_i/m_i))]
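
A sketch of how this weighted sum could be evaluated, assuming n_i is the number of observed stars in bin i, m_i the number of synthetic stars, and V_i the mean main magnitude of the bin. The function name is hypothetical. The n_i * ln(n_i/m_i) term is taken as zero when n_i = 0 (its limit), and the sketch assumes m_i > 0 wherever n_i > 0 and V_i > 0 for every bin.

```python
import math


def weighted_dolphin(n_obs, m_syn, v_mean):
    """Sum over bins of (1/V_i) * (m_i - n_i + n_i * ln(n_i / m_i)).

    n_obs: observed counts per bin; m_syn: synthetic counts per bin;
    v_mean: mean main magnitude per bin (assumed positive).
    """
    total = 0.0
    for n, m, v in zip(n_obs, m_syn, v_mean):
        # n * ln(n / m) tends to 0 as n -> 0, so empty observed bins skip it.
        log_term = n * math.log(n / m) if n > 0 else 0.0
        total += (m - n + log_term) / v
    return total
```

Since brighter bins have smaller V_i, their terms are multiplied by a larger factor and a mismatch there costs more than the same mismatch in a faint bin.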

Gabriel-p commented 8 years ago

This could also resolve bad fits like this one:

screenshot from 2016-03-29 15 07 54

where the Dolphin likelihood chooses a very bad fit with a young synthetic cluster (SC) because there are no low mass stars left in the OC after the DA. Although this SC is a very bad fit, it gives a lower likelihood (L_min~585) than a better by-eye fit (L_min~878) like the one below:

screenshot from 2016-03-29 15 19 56

This is because the likelihood is obtained as:

L = M - \sum[n_i * log(m_i)]

which means that a bin with no observed stars (n_i=0) but with a lot of synthetic stars (m_i=a lot) has the effect of increasing L through M. Thus, the GA will try to avoid putting anything in the low mass portion of the CMD: since n_i is zero there, any synthetic star in those bins only increases the (inverted) likelihood.
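
A small numeric illustration of this effect, with made-up bin counts (not taken from the fits in the screenshots) and a hypothetical function name. Bins with n_i = 0 contribute nothing to the sum, so synthetic stars placed there enter only through M.

```python
import math


def inverted_likelihood(n_obs, m_syn):
    """L = M - sum(n_i * log(m_i)), with M the total number of synthetic stars.

    Bins with n_i = 0 are skipped in the sum since their term is zero.
    """
    big_m = sum(m_syn)
    return big_m - sum(n * math.log(m) for n, m in zip(n_obs, m_syn) if n > 0)


# Three bins; the last one (low mass region) has no observed stars.
n_obs = [5, 3, 0]
l_empty = inverted_likelihood(n_obs, [5, 3, 0])   # nothing in the empty bin
l_filled = inverted_likelihood(n_obs, [5, 3, 20])  # 20 synthetic stars there
print(l_filled - l_empty)
```

The two values differ by the 20 synthetic stars added to the empty bin, so a minimizer is pushed towards models that leave that region unpopulated.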

Gabriel-p commented 7 years ago

A simpler and more reasonable approach seems to be this one:

Don't use brightness, use MPs
Weight each bin in the likelihood equation according to the MPs of the stars within it. Two approaches could be used here:
1. Use the mean/median MP of all stars in the bins
2. Use the largest MP value found within the bin as the weight
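
Both options could be sketched as below, assuming each star already carries the index of the bin it falls in. The function name `bin_weights`, the `mode` parameter, and the input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def bin_weights(bin_ids, mps, mode="mean"):
    """Return {bin_id: weight} computed from the MPs of the stars in each bin.

    bin_ids: bin index of each star (assumed precomputed); mps: its MP.
    mode: "mean" or "median" (approach 1), or "max" (approach 2).
    """
    reducer = {"mean": mean, "median": median, "max": max}[mode]
    grouped = defaultdict(list)
    for b, mp in zip(bin_ids, mps):
        grouped[b].append(mp)
    return {b: reducer(values) for b, values in grouped.items()}
```

Bins with no observed stars do not appear in the returned dictionary, so how to weight them is left open here, as it is in the proposal.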

Gabriel-p commented 7 years ago

Wrongly closed here: https://github.com/asteca/ASteCA/commit/c8dc03c85ee1b2dcb19b49814ea914181f71caba

asteca / ASteCA

Add prior-like weight to likelihood to fit bright stars? #216