Closed Gabriel-p closed 7 years ago
A simpler approach would be to add a weight to stars based on their (main) brightness. This weight should affect both observed and synthetic stars, so that when multiplied by the likelihood brighter stars will contribute more than low mass stars.
For example, the sequence could be divided into three sections S1, S2, S3 according to the maximum and minimum brightness:
step = (min_mag - max_mag) / 3
S1: [max_mag, max_mag + step]
S2: (max_mag + step, max_mag + 2 * step]
S3: (max_mag + 2 * step, max_mag + 3 * step] = (max_mag + 2 * step, min_mag]
The weight of each section could be chosen from three options, plus a "don't affect the likelihood" option:
   | no | low  | mid  | high
S1 | 1. | 1.   | 1.   | 1.
S2 | 1. | 0.9  | 0.75 | 0.5
S3 | 1. | 0.75 | 0.5  | 0.
The "no" option makes this function have no effect, the "low" option gives it a small effect on the likelihood, and the "high" option gives it a large effect.
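The section weighting above can be sketched roughly as follows. This is only an illustration: the names `SECTION_WEIGHTS` and `star_weight` are hypothetical, and it assumes that `max_mag` is the magnitude of the brightest star (numerically smallest) and `min_mag` that of the faintest one, so that `step` is positive and S1 holds the brightest stars.

```python
# Weight table from the proposal: each tuple holds the weights for
# sections (S1, S2, S3) under the "no", "low", "mid", "high" options.
SECTION_WEIGHTS = {
    "no": (1.0, 1.0, 1.0),
    "low": (1.0, 0.9, 0.75),
    "mid": (1.0, 0.75, 0.5),
    "high": (1.0, 0.5, 0.0),
}


def star_weight(mag, max_mag, min_mag, option="low"):
    """Return the weight of a star given its main magnitude.

    Assumes max_mag is the brightest (numerically smallest) magnitude
    and min_mag the faintest (numerically largest) one.
    """
    step = (min_mag - max_mag) / 3.0
    weights = SECTION_WEIGHTS[option]
    if mag <= max_mag + step:
        return weights[0]  # S1: brightest third
    if mag <= max_mag + 2 * step:
        return weights[1]  # S2: middle third
    return weights[2]  # S3: faintest third
```

The same function would be applied to observed and synthetic stars alike, since the weight is meant to affect both.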
Even simpler, each n-dimensional bin could be weighted by the inverse mean magnitude (using the main magnitude) in the Dolphin likelihood equation:
sum [(1/V_i) * (m_i - n_i + n_i * ln(n_i/m_i))]
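A minimal sketch of that weighted sum, assuming the bins have already been built and that each bin carries its observed count `n_i`, synthetic count `m_i`, and mean main magnitude `V_i`. The function name `weighted_dolphin_sum` is hypothetical, and the sketch assumes `m_i > 0` and `V_i > 0` in every bin.

```python
import math


def weighted_dolphin_sum(n_obs, m_syn, mean_mag):
    """Evaluate sum_i (1/V_i) * (m_i - n_i + n_i * ln(n_i / m_i)).

    n_obs[i]: observed star count in bin i
    m_syn[i]: synthetic star count in bin i (assumed > 0)
    mean_mag[i]: mean main magnitude V_i of bin i (assumed > 0)
    """
    total = 0.0
    for n_i, m_i, v_i in zip(n_obs, m_syn, mean_mag):
        # n * ln(n / m) tends to 0 as n -> 0, so a bin with no observed
        # stars contributes only its m_i term.
        log_term = n_i * math.log(n_i / m_i) if n_i > 0 else 0.0
        total += (m_i - n_i + log_term) / v_i
    return total
```

With this weighting, the same mismatch between `n_i` and `m_i` costs more in a bin with mean magnitude 12 than in one with mean magnitude 18, which is the intended bias towards bright stars.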
This could also resolve bad fits like this one:
where the Dolphin likelihood chooses a very bad fit with a young synthetic cluster (SC) because there are no low mass stars left in the OC after the DA. Although this SC is a very bad fit, it gives a lower likelihood (L_min~585) than a better by-eye fit (L_min~878) like the one below:
This is because the likelihood is obtained as:
L = M - \sum[n_i * log(m_i)]
which means that a bin with no observed stars (n_i=0) but with a lot of synthetic stars (m_i=a lot) has the effect of increasing L through M. Thus, the GA will try to avoid putting anything in the low mass portion of the CMD since, with n_i being zero, it would only increase the (inverted) likelihood.
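A toy calculation makes the effect visible. The sketch below assumes that M is the total number of synthetic stars (the sum of all m_i), and the bin counts are invented purely for illustration; `inverted_likelihood` is a hypothetical helper name.

```python
import math


def inverted_likelihood(n_obs, m_syn):
    """L = M - sum_i n_i * log(m_i), taking M = sum_i m_i (assumption)."""
    big_m = sum(m_syn)
    # Bins with n_i = 0 add nothing to the sum, whatever m_i is.
    log_sum = sum(n * math.log(m) for n, m in zip(n_obs, m_syn) if n > 0)
    return big_m - log_sum


# Two bright bins with observed stars, one faint bin emptied by the DA.
n_obs = [5, 3, 0]
young_sc = [5.0, 3.0, 0.0]  # no synthetic stars in the empty bin
older_sc = [5.0, 3.0, 40.0]  # many synthetic low mass stars there

l_young = inverted_likelihood(n_obs, young_sc)
l_older = inverted_likelihood(n_obs, older_sc)
```

Both synthetic clusters match the two bright bins equally well, yet `l_older` exceeds `l_young` by exactly the 40 synthetic stars placed in the empty bin, so the minimization prefers the cluster with no low mass stars.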
A simpler and more reasonable approach seems to be this one:
Wrongly closed here: https://github.com/asteca/ASteCA/commit/c8dc03c85ee1b2dcb19b49814ea914181f71caba
Currently the likelihood equations based on binned statistics will sometimes overlook the presence of bright stars with high MPs.
This is because the number of low mass stars is always larger, and also because membership probabilities are not taken into account during the likelihood minimization process (the Tolstoy likelihood does take MPs into account). This can bias the fit towards adjusting the synthetic cluster to the bulk of low mass stars, disregarding the potentially (almost always) more important bright stars. Example (L72):
I could add a prior-like equation to multiply the likelihood. It should have the form:
where p is the prior, and dist_coef is a coefficient obtained by calculating the average distance of the brightest stars (e.g.: the top half brightest) to the synthetic cluster's stars (or some other measure?). This way a large dist_coef means p will be small and thus the p*likelihood term will be diminished. Not sure how statistically correct this approach is though.
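One possible way to compute dist_coef is sketched below, using the "top half brightest" example and taking the distance from each bright observed star to its nearest synthetic star in the CMD. The nearest-neighbor choice, the `(magnitude, color)` tuple layout, and the `bright_frac` parameter are all assumptions made for illustration. The sketch stops at dist_coef and does not assume any particular form for p.

```python
import math


def dist_coef(observed, synthetic, bright_frac=0.5):
    """Mean distance from the brightest observed stars to their nearest
    synthetic star. Stars are (magnitude, color) tuples; both lists are
    assumed to be non-empty."""
    # Smaller magnitude means brighter, so sort in ascending order.
    by_brightness = sorted(observed, key=lambda star: star[0])
    n_bright = max(1, int(len(by_brightness) * bright_frac))
    brightest = by_brightness[:n_bright]
    nearest = [
        min(math.hypot(o[0] - s[0], o[1] - s[1]) for s in synthetic)
        for o in brightest
    ]
    return sum(nearest) / len(nearest)
```

A synthetic cluster that places stars on top of the bright observed ones gives a dist_coef near zero, while one that only populates the faint end gives a large value.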
Add:
Perhaps instead of a p coefficient like the one above, use the distance to the isochrone. This avoids the issue of better fits for synthetic clusters where the mass is increased artificially to generate more bright stars. See: http://stackoverflow.com/a/19869154/1391441
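As a rough sketch of what "distance to the isochrone" could mean in practice, the code below treats the isochrone as a polyline of `(color, magnitude)` points and returns the distance from a star to the closest segment. This is one plain geometric approach, not necessarily the one in the linked answer; the function names are hypothetical and the isochrone is assumed to have at least two points.

```python
import math


def point_segment_dist(p, a, b):
    """Distance from point p to the segment a-b (all 2D tuples)."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(p[0] - ax, p[1] - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (ax + t * dx), p[1] - (ay + t * dy))


def dist_to_isochrone(star, isochrone):
    """Minimum distance from a star to a polyline of isochrone points."""
    return min(
        point_segment_dist(star, a, b)
        for a, b in zip(isochrone, isochrone[1:])
    )
```

Because the result depends only on the shape of the isochrone and not on how many synthetic stars are sampled along it, inflating the cluster mass would not change this distance.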