andycasey / smhr

Spectroscopy Made Hard(er)

Equivalent width uncertainties #122

Open alexji opened 8 years ago

alexji commented 8 years ago

When the equivalent width uncertainties are large, the uncertainty changes significantly every time you refit the same line. It's especially bad for degree 2 or higher continuum polynomials. I think this is due to insufficient samples from the covariance posterior?

andycasey commented 8 years ago

This is probably the correct behaviour, but it might be worth providing an example for me to check.

We're just optimizing the model (profile) parameters, so we don't have a formal posterior on any parameter. The covariance matrix is estimated at fitting time by scaling the inverse Jacobian product, (J^T J)^-1, by the residual variance of the fit, so it's a reasonable estimate of the formal fitting errors but probably an underestimate. If the EW is very large and the EW uncertainty is ~10-20%, then the uncertainty value will change a lot with repeated fits. And with a high-degree polynomial there's more flexibility to account for crazy things in the data (e.g., nearby strong lines), so I could imagine it gives large uncertainties until things are well-fit.
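As a sketch of how these formal fitting errors arise (a hypothetical Gaussian absorption profile, not smhr's actual code), `scipy.optimize.curve_fit` builds its covariance the same way: (J^T J)^-1 scaled by the residual variance of the fit.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, amp, mu, sigma):
    """Illustrative absorption-profile model (not smhr's)."""
    return 1.0 - amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 200)
true = (0.4, 0.0, 0.3)
y = gaussian(x, *true) + rng.normal(0, 0.02, x.size)

# curve_fit estimates the covariance by scaling (J^T J)^-1 with the
# residual variance of the fit -- the same formal-error construction
# described above, and similarly an underestimate for misspecified models.
p_opt, p_cov = curve_fit(gaussian, x, y, p0=(0.3, 0.1, 0.2))
perr = np.sqrt(np.diag(p_cov))  # 1-sigma formal errors
```

Any covariance-based EW error inherits whatever is wrong with this matrix, which is why a poorly constrained high-order continuum inflates it.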

I don't know if any of that helps explain the particular example you saw. Is it worth posting an example of a line?

alexji commented 8 years ago

Yes, I think it happens when the EW uncertainty becomes O(10%) though I did not test that rigorously.

Here's an example: all I did was push "Fit One" twice (i.e. call spectral_model.fit() twice). The uncertainty changes between 6 mA and 8.5 mA (there is a small visual effect in the plot as well). This is large enough that it could affect weighted stellar parameter fits, depending on which value you draw.

[Screenshots, 2016-06-14: the same line fit twice in the GUI, showing the EW uncertainty change between fits.]
andycasey commented 8 years ago

Interesting!

Can you save the session (or just the normalized spectrum) and post a link here so I can investigate?

alexji commented 8 years ago

Here's a link: https://www.dropbox.com/s/q7ycaiouii05n9b/test.smh?dl=0

Fitting some more, it turns out the numbers I quoted are the two extremes. Most of the time it comes out at 7-point-something.

alexji commented 7 years ago

@andycasey this is still happening a lot. The reason appears to be that the EW uncertainty is estimated by drawing 100 samples from the covariance matrix and integrating each one. It is also actually calculating a 2-sigma (95th percentile) error by default, rather than a 1-sigma (68th percentile) one. When the fit uncertainty is large, this is quite unstable.

I think it's probably more correct to use the 68th percentile when doing weighted minimization of the stellar parameter slopes, right? That should also improve the stability. We could also add a session setting for the number of covariance draws. This would apply to both profile and synthesis models.
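A minimal standalone sketch of the sampling scheme described above, with the 68th-percentile half-width (hypothetical parameter values; assuming a Gaussian profile, whose EW is analytically amp * sigma * sqrt(2*pi)):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical best-fit (amp, sigma) and covariance from the profile fit.
p_opt = np.array([0.4, 0.3])
p_cov = np.array([[4e-4, 1e-5],
                  [1e-5, 9e-4]])

# Draw parameter samples from the covariance; one EW per draw.
draws = rng.multivariate_normal(p_opt, p_cov, size=100)
ews = draws[:, 0] * draws[:, 1] * np.sqrt(2 * np.pi)  # analytic Gaussian EW

ew = np.median(ews)
# 1-sigma (68%) half-width from the 16th/84th percentiles, rather than
# the 2-sigma (95%) interval used by default.
ew_err = 0.5 * (np.percentile(ews, 84) - np.percentile(ews, 16))
```

With only 100 draws, `ew_err` itself fluctuates noticeably from run to run, which is exactly the instability reported here.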

[This is a separate issue from having too-small fitting error when fitting strong lines in blended regions; that still exists and is a bigger problem for high S/N data.]

alexji commented 7 years ago

I have changed the EW uncertainty to the 68th percentile, and you can now modify both this and the number of draws in the smh defaults file. However, for low S/N (~20-30), I have found that you need REALLY large numbers of draws (>1e6) in order to get the EW uncertainty stable to within ~0.1 mA.
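The slow convergence makes sense: the run-to-run scatter of a Monte Carlo percentile estimate only shrinks like 1/sqrt(N). A quick standalone illustration (plain numpy, not smhr code), comparing the scatter of the 84th-percentile estimate at two sample sizes:

```python
import numpy as np

rng = np.random.default_rng(1)

def percentile_scatter(n_draws, n_repeats=200):
    """Std of repeated 84th-percentile estimates from n_draws samples."""
    est = [np.percentile(rng.normal(0, 1, n_draws), 84)
           for _ in range(n_repeats)]
    return np.std(est)

s_small = percentile_scatter(100)     # 100 draws per estimate
s_large = percentile_scatter(10_000)  # 100x more draws

# Expect roughly a factor-of-10 reduction in scatter for 100x more draws.
```

So tightening the uncertainty estimate by another factor of 10 costs another factor of 100 in draws, which is why >1e6 draws are needed at low S/N.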

I'm not totally sure what to do here; maybe this is good enough for the interactive part, and final errors can be reported with a full covariance sampling? Or perhaps it's better to use a deterministic formula for the interactive part, even if it slightly underestimates the error?

andycasey commented 7 years ago

OK with 68th percentile.

Do you think it’s worthwhile having a session option to use a deterministic/empirical formula to approximate the error? We’d have to include metadata for the spectral model to say how the error was estimated (since the user could choose to switch to a deterministic approach halfway through a session).

alexji commented 7 years ago

That sounds like a good idea.

alexji commented 7 years ago

After some more searching, I have found that Chris Churchill's thesis gives a good description of how to calculate this analytically: http://astronomy.nmsu.edu/cwc/Research/thesis/Pages/ch2.html In particular, the appendix of Sembach and Savage (1992) gives the best treatment of EQW errors I have yet seen: http://adsabs.harvard.edu/abs/1992ApJS...83..147S

Since we calculate the EQW by integrating an analytic profile, it might be best to redo the calculation and implement that. At least for fitting profiles, I think this will be much better than randomly sampling the covariance matrix.
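One deterministic option is plain first-order (delta-method) error propagation through the analytic EW expression; here is a sketch for a Gaussian profile with hypothetical numbers (this is not the Sembach & Savage treatment itself, just the generic propagation it could replace the sampling with):

```python
import numpy as np

# Hypothetical best-fit (amp, sigma) and their covariance.
amp, sigma = 0.4, 0.3
cov = np.array([[4e-4, 1e-5],
                [1e-5, 9e-4]])

# Analytic Gaussian EW and its gradient w.r.t. (amp, sigma):
# EW = amp * sigma * sqrt(2*pi)
ew = amp * sigma * np.sqrt(2 * np.pi)
grad = np.array([sigma, amp]) * np.sqrt(2 * np.pi)

# First-order propagation: var(EW) = g^T C g. Deterministic, so the
# reported uncertainty is identical on every refit.
ew_err = np.sqrt(grad @ cov @ grad)
```

For small relative errors this agrees with the mean of the Monte Carlo distribution but with zero sampling noise, which is the stability property wanted here.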

We should also consider moving from a simple power-series polynomial basis to a Legendre (or other finite-domain) polynomial basis for the continuum, both for profiles and synthesis. The main benefit is better coefficient stability as you increase the order, and probably better-behaved covariances as well. The newer numpy polynomial classes handle this automatically.
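For reference, a sketch of what that looks like with numpy's polynomial classes (hypothetical wavelength window and noise level): `Legendre.fit` rescales x onto the class's [-1, 1] domain before fitting, which is what keeps the high-order coefficients well-conditioned.

```python
import numpy as np
from numpy.polynomial import Legendre

rng = np.random.default_rng(3)
wave = np.linspace(5160.0, 5190.0, 300)  # hypothetical wavelength window (A)
flux = 1.0 + 0.01 * (wave - 5175.0) + rng.normal(0, 0.02, wave.size)

# Legendre.fit maps [wave.min(), wave.max()] onto [-1, 1] internally, so
# the coefficients stay O(1); a raw power series in x ~ 5175 would need
# wildly scaled coefficients and a near-singular normal matrix.
cont = Legendre.fit(wave, flux, deg=4)
model = cont(wave)
```

A raw `np.polyfit` at these x-values would warn about poor conditioning at higher degrees; the finite-domain basis sidesteps that without changing the fitted continuum shape.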