annayqho / TheCannon

a data-driven method for determining stellar parameters and abundances from stellar spectra
MIT License

Coefficient estimate uncertainties: #41

Closed tingyuansen closed 8 years ago

tingyuansen commented 9 years ago

The code estimates the coefficient uncertainties with an analytic formula, and I am not sure that is correct. The formula holds if the measurement uncertainties are drawn from a single normal distribution that is the same for every spectrum. In our case, that assumption does not hold.

If this concern is valid, the flux uncertainty inputs are not important, since the best estimate of the coefficients does not depend on the flux uncertainties. This would greatly simplify the code.

davidwhogg commented 9 years ago

I am not sure what is being asked here. Is the question about propagating the uncertainty on the scatter? In the limit that s^2 is fixed, the coefficient uncertainties are analytic and exact. It is just least-squares fitting with Gaussian errors. Unless there is a bug in the code!
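
For reference, the standard weighted least-squares result being referred to here can be written as follows. The symbols are notation assumed for this sketch, not taken from the code: $y$ is the vector of training-set fluxes at one pixel, $A$ is the design matrix built from the labels, $\sigma_n$ is the flux uncertainty of star $n$ at that pixel, and $\theta$ is the coefficient vector.

$$
C = \mathrm{diag}\left(\sigma_n^2 + s^2\right), \qquad
\hat{\theta} = \left(A^T C^{-1} A\right)^{-1} A^T C^{-1} y, \qquad
\mathrm{Cov}(\hat{\theta}) = \left(A^T C^{-1} A\right)^{-1}
$$

The expression holds for Gaussian noise with known variances; it does not require the $\sigma_n$ to be equal from star to star.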

tingyuansen commented 9 years ago

Sorry for dropping the ball for the last 2 days. I am not sure we are discussing the same question. It is not about the s^2 term, but about the sigma^2 + s^2 term. In the training procedure, one derives the best estimate of the coefficients through the Moore-Penrose pseudo-inverse (the maximum-likelihood estimate). However, the uncertainties of the coefficients are analytic only when sigma^2 + s^2 is drawn from the same distribution, independent of the spectrum. Since this assumption does not hold, there shouldn't be an analytic formula for the uncertainties of the coefficients.

davidwhogg commented 9 years ago

Okay, once again I don't understand. Let's move to a pdf document. Do you agree that, given a fixed value of s^2 and all the sigmas, the uncertainty is just the covariance matrix of the linear least-squares fit?
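
One way to check the question numerically is a small Monte Carlo experiment. The sketch below is not TheCannon's code: the function name `weighted_lsq`, the toy design matrix, and all numerical values are assumptions made for illustration. It gives every star a different sigma, fixes s^2, and compares the analytic covariance of the weighted fit with the scatter of the fitted coefficients over many noise realizations.

```python
import numpy as np

def weighted_lsq(A, y, var):
    """Weighted linear least squares with a known variance per data point.

    Returns the coefficient estimate and its analytic covariance,
    (A^T C^-1 A)^-1 with C = diag(var).
    """
    ivar = 1.0 / var
    ATCinvA = A.T @ (A * ivar[:, None])
    ATCinvy = A.T @ (y * ivar)
    cov = np.linalg.inv(ATCinvA)
    theta = cov @ ATCinvy
    return theta, cov

rng = np.random.default_rng(0)

# Toy training set at a single pixel (all values are made up).
n_stars, n_coeff = 200, 3
A = np.column_stack([np.ones(n_stars), rng.normal(size=(n_stars, n_coeff - 1))])
theta_true = np.array([1.0, 0.05, -0.02])

sigma = rng.uniform(0.005, 0.05, size=n_stars)  # different for every star
s2 = 0.01 ** 2                                  # intrinsic scatter, held fixed
var = sigma ** 2 + s2

# Refit many independent noise realizations of the same training set.
n_trials = 4000
draws = np.empty((n_trials, n_coeff))
for i in range(n_trials):
    y = A @ theta_true + rng.normal(scale=np.sqrt(var))
    draws[i], cov = weighted_lsq(A, y, var)

empirical_cov = np.cov(draws, rowvar=False)
print("analytic  sigma(theta):", np.sqrt(np.diag(cov)))
print("empirical sigma(theta):", np.sqrt(np.diag(empirical_cov)))
```

In this toy setup the two sets of uncertainties should agree to within the Monte Carlo noise, even though the per-star variances differ. Note that the analytic covariance depends only on `A` and `var`, not on the particular noise draw, so it is the same in every trial.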