psteinb / covid19-curve-your-city

Extrapolation of COVID19 case numbers
BSD 3-Clause "New" or "Revised" License

introduce goodness of fit #3

Closed psteinb closed 4 years ago

psteinb commented 4 years ago

Expand the core script to produce a goodness-of-fit figure or plot. Chi2/ndf should do.

tkphd commented 4 years ago

I've introduced $\chi^{2}_{\nu} = \chi^{2}/(N - 2)$ as a goodness-of-fit indicator on my latest graph, but its value is 1.7, and I believe a "true" GOF should asymptotically approach 1. Would be interested in your thoughts.

from scipy.stats import chisquare
chisq, chip = chisquare(y, model(t, a, b))
ndof = N - 2
reduced_chisq = chisq / ndof

This produces a scalar value. From your mention above of a plot/figure, should chisq be computed for each point?
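One way to look at the scalar-versus-per-point question: the statistic is a sum of one non-negative term per data point, so the individual terms can be plotted while their sum gives the scalar. The sketch below computes the Pearson form by hand in plain Python. The exponential `model`, the data in `t` and `y`, and the parameters `a` and `b` are all made up for illustration, since the thread does not show the real ones.

```python
import math


def model(t, a, b):
    """Hypothetical exponential model a * exp(b * t) (an assumption, not the repo's code)."""
    return a * math.exp(b * t)


def chisq_terms(observed, expected):
    """Per-point Pearson terms (O_i - E_i)^2 / E_i."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Made-up illustrative values, not real case counts.
t = [0, 1, 2, 3, 4, 5]
y = [2, 3, 4, 7, 9, 14]
a, b = 2.0, 0.4  # assumed fit parameters

expected = [model(ti, a, b) for ti in t]

terms = chisq_terms(y, expected)  # one value per data point (could be plotted)
chisq = sum(terms)                # the scalar statistic
ndof = len(y) - 2                 # same convention as in the snippet above
reduced_chisq = chisq / ndof
```

Plotting `terms` against `t` would show which points contribute most to the total, while `reduced_chisq` remains the single number quoted on a graph.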

psteinb commented 4 years ago

Chi2 is the sum of the squared, normalized residuals over the entire set of (x, y) values, so it produces a scalar value. That is correct. The number of degrees of freedom in your case is off by one: $\mathrm{ndf} = N_{\mathrm{data}} - N_{\mathrm{params}} - 1$. Not sure if that changes the result too much. Note also that least-squares fitting is guaranteed to converge for datasets that are "large" enough that the normality assumption is met. If you ask statisticians what "large" means, many are likely to respond: something above 20 data points. There is no sharp boundary, because the convergence is an asymptotic, mathematical limit.
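To get a feel for how much the extra -1 changes the result, here is a minimal sketch comparing the two degree-of-freedom conventions. The helper `reduced_chisq`, its `extra_constraint` flag, and the numbers are hypothetical and chosen only for illustration.

```python
def reduced_chisq(chisq, n_data, n_params, extra_constraint=False):
    """Return chisq / ndf.

    ndf = n_data - n_params, minus one more if extra_constraint is True
    (the convention suggested in this thread).
    """
    ndf = n_data - n_params - (1 if extra_constraint else 0)
    if ndf <= 0:
        raise ValueError("need more data points than fitted parameters")
    return chisq / ndf


# Hypothetical numbers: 20 data points, 2 fitted parameters.
chisq = 30.6
n_data, n_params = 20, 2

plain = reduced_chisq(chisq, n_data, n_params)                          # ndf = 18
strict = reduced_chisq(chisq, n_data, n_params, extra_constraint=True)  # ndf = 17

# The two results differ by the factor (N - p) / (N - p - 1).
ratio = strict / plain
```

For around 20 data points the two conventions differ by a few percent, and the gap shrinks as the dataset grows, so the choice matters most for very short time series.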

tkphd commented 4 years ago

Updated in the code, thanks! I hope we have a vaccine before the dataset becomes "large."

psteinb commented 4 years ago

The landing page contains a discussion of the goodness of fit using chi2/ndf.