terminological / uk-covid-datatools

Data tools for loading and processing covid data
MIT License

Conversion between sdlog and sd for lognormals #4

Open kdpenner opened 3 years ago

kdpenner commented 3 years ago

For a lognormal distribution my understanding is:

mean = exp(meanlog + 0.5 * sdlog^2)

var = exp(2 * meanlog + sdlog^2) * (exp(sdlog^2) - 1)

Some of the conversions are off, unless I misunderstand R's definition of sd. Take "infection to test" from supplemental table 4:

meanlog = 1.68, sdlog = 0.92, mean = 8.19, sd = 7.59

The mean is spot on, but the sd should be ~9.45.

Unless you estimate the parameters from the bootstraps?
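For anyone wanting to check the arithmetic, here is a minimal Python sketch of the conversion using the two formulas quoted above. The function name `lognormal_mean_sd` is made up for illustration and is not part of uk-covid-datatools.

```python
import math

def lognormal_mean_sd(meanlog, sdlog):
    """Return (mean, sd) on the natural scale for a lognormal distribution."""
    mean = math.exp(meanlog + 0.5 * sdlog**2)
    var = math.exp(2 * meanlog + sdlog**2) * (math.exp(sdlog**2) - 1)
    return mean, math.sqrt(var)

# "Infection to test" values from supplemental table 4
mean, sd = lognormal_mean_sd(1.68, 0.92)
print(f"mean = {mean:.2f}, sd = {sd:.2f}")  # mean = 8.19, sd = 9.45
```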

robchallen commented 3 years ago

Thanks for this. Good spot.

I had an exp(sdlog^2 - 1) term rather than an (exp(sdlog^2) - 1) term in the conversion.

I think that will only affect the SD labels in the paper, as all the calculations are based on meanlog / sdlog, so it should be an isolated error, but I'll obviously check and update the medRxiv paper.
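As a rough check that the misplaced parenthesis accounts for the reported figure, the sketch below evaluates both forms of the variance term for meanlog = 1.68 and sdlog = 0.92. The variable names are illustrative only and do not come from the package.

```python
import math

meanlog, sdlog = 1.68, 0.92

# Factor shared by both versions of the variance formula
scale = math.exp(2 * meanlog + sdlog**2)

# Misplaced parenthesis: exp(sdlog^2 - 1)
sd_buggy = math.sqrt(scale * math.exp(sdlog**2 - 1))

# Intended term: exp(sdlog^2) - 1
sd_fixed = math.sqrt(scale * (math.exp(sdlog**2) - 1))

print(f"buggy sd = {sd_buggy:.2f}, fixed sd = {sd_fixed:.2f}")
# buggy sd = 7.59, fixed sd = 9.45
```

The first value matches the sd of 7.59 in the table and the second matches the ~9.45 expected from the standard formula.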

kdpenner commented 3 years ago

Might be useful to mention that the rate, shape, mean, and sd parameters in fig 4 are estimated using means of parameters from the bootstraps, so the mean and sd don't necessarily follow from the rate and shape.

robchallen commented 3 years ago

Thanks. I will think about how to make that clear when I update with reviews. It's a little tricky, as all the parameters are interdependent, so a distribution whose parameters are the mean of shape and the mean of rate is not necessarily representative of the whole set of bootstraps. The mean of the mean is the only one that you can pin down with a degree of statistical rigour. I think the other important thing is that the distributions of the parameters are not normal. Mean is as you would expect, but SD, shape and rate are not.
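A small sketch of why averaged parameters need not agree with each other. It assumes a gamma distribution parameterised by shape and rate (mean = shape / rate, sd = sqrt(shape) / rate), which may not be the exact form used in fig 4, and the two bootstrap replicates are made-up numbers chosen only to make the effect visible.

```python
import math
from statistics import fmean

# Hypothetical bootstrap estimates (illustrative values, not from the paper)
shapes = [2.0, 4.0]
rates = [1.0, 4.0]

# Mean and sd of each bootstrap replicate, then averaged across replicates
mean_of_means = fmean(s / r for s, r in zip(shapes, rates))
mean_of_sds = fmean(math.sqrt(s) / r for s, r in zip(shapes, rates))

# Mean and sd implied by a single gamma built from the averaged shape and rate
avg_shape, avg_rate = fmean(shapes), fmean(rates)
implied_mean = avg_shape / avg_rate
implied_sd = math.sqrt(avg_shape) / avg_rate

print(f"mean of means = {mean_of_means:.2f}, implied mean = {implied_mean:.2f}")
print(f"mean of sds   = {mean_of_sds:.2f}, implied sd   = {implied_sd:.2f}")
```

With these toy values the mean of the bootstrap means is 1.5, while the gamma built from the averaged shape and rate has a mean of 1.2, so the reported mean and sd cannot be recovered from the reported shape and rate.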