Dealing with two errors

lmfit / uncertainties

Transparent calculations with uncertainties on the quantities involved (aka "error propagation"); calculation of derivatives.

http://uncertainties.readthedocs.io/

Dealing with two errors #16

Closed mverzett closed 11 years ago

mverzett commented 11 years ago

Dear lebigot,

First of all, let me thank you for such an awesome package. I have a question: I am dealing with numbers that have TWO errors. One is statistical, the other is the so-called "systematic" error, and it is considered good practice to deal with them separately, as in 1.0 +/- 0.1 +/- 0.05. The two errors are considered to be totally uncorrelated.

Do you have any suggestion on how to treat this problem with your package? I looked through the documentation but could not find anything useful.

Thank you

lebigot commented 11 years ago

Thank you! I am glad that my package is being put to good use!

Since your two kinds of errors are completely uncorrelated, you can simply do:

>>> from uncertainties import ufloat
>>> stat_error = ufloat(0, 0.1)  # ufloat(nominal_value, std_dev)
>>> syst_error = ufloat(0, 0.05)
>>> value = 1 + stat_error + syst_error
>>> print(value)
1.00+/-0.11

or more concisely:

>>> value = ufloat(1, 0.1) + ufloat(0, 0.05)  # Statistical measurement + systematic error

The main idea is that the errors are independent random variables with average value 0: you just create two independent random variables with ufloat().
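For reference, the combined uncertainty in the example above is the usual quadrature sum of two independent contributions (the symbol names here are only labels for the two errors in the example):

```latex
\sigma_{\text{total}}
  = \sqrt{\sigma_{\text{stat}}^{2} + \sigma_{\text{syst}}^{2}}
  = \sqrt{0.1^{2} + 0.05^{2}}
  \approx 0.1118
```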

Multiple measurements can even share the same statistical or systematic error, in which case their uncertainties will automatically be correlated by the uncertainties package.

jbwhit commented 11 years ago

I'm going to interject here with some extra thoughts.

lebigot, you wrote that these errors are "independent random variables with average value 0" -- which I think is a key distinction from the situation that's being asked about here. Statistical error will be "independent random variables with average value of 0" -- while systematic errors are NOT, in general, correctly described in that way. Specifically, one of the nastier features is that they can have an average value != 0.

So, to address the initial question -- you are right to address them separately, as the correct way of handling any particular systematic error depends very much on the particulars of the situation. That said, I don't have any clever ideas on how to handle this situation within the uncertainties package.

lebigot commented 11 years ago

True. I was just interpreting the initial post as meaning that there was a value 1 plus two errors centered at 0.

The uncertainties package handles a systematic error that is not centered at zero by just giving it a non-zero nominal value:

>>> from uncertainties import ufloat
>>> stat_error = ufloat(0, 0.1)
>>> syst_error = ufloat(0.5, 0.05)  # Systematic error not centered at zero
>>> value = 1 + stat_error + syst_error

mverzett commented 11 years ago

Ok,

my question was more about how to propagate the two errors while keeping them separate, so that when adding, multiplying, etc. two numbers I get the correct propagation for each error separately.

Talking to some colleagues, I got the suggestion to do it like this:

number1 = central_value1 * ufloat(1, relative_stat_err, 'stat_err') * ufloat(1, relative_sys_err, 'sys_err')
…
…
result_central = result.nominal_value
# Sum in quadrature the error components whose tag contains 'stat'
result_stat = quad(
    *[error for variable, error in result.error_components().items() if 'stat' in variable.tag]
)
# Same for the components whose tag contains 'sys'
result_sys = quad(
    *[error for variable, error in result.error_components().items() if 'sys' in variable.tag]
)

where quad is a simple function that sums its arguments in quadrature.
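The `quad` helper is not shown in the thread; one possible sketch, assuming it takes the individual errors as positional arguments and returns the square root of the sum of their squares:

```python
import math

def quad(*errors):
    """Return the quadrature sum sqrt(e1**2 + e2**2 + ...) of the given errors."""
    return math.sqrt(sum(error**2 for error in errors))

print(quad(3, 4))  # 5.0
```

Called with no arguments, this version returns 0.0, which is what you want when no component matches a tag.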

lebigot commented 11 years ago

That's almost perfect. :) Note that it would be more robust and faster to calculate the non-systematic uncertainty as result_stat = math.sqrt(result.std_dev**2 - result_sys**2), so that you are guaranteed never to miss any uncertainty contribution. Otherwise, the code might be updated one day with tags that do not fall in the 'stat'/'sys' dichotomy (maybe one of the variables will have no tag), and the corresponding uncertainties might be forgotten.

This is a very good use of the tag optional argument to ufloat()!

PS: I added details in the documentation about the possibility of using the same tag for different variables. Thank you for your input! Here is the page: http://pythonhosted.org/uncertainties/user_guide.html#access-to-the-individual-sources-of-uncertainty

jbwhit commented 11 years ago

That is a very slick implementation! I suspect I'll be using this in the very near future myself.