lmfit / uncertainties

Transparent calculations with uncertainties on the quantities involved (aka "error propagation"); calculation of derivatives.
http://uncertainties.readthedocs.io/
Other
582 stars 75 forks source link

[Not an issue] Reflexion on wrapping uncertainties #148

Open mocquin opened 2 years ago

mocquin commented 2 years ago

Hello there,

I started working on a backend interface for my physical quantity package, similar to pint. The general idea of the interface is to make it easier to wrap any kind of numerical-like package (like uncertainties), by keeping most of the "numerical-backend" functionnalities with minimum additional code (and without explicit implementation in my quantity package, the idea being to be as general as possible). Here is an example to wrap uncertainties variables, and get relevant nominal_value:

Property backend interface

Numpy's integration is a great example of how physipy can wrap any kind of numerical values, but it is integrated in the source code of physipy so it's a bit cheating. No source code of physipy makes any explicit reference to uncertainties (I'll call this "non-integrated" package), so we only rely on the general wrapping interface of physipy.

from physipy import m
from physipy import Quantity, Dimension
import uncertainties as u

# create a pure uncertainties instance
x = u.ufloat(0.20, 0.01)  # x = 0.20+/-0.01
print(x)
print(x**2)
print(type(x))
0.200+/-0.010
0.040+/-0.004
<class 'uncertainties.core.Variable'>

If we create a quantity version by mulitplying by 1 meter, the returned value is of Quantity type:

# now let's create a quanity version of x
xq = x*m
print(xq)
print(xq**2)
print(xq+2*m)
print(xq.value)
0.200+/-0.010 m
0.040+/-0.004 m**2
2.200+/-0.010 m
0.200+/-0.010

That's already a pretty neat result that didn't need any additional code. Now uncertainties instance have a nominal_value and std_dev attributes.

# Creation must be done this way and not by "x*m" because "x*m" 
# will multiply the uncerainties variable by 1, and turn it into a
# AffineScalarFunc instance, which is not hashable and breaks my 
# register_property_backend that relies on dict lookup
xq = Quantity(x, Dimension("m")) 

In physipy, if an attribute doesn't exist in the quantity instance, the lookup falls back on the backend value, ie on the uncertainties variable, so by default we get the same result on xq (note that we don't get auto-completion either for the same reason):

print(xq.nominal_value)
print(xq.std_dev)
0.2
0.01

It would be great that xq.nominal_value actually prints 0.2 m, not loosing the unit and making it explicit that the nominal value is actually 0.2 meters. To do that, we can add a property backend to specify what we want xq.nominal_value to return : a property backend is a dictionnary with key the name of the attribute, and value the corresponding method to get the wanted result. For the nominal_value and standard deviation, we just want to add back the unit and make the variable a Quantity, so we multiply by the corresponding SI unit:

from physipy.quantity.quantity import register_property_backend
uncertainties_property_backend_interface = {
    # res is the backend result of the attribute lookup, and q the wrapping quantity
    "nominal_value":lambda q, res: q._SI_unitary_quantity*res,
    "std_dev":lambda q, res: q._SI_unitary_quantity*res,
}

register_property_backend(type(xq.value), 
                         uncertainties_property_backend_interface)

With this property back interface registered we get the desired result for print(xq.nominal_value):

print(xq.nominal_value)
0.2 m

Now this approach allows wrapping backend attributes with minimal code, but it probably can't cover all use cases. Also, the fact that AffineScalarFunc is not hashable would imply to change the way the lookup is made (maybe with a string version of the class like 'uncertainties.core.AffineScalarFunc' and not the actual class uncertainties.core.AffineScalarFunc).

Why the duck type approach doesn't work

Another approach to do this would be to create a new class like this

class UWrappedQuantity(Quantity):
    @property
    def nominal_value(self):
        return self.value.nominal_value * self._SI_SI_unitary_quantity
    @property
    def std_dev(self):
        return self.value.std_dev * self._SI_SI_unitary_quantity

xq2 = UWraUWrappedQuantitypped(x, Dimension("m"))
print(xq2)
print(xq2.nominal_value)
0.200+/-0.010 m
0.2 m

But with this definition, newly created instance will be of type Quantity, not UWrappedQuantity (because for now, all operations that involve a Quantity are returned as another Quantity), loosing again the unit on nominal_value:

print(xq2+2*m)
print((xq2+2*m).nominal_value)
2.200+/-0.010 m
2.2

For this to work, a value-backend interface is needed, which I am currently working on also. If this post is of interest, I'll update it with the value backend interface.

lebigot commented 2 years ago

This looks like a great project.

I was trying to see how I could help.

One point is the un-hashability of numbers in uncertainties. Without looking at my code, I can't think of any reason why they couldn't be hashable (specifically, I'm guessing that the dictionary that holds the derivatives/slopes of an AffineScalarFunction could be a frozen dictionary with no impact on performance or memory usage). I'm not sure if there is any widely accepted implementation of frozen dictionaries, though, so this may not be a practical idea.

Another point is your remark on operations involving a Quantity returning a Quantity. I'm not seeing any simple semantics for the result of operations between (possibly different) sub-classes of Quantity. The best I can think of would be to return an object of the most "specialized" subclass of Quantity (= deepest subclass), when this makes sense (i.e. when all the Quantities are in the same chain of subclasses)—and otherwise fall back to returning a Quantity.

Is there another point where I could help?

What do you think of the above?

mocquin commented 2 years ago

FYI, outside the scope of the backend interface, I updated a notebook about support of uncertainties here. I think if some work should be done to make the packages compatible, it should start here, the backend interface will come after to fix the left-over. Maybe we could start by defining a list of test cases (based on functionnalities exposed here) and check how all of those are handled when using uncertainties with units ?

Regarding the hash-ability, this is just an arbitrary choice on my side. The implementation could probably be done with string or other structure, but a dictionnary with keys being the class seemed simple and efficient.

About the hierarchy of how subclasses should be returned, I agree it might not be simple for numerous subclasses. I consider that physipy only handles numerical-valued object, and let those handle the type returned (like floats and numpy array, but it is a bit cheating because numpy is well integrated in the source code of physipy). Then the backend interface comes into play and simply maps the returned numerical-valued object (like a numpy array, or an AffineScalarFunc) to the appropriate Quantity subclass.

MichaelTiemannOSC commented 2 years ago

I am working on integrating uncertainties into Pint, with some success (and some challenge). Your work is inspirational!