lmfit / uncertainties

Transparent calculations with uncertainties on the quantities involved (aka "error propagation"); calculation of derivatives.
http://uncertainties.readthedocs.io/

isinstance checks with numbers ABCs #74

Open JesterEE opened 6 years ago

JesterEE commented 6 years ago

I am experimenting with uncertainties and found a situation that I'd like to highlight. I'm not sure if the behavior is by design or is actually an issue that should be tracked. I decided to submit the issue, open a dialog, and let the authors decide.

Here is some example code to highlight the issue:

import numbers
import uncertainties

class UTest:
    default_setting_value = 1

    def __init__(self, value):
        try:
            if isinstance(value, numbers.Real):
                self.setting = value
            else:
                raise ValueError
        except ValueError:
            print("'{}' setting was invalid.  Setting attribute to default: {}".format(value, self.default_setting_value))
            self.setting = self.default_setting_value

if __name__ == '__main__':
    test_value = uncertainties.ufloat(1, 1)
    t = UTest(test_value)

value can be any numeric value that matches the numbers.Real abstract base class. This can be an int, float, decimal.Decimal, numpy.int32, etc. But it won't work with an uncertainties.core.Variable. In the example above, the isinstance check fails even though 1.0 +/- 1.0 is actually a real number. If, on the other hand:

test_value = uncertainties.ufloat(1, 1).nominal_value

OR

test_value = uncertainties.ufloat(1, 1).std_dev

The float and uncertainties.core.CallableStdDev types, respectively, pass the isinstance check. Some extra code will be required downstream to use an uncertainty interchangeably with any other numeric type (which is another issue entirely), but I think this check should pass.

UPDATE: decimal.Decimal does not satisfy numbers.Real. This was a mistake ... see my post below for another example.

lebigot commented 6 years ago

Thank you for sharing this.

I find the behavior normal, for the following reason: all the cases that you cite can unambiguously be cast as a real number (integer, decimal, numpy.int32). This is not the case for 1±0.1: if it is transformed into a real number (into 1), then we strip the value of its uncertainty and the number has changed. Thus, a number with uncertainty is not a real number. Therefore it is important that isinstance(1±0.1, numbers.Real) be false. This is, as far as I know, an important semantic implication of subclasses (e.g., a Square can be a subclass of a Shape class because a Square is a specific Shape).

In other words, if something is a real, I would never expect it to have an uncertainty: real numbers are infinitely precise, so it is important that values with uncertainties are not considered reals. Technically, in uncertainties, 1±0.1 is a short summary of a probability distribution (with standard deviation 0.1).

Concretely, in your case, you want to do isinstance(value, (numbers.Real, uncertainties.UFloat)).

The above is my current understanding, but I would be happy to have uncertainties.UFloat inherit from some abstract base class of numbers if this actually makes sense. I'll ask a question on StackOverflow to see what the community thinks.

JesterEE commented 6 years ago

I think I understand what you're saying here and I generally agree. A number with an uncertainty has infinite possible values and thus is never truly any specific number. I also agree with your semantic implication of subclassing. I think where we possibly disagree is with the incorporation of numeric classification with these computer science concepts.

This next part got a little long ... I apologize, but bear with me.

When testing the nominal and standard deviation values from an uncertainties instance of UFloat, they are classified as a float +/- float. In Python, floating point numbers are represented as "base 2 (binary) fractions". This means that, in hardware, a given float is represented as the sum of a particular set of rational numbers. When being stored in memory, the number is evaluated from a sum of fractional forms into a floating point representation. The resulting number in memory is considered Real instead of possibly Rational (e.g. a single fraction or exact decimal representation), regardless of the number of fractions being summed, as the interpreter loses track of the exact (or inexact) fractional form.
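As a minimal sketch of the "binary fraction" point, using only the standard library: a float's stored value can be recovered as an exact integer ratio whose denominator is a power of two. The variable names here are illustrative only.

```python
import fractions

# 0.25 is exactly representable in binary, so the stored ratio is the
# fraction one would expect.
print((0.25).as_integer_ratio())  # (1, 4)

# 0.1 is not exactly representable; the stored value is the nearest
# binary fraction, whose denominator is a power of two.
num, den = (0.1).as_integer_ratio()
is_power_of_two = den & (den - 1) == 0
print(is_power_of_two)  # True

# Converting the float to a Fraction exposes the stored value, which
# differs from the exact rational 1/10.
exact_tenth = fractions.Fraction(1, 10)
stored_tenth = fractions.Fraction(0.1)
print(stored_tenth == exact_tenth)  # False
```

The float type itself carries no record of which rational the author meant, which is consistent with float being registered as numbers.Real rather than numbers.Rational.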

For example, the code below holds a simple set of tests for some numeric representations of 1/4. Note that the float representations of the rational 1/4 and of 0.25 pass the ['Number', 'Complex', 'Real', 'float'] tests but not the 'Rational' test, even when the exact fraction (1/4) and exact decimal (0.25) are given as test cases (this is one of the main use cases of the fractions module). Note that this was run in Python 3, where dividing two ints results in a float. In Python 2, it would need to be 1/4.0.

As we just saw, to Python, floats are instances of numbers.Real. The axioms of real numbers state that the field is closed under addition and subtraction (the operations in nominal value +/- standard deviation). So, any value with an uncertainty built from floats must also be a float and thus also a real number. This was the main reason I thought that uncertainties should be considered numbers.Real. If the uncertainty was based on a complex number (which isn't supported by uncertainties), this may or may not be the case, depending on the imaginary components of the nominal value and standard deviation.

import decimal
import fractions
import math
import numbers

num_test_dict = {
                 'Number': numbers.Number,
                 'Complex': numbers.Complex,
                 'Real': numbers.Real,
                 'Rational': numbers.Rational,
                 'Integral': numbers.Integral,
                 'int': int,
                 'float': float
                }

n_tuple = (
           fractions.Fraction(1, 4),
           decimal.Decimal(1/4),
           decimal.Decimal(0.25),
           1/4,
           0.25
          )

for n in n_tuple:
    number_type_matches = []
    for k, t in num_test_dict.items():
        if isinstance(n, t):
            number_type_matches.append(k)
    print("{} matches: {}".format(n, number_type_matches))

>>> 1/4 matches: ['Number', 'Complex', 'Real', 'Rational']
>>> 0.25 matches: ['Number']
>>> 0.25 matches: ['Number']
>>> 0.25 matches: ['Number', 'Complex', 'Real', 'float']
>>> 0.25 matches: ['Number', 'Complex', 'Real', 'float']

With all that said, I am not a mathematician or a computer scientist. This is just what made sense to me. Also note from the example above that decimal.Decimal does not match numbers.Real either (though for the same reasons I think it should) ... but it does satisfy numbers.Number. While I stand by the reasoning for uncertainties.UFloat matching numbers.Real, can we agree that a number with an uncertainty is in fact a number? If so, numbers.Number should, at the very least, be an ABC of UFloat.
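For illustration, here is a minimal sketch of how a class can be made to satisfy numbers.Number without satisfying numbers.Real, using the ABC register() method. The Uncertain class is a hypothetical stand-in, not uncertainties.UFloat, and this is not a claim about how the uncertainties package is or should be implemented.

```python
import numbers


class Uncertain:
    """Hypothetical stand-in for a value with an uncertainty."""

    def __init__(self, nominal_value, std_dev):
        self.nominal_value = nominal_value
        self.std_dev = std_dev


# Register Uncertain as a virtual subclass of numbers.Number. This only
# affects isinstance/issubclass checks; it adds no methods to the class.
numbers.Number.register(Uncertain)

x = Uncertain(1.0, 0.1)
print(isinstance(x, numbers.Number))  # True
print(isinstance(x, numbers.Real))    # False
```

Registration is purely declarative, so whether it is semantically appropriate for a value with an uncertainty remains the open question in this thread.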

lebigot commented 6 years ago

Interesting tests, Matt!

I am surprised too that Decimal(0.25) is not a Real, because this does not match the mathematical meaning of these terms. The examples are as if Real actually meant float (because not all decimals are floats)…

I agree that an uncertainty is a float (isinstance(value_with_uncertainty.std_dev, float) is true).

Now, I was arguing in my previous post that a number with an uncertainty is not a number but represents some information on a probability distribution. This is the mathematical point of view.

Now, what matters within Python is not the mathematical point of view but the intended meaning of being a subclass of Real, Number, etc. I haven't asked the question publicly yet but I'll do that. Your surprising Decimal example above will be useful. As a consequence, I am keeping this issue open, because you're raising an interesting question.

JesterEE commented 6 years ago

I get what you are saying now. I think getting more opinions is important, so please reach out to the community on this.

I would like to see uncertainties tie into at least part of the Python built-in numbers ABCs so that there is at least one built-in check that meets the definition of an uncertainty. This helps when making a package/module optional: something like isinstance(value, (numbers.Number, uncertainties.UFloat)) throws a NameError when uncertainties has not been imported, so a developer has to provide custom import handling and a compatibility layer when all that's really cared about is that the value acts like a number. It also makes checks much more manageable. When duck typing isn't possible or becomes overly complicated, and type checking is necessary to get the logic to do what you want, things quickly get messy. Having a simple ABC for the type of instance makes things easier on the eyes for writing and debugging.
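The custom import handling described above might look like the following sketch. NUMERIC_TYPES and is_numeric are hypothetical names introduced here for illustration; only numbers.Real and uncertainties.UFloat come from the discussion.

```python
import numbers

# Start with the built-in ABC, then extend the tuple only if the optional
# uncertainties package is installed.
NUMERIC_TYPES = (numbers.Real,)
try:
    import uncertainties
except ImportError:
    pass  # uncertainties is optional; plain real numbers are still accepted
else:
    NUMERIC_TYPES += (uncertainties.UFloat,)


def is_numeric(value):
    """Return True if value is a real number or, when available, a UFloat."""
    return isinstance(value, NUMERIC_TYPES)


print(is_numeric(1.5))    # True
print(is_numeric("1.5"))  # False
```

This is the kind of compatibility layer that would be unnecessary if UFloat matched one of the built-in numbers ABCs.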

lebigot commented 6 years ago

Agreed on all points. I will ask the community.