pytest-dev / pytest

The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
https://pytest.org
MIT License
12.17k stars 2.7k forks source link

`pytest.approx` fails with `TypeError` when testing approximate value of a `decimal.Decimal` #8495

Open labrys opened 3 years ago

labrys commented 3 years ago

description

When testing the approximate value of a decimal.Decimal the following TypeError occurrs TypeError: unsupported operand type(s) for -: 'float' and 'decimal.Decimal'

I would expect pytest.approx to test for approximate equality with a floating point value without having to either convert the decimal to a float or testing explicitly against another decimal.

pip list

Package      Version
------------ -------
atomicwrites 1.4.0
attrs        20.3.0
colorama     0.4.4
iniconfig    1.1.1
packaging    20.9
pip          21.0.1
pluggy       0.13.1
py           1.10.0
pyparsing    2.4.7
pytest       6.2.2
setuptools   54.2.0
toml         0.10.2

pytest and os versions

pytest: 6.2.2 os: windows 10

minimal example

import decimal
import pytest
import math

3.14159 == pytest.approx(math.pi)  # works
decimal.Decimal('3.14159') == pytest.approx(math.pi)  # raises TypeError
RonnyPfannschmidt commented 3 years ago

nobody is working on it - this one requires the specification of the intended behavior of decimal vs float in approximation ranges

the problem triggered is that

>>> decimal.Decimal(2) - .4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'decimal.Decimal' and 'float'

happens as part of the calculations/relation in ==

so even if we add support for this there needs to be a warning that we coerced structurally different number types to match the approximation

but basically we need to ensure that expected and actual are of the same type, and warn when we have to coerce in the case of decimal

this can be set up with a simple unittest that just puts the assertion into either one of the existing approx tests, or a new one

asottile commented 3 years ago

my 2c: it seems odd to test the "approximate" value of an exact value (decimal) -- approx is meant to deal with floating point precision which isn't a problem with decimal

kalekundert commented 3 years ago

My understanding is that Decimal is still a floating-point representation, just in decimal space rather than binary space. So things like 0.1 + 0.2 == 0.3 won't have rounding issues, but numbers that can't be represented exactly as decimals still could. I haven't looked for any examples, though.

RonnyPfannschmidt commented 3 years ago

@asottile i primarily think we ought to support it in some way as we loosened up the strictness of approx to be a approachable general tools,

we should definitively warn when we approx compare floats vs Decimals, but we shouldn't just plainly error out

asottile commented 3 years ago

My understanding is that Decimal is still a floating-point representation

this understanding is incorrect. A Decimal represents exact arbitrary precision (like integer). floating point is an approximation, not an exact value

we should definitively warn

I think we should error, expecting an approximate value from an exact representation is a programming mistake (we should also error for integers, but that's another thing entirely)

kalekundert commented 3 years ago

Here's an example of floating point rounding error with the Decimal class:

>>> x = Decimal(1) / Decimal(3)
>>> x + x + x
Decimal('0.9999999999999999999999999999')

I don't think there should be errors or warnings for comparing float with Decimal. approx is just saying that one number should be near another, and there's nothing conceptually unclear about those two numbers having different underlying representations. To put it another way, python allows seamless comparisons between float and Decimal, so I think it makes sense to mimic that.

asottile commented 3 years ago

float's epsilon makes no sense in the context of decimal (as they have arbitrary precision)

fortunately you can still use approx with decimals, so I guess this just needs docs:

>>> decimal.Decimal('.998') == pytest.approx(decimal.Decimal('0.999'), rel=decimal.Decimal('.01'))
True
>>> decimal.Decimal('.998') == pytest.approx(decimal.Decimal('0.999'), rel=decimal.Decimal('.0001'))
False
labrys commented 3 years ago

@asottile I would argue that the purpose of approx is to provide a clean pythonic method of testing that one value is approximately equal to another value (to some value of precision, whether implied, absolute or relative) and that Python, being a duck-typed language should not expect type rigidity.

asottile commented 3 years ago

@labrys python has strong types, as evidenced by the TypeError coming from python's standard library here

labrys commented 3 years ago

TypeError coming from the standard library is not because a desire for strong typing, but because it does not have the context to know how you should handle this scenario.

Edit: Or to put it another way, the strong typing is there to enable correct implementation of dynamic typing.

asottile commented 3 years ago

sorry but that's nonsense, Decimal intentionally forbids operations involving floats because they are strong, separate types and combining them would be nonsense

asottile commented 3 years ago

https://wiki.python.org/moin/Why%20is%20Python%20a%20dynamic%20language%20and%20also%20a%20strongly%20typed%20language

labrys commented 3 years ago

No, it forbids operations involving floats because you would need to opt in to the behavior to avoid accidentally invalidating the precision of a decimal. To quote your link:

The advantage to a strongly typed language is that you can trust what's going on: if you do something wrong, your program will generate a type error telling you where you went wrong, and you don't have to memorize a lot of arcane type-conversion rules or try to debug a situation where your variables have been silently changed without your knowledge.

Without context it can't know what the correct way to handle such a scenario would be. In our case we know the context. We want to know if two numbers are approximately equal to some value of precision.

Also to quote the BDFL:

Strong typing catches many bugs, but it also makes you focus too much on getting the types right and not enough on getting the rest of the program correct.

Instead of arguing this as its a difference of opinion, I will instead give some background on how the issue came up. I am working with a parser that can encode and decode data. The use of the decimal type is to provide a guarantee that the value parsed can be encoded back to the original data. How the user chooses to use the parsed value is entirely up to them. I already test that the encoding the decoded data and vice-versa produce the same results. The use of approx was to test that the decoded value is as expected.

kalekundert commented 3 years ago

It's worth mentioning again that Decimal explicitly allows comparisons with float:

>>> from decimal import Decimal
>>> Decimal(1) > 0.5
True
>>> Decimal(1) + 0.5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'decimal.Decimal' and 'float'

Since approx() only uses comparisons, it does make sense that this operation should succeed.

asottile commented 3 years ago

The use of the decimal type is to provide a guarantee that the value parsed can be encoded back to the original data. How the user chooses to use the parsed value is entirely up to them. I already test that the encoding the decoded data and vice-versa produce the same results. The use of approx was to test that the decoded value is as expected.

surely you should be testing the exact value if you have an expected behaviour that the exact value round-trips? approx is a poor choice to validate that

Since approx() only uses comparisons, it does make sense that this operation should succeed.

this isn't the case, approx() adjusts the value to create a range and then performs a comparison. moving a value by an epsilon is a key component of approx

labrys commented 3 years ago

The serialized form is not type-aware. If there is a decimal value in the data (note not Decimal) then the need for retaining the precision by using a float or a Decimal is unknown. By default I am using the Decimal class so that a round-trip will produce the same results. I already test exact values elsewhere. I am also fuzzing serialized data for additional testing. This came up during the testing of fuzzed data when I did not have an exact type available to me. I could potentially rewrite the fuzzing to expose the type, however I felt that use of approx would be cleaner and more maintainable.

Since approx() only uses comparisons, it does make sense that this operation should succeed.

I think this would be more accurately worded: "The purpose for approx() is to perform a comparison..."

this isn't the case, approx() adjusts the value to create a range and then performs a comparison. moving a value by an epsilon is a key component of approx

That approx moves a value by an epsilon is an implementation detail that the user should not be required to know.

kalekundert commented 3 years ago

this isn't the case, approx() adjusts the value to create a range and then performs a comparison. moving a value by an epsilon is a key component of approx

Yes, the tolerance is added to/subtracted from the expected value. So I agree that you shouldn't be able to use a Decimal tolerance with a float expected value, or vice versa. But only comparisons are made between the actual value and the tolerance-adjusted expected value, so there's no reason why the example given in this issue shouldn't work:

>>> decimal.Decimal('3.14159') == pytest.approx(math.pi)

This is more-or-less equivalent to:

>>> math.pi - 1e-5 < Decimal('3.14159') < math.pi + 1e-5
True
labrys commented 3 years ago

The docstring for approx also has examples of comparing different types:

Both the relative and absolute tolerances can be changed by passing arguments to the ``approx`` constructor::

1.0001 == approx(1) False

And for a ``numpy`` array against a scalar::

import numpy as np # doctest: +SKIP np.array([0.1, 0.2]) + np.array([0.2, 0.1]) == approx(0.3) # doctest: +SKIP True

asottile commented 3 years ago

This is more-or-less equivalent to:

it is in the common case but not always (due to overridden operators). the numpy types are a good example of exactly this

asottile commented 3 years ago

also it wouldn't work for the same precision reasons that decimals are useful:

>>> d = decimal.Decimal('11111111111111111.')
>>> d == 11111111111111111.
False
>>> 11111111111111111. - .1 <= d <= 11111111111111111. + .1
False
>>> d == decimal.Decimal('11111111111111111.')
True
>>> decimal.Decimal('11111111111111111.') - decimal.Decimal('.1') <= d <= decimal.Decimal('11111111111111111.') + decimal.Decimal('.1')
True
labrys commented 3 years ago

@asottile which is why I turned to approx to give me a reasonable implementation without having to roll my own with all associated tests to validate it.

asottile commented 3 years ago

right, what I'm saying is you'd have to do this anyway:

pytest.approx(decimal.Decimal('whatevervalue'), rel=decimal.Decimal('whateverrel'))

which already works (because floats can't help you here)

kalekundert commented 3 years ago

I just opened a PR to close this issue. @asottile was right that it's not as simple as just using the comparison operators instead of the subtraction operators, although numpy in fact has nothing to do with it. The real problem is complex numbers, which don't implement the comparison operators, but do implement the sutraction and absolute value operators. For that reason, approx is currently implemented as abs(expected - actual) < tol. This expression has to remain, because there's no other way to support complex numbers. But we can fall back on the equivalent expected - tol <= actual <= expected + tol if the first expression raises a TypeError.

Saying "floats can't help you here" is hyperbolic. Floats can help in the vast majority of cases where the tolerance is larger than the precision of the expected value. That includes the example motivating this issue. What it comes down to for me is that python supports comparisons between Decimal and float, so the fact that pytest doesn't is confusing and should be fixed.

asottile commented 3 years ago

in my opinion "sometimes working" is worse as it gives users a false sense of correctness (and can hide bugs)

labrys commented 3 years ago

The fact that correctly implementing this is non-trivial is exactly why I think that it should be supported. If these types of comparisons were simple everyone would just write assert a == b and be done with it and we wouldn't need approx, math.isclose or any of the other variants. From an English language perspective none of the following would raise an eye:

0.9999999 is approximately 1
0.6666666 is approximately 2/3
2/3 + 1/3 is approximately 1
1100 is approximately 1000

While they might lend one to ask "How precisely?", none of those require you to ask "Are they floats or decimals?".

Yet in pytest:

import decimal
import fractions

import pytest

assert 0.9999999 == pytest.approx(1)    # True
assert 0.6666666 == pytest.approx(2/3)  # True
assert 2/3 + 1/3 == pytest.approx(1)    # True

assert fractions.Fraction(0.9999999) == pytest.approx(1)    # True
assert fractions.Fraction(0.6666666) == pytest.approx(2/3)  # True
assert fractions.Fraction(2, 3) + fractions.Fraction(1, 3) == pytest.approx(1)  # True

assert decimal.Decimal('0.9999999') == pytest.approx(1)  # TypeError
assert decimal.Decimal('0.9999999') == pytest.approx(0.9999999)  # TypeError

As @RonnyPfannschmidt mentioned, I think warning when comparing a decimal to a non-decimal is perfectly valid and probably even expected behavior. Additionally if any type coercion is required it could also be expressed in approx's __repr__ for more complete information as to whats being coerced.

For example I think an approach where the expected, rel, and abs args are coerced to Decimals with associated warning and a __repr__ that shows that coercion would make sense. Additionally, if so desired, an optional kwarg such as strict=True could be added that would maintain the current TypeError unless explicitly relaxed. This has the added benefit of maintaining the current rigorous definition to avoid breaking any existing usages while allowing for an opt-in loss of precision to continue to make approx an approachable tool for testing.

kalekundert commented 3 years ago

@asottile That's a good example, but it's not specific to comparisons between decimals and floats. You can get the same kind of errors with the current implementation of approx() using just floats:

>>> 1e17 + 2 == pytest.approx(1e17, abs=1)
True

And just decimals:

>>> Decimal('1e28') + 2 == pytest.approx(Decimal('1e28'), abs=1)
True

This problem boils down to the fact that if the tolerance is below the precision of the expected value, the approximate comparison becomes a direct equality comparison, with all the pitfalls that entails. If this really bothers you, we could eliminate this behavior by checking that expected + tol != expected. But my view is that it doesn't make sense to use approx() with tolerances at the limits of floating point precision, so worrying about such cases is a waste of time.

@labrys There's no need for any warning. Python allows comparisons between floats and decimal without warning or error, so pytest should do the same.