labrys opened this issue 3 years ago
Nobody is working on it; this one requires a specification of the intended behavior of decimal vs. float in approximation ranges.
the problem triggered is that
>>> decimal.Decimal(2) - .4
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'decimal.Decimal' and 'float'
The subtraction happens as part of the tolerance calculation behind the == comparison. So even if we add support for this, there should be a warning that we coerced structurally different number types to make the approximation work. Basically, we need to ensure that expected and actual are of the same type, and warn when we have to coerce in the case of Decimal.
This can be set up with a simple unit test that puts the assertion into either one of the existing approx tests, or a new one.
my 2c: it seems odd to test the "approximate" value of an exact value (decimal) -- approx is meant to deal with floating point precision which isn't a problem with decimal
My understanding is that Decimal is still a floating-point representation, just in decimal space rather than binary space. So things like 0.1 + 0.2 == 0.3 won't have rounding issues, but numbers that can't be represented exactly as decimals still could. I haven't looked for any examples, though.
@asottile I primarily think we ought to support it in some way, as we loosened up the strictness of approx to make it an approachable general tool. We should definitely warn when we approx-compare floats vs. Decimals, but we shouldn't just plainly error out.
My understanding is that Decimal is still a floating-point representation
this understanding is incorrect. A Decimal represents an exact value with arbitrary precision (like an integer). Floating point is an approximation, not an exact value.
we should definitely warn
I think we should error, expecting an approximate value from an exact representation is a programming mistake (we should also error for integers, but that's another thing entirely)
Here's an example of floating-point rounding error with the Decimal class:
>>> x = Decimal(1) / Decimal(3)
>>> x + x + x
Decimal('0.9999999999999999999999999999')
I don't think there should be errors or warnings for comparing float with Decimal. approx is just saying that one number should be near another, and there's nothing conceptually unclear about those two numbers having different underlying representations. To put it another way, Python allows seamless comparisons between float and Decimal, so I think it makes sense to mimic that.
float's epsilon makes no sense in the context of Decimal (as Decimals have arbitrary precision). Fortunately you can still use approx with Decimals, so I guess this just needs docs:
>>> decimal.Decimal('.998') == pytest.approx(decimal.Decimal('0.999'), rel=decimal.Decimal('.01'))
True
>>> decimal.Decimal('.998') == pytest.approx(decimal.Decimal('0.999'), rel=decimal.Decimal('.0001'))
False
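An absolute tolerance given as a Decimal appears to work the same way (a quick check mirroring the rel= examples above; the behavior is assumed to follow the same code path):

```python
import decimal

import pytest

# abs= tolerances expressed as Decimal, analogous to the rel= examples above.
assert decimal.Decimal("0.998") == pytest.approx(
    decimal.Decimal("0.999"), abs=decimal.Decimal("0.01")
)
assert decimal.Decimal("0.998") != pytest.approx(
    decimal.Decimal("0.999"), abs=decimal.Decimal("0.0001")
)
```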
@asottile I would argue that the purpose of approx is to provide a clean, pythonic method of testing that one value is approximately equal to another value (to some precision, whether implied, absolute, or relative), and that Python, being a duck-typed language, should not expect type rigidity.
@labrys Python has strong types, as evidenced by the TypeError coming from Python's standard library here
A TypeError coming from the standard library is not because of a desire for strong typing, but because it does not have the context to know how you should handle this scenario.
Edit: Or to put it another way, the strong typing is there to enable a correct implementation of dynamic typing.
sorry but that's nonsense, Decimal intentionally forbids operations involving floats because they are strong, separate types and combining them would be nonsense
No, it forbids operations involving floats because you would need to opt in to the behavior to avoid accidentally invalidating the precision of a decimal. To quote your link:
The advantage to a strongly typed language is that you can trust what's going on: if you do something wrong, your program will generate a type error telling you where you went wrong, and you don't have to memorize a lot of arcane type-conversion rules or try to debug a situation where your variables have been silently changed without your knowledge.
Without context it can't know what the correct way to handle such a scenario would be. In our case we know the context. We want to know if two numbers are approximately equal to some value of precision.
Also to quote the BDFL:
Strong typing catches many bugs, but it also makes you focus too much on getting the types right and not enough on getting the rest of the program correct.
Instead of arguing this, as it's a difference of opinion, I will instead give some background on how the issue came up. I am working with a parser that can encode and decode data. The use of the Decimal type is to guarantee that a parsed value can be encoded back to the original data. How the user chooses to use the parsed value is entirely up to them. I already test that encoding the decoded data and vice versa produces the same results. The use of approx was to test that the decoded value is as expected.
It's worth mentioning again that Decimal explicitly allows comparisons with float:
>>> from decimal import Decimal
>>> Decimal(1) > 0.5
True
>>> Decimal(1) + 0.5
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'decimal.Decimal' and 'float'
Since approx() only uses comparisons, it does make sense that this operation should succeed.
The use of the decimal type is to provide a guarantee that the value parsed can be encoded back to the original data. How the user chooses to use the parsed value is entirely up to them. I already test that the encoding the decoded data and vice-versa produce the same results. The use of approx was to test that the decoded value is as expected.
surely you should be testing the exact value if you have an expected behaviour that the exact value round-trips? approx is a poor choice to validate that
Since approx() only uses comparisons, it does make sense that this operation should succeed.
this isn't the case, approx() adjusts the value to create a range and then performs a comparison. Moving a value by an epsilon is a key component of approx.
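The range-building step described above can be sketched like this (illustrative only, not pytest's actual implementation; the default tolerances are assumptions):

```python
def approx_range(expected, rel=1e-6, abs_tol=1e-12):
    # Widen the expected value by the larger of the relative and absolute
    # tolerances; approx then effectively tests range membership.
    tol = max(rel * abs(expected), abs_tol)
    return expected - tol, expected + tol


lo, hi = approx_range(1.0)
assert lo <= 1.0000005 <= hi      # within the default relative tolerance
assert not (lo <= 1.01 <= hi)     # well outside it
```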
The serialized form is not type-aware. If there is a decimal value in the data (note: not Decimal) then the need for retaining precision by using a float or a Decimal is unknown. By default I am using the Decimal class so that a round trip will produce the same results. I already test exact values elsewhere. I am also fuzzing serialized data for additional testing. This came up during the testing of fuzzed data, when I did not have an exact type available to me. I could potentially rewrite the fuzzing to expose the type, however I felt that use of approx would be cleaner and more maintainable.
Since approx() only uses comparisons, it does make sense that this operation should succeed.
I think this would be more accurately worded: "The purpose of approx() is to perform a comparison..."
this isn't the case, approx() adjusts the value to create a range and then performs a comparison. moving a value by an epsilon is a key component of approx
That approx moves a value by an epsilon is an implementation detail that the user should not be required to know.
this isn't the case, approx() adjusts the value to create a range and then performs a comparison. moving a value by an epsilon is a key component of approx
Yes, the tolerance is added to/subtracted from the expected value. So I agree that you shouldn't be able to use a Decimal tolerance with a float expected value, or vice versa. But only comparisons are made between the actual value and the tolerance-adjusted expected value, so there's no reason why the example given in this issue shouldn't work:
>>> decimal.Decimal('3.14159') == pytest.approx(math.pi)
This is more-or-less equivalent to:
>>> math.pi - 1e-5 < Decimal('3.14159') < math.pi + 1e-5
True
The docstring for approx also has examples of comparing different types:
Both the relative and absolute tolerances can be changed by passing arguments to the ``approx`` constructor::

>>> 1.0001 == approx(1)
False
And for a ``numpy`` array against a scalar::

>>> import numpy as np  # doctest: +SKIP
>>> np.array([0.1, 0.2]) + np.array([0.2, 0.1]) == approx(0.3)  # doctest: +SKIP
True
This is more-or-less equivalent to:
it is in the common case, but not always (due to overridden operators). The numpy types are a good example of exactly this.
Also, it wouldn't work, for the same precision reasons that make decimals useful:
>>> d = decimal.Decimal('11111111111111111.')
>>> d == 11111111111111111.
False
>>> 11111111111111111. - .1 <= d <= 11111111111111111. + .1
False
>>> d == decimal.Decimal('11111111111111111.')
True
>>> decimal.Decimal('11111111111111111.') - decimal.Decimal('.1') <= d <= decimal.Decimal('11111111111111111.') + decimal.Decimal('.1')
True
@asottile which is why I turned to approx: to get a reasonable implementation without having to roll my own, with all the associated tests to validate it.
right, what I'm saying is you'd have to do this anyway:
pytest.approx(decimal.Decimal('whatevervalue'), rel=decimal.Decimal('whateverrel'))
which already works (because floats can't help you here)
I just opened a PR to close this issue. @asottile was right that it's not as simple as just using the comparison operators instead of the subtraction operators, although numpy in fact has nothing to do with it. The real problem is complex numbers, which don't implement the comparison operators but do implement the subtraction and absolute-value operators. For that reason, approx is currently implemented as abs(expected - actual) < tol. This expression has to remain, because there's no other way to support complex numbers. But we can fall back on the equivalent expected - tol <= actual <= expected + tol if the first expression raises a TypeError.
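The fallback described above can be sketched like this (a simplification of the PR's approach, not pytest's actual code; the tolerance is assumed to be already resolved to a single value):

```python
import decimal
import math


def approx_eq(actual, expected, tol):
    """Sketch of the fallback: try the subtraction form first (needed for
    complex numbers, which have no ordering), then fall back to a pure
    range comparison when mixed types forbid subtraction."""
    try:
        return abs(expected - actual) <= tol
    except TypeError:
        # e.g. Decimal vs. float: subtraction raises, but ordering works.
        return expected - tol <= actual <= expected + tol


# The motivating example from this issue:
assert approx_eq(decimal.Decimal("3.14159"), math.pi, 1e-5)
# Complex numbers still go through the subtraction branch:
assert approx_eq(1 + 1j, 1.0000001 + 1j, 1e-3)
```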
Saying "floats can't help you here" is hyperbolic. Floats can help in the vast majority of cases, where the tolerance is larger than the precision of the expected value. That includes the example motivating this issue. What it comes down to for me is that Python supports comparisons between Decimal and float, so the fact that pytest doesn't is confusing and should be fixed.
in my opinion "sometimes working" is worse as it gives users a false sense of correctness (and can hide bugs)
The fact that correctly implementing this is non-trivial is exactly why I think it should be supported. If these types of comparisons were simple, everyone would just write assert a == b and be done with it, and we wouldn't need approx, math.isclose, or any of the other variants. From an English-language perspective, none of the following would raise an eye:
0.9999999 is approximately 1
0.6666666 is approximately 2/3
2/3 + 1/3 is approximately 1
1100 is approximately 1000
While they might lend one to ask "How precisely?", none of those require you to ask "Are they floats or decimals?".
Yet in pytest:
import decimal
import fractions
import pytest
assert 0.9999999 == pytest.approx(1) # True
assert 0.6666666 == pytest.approx(2/3) # True
assert 2/3 + 1/3 == pytest.approx(1) # True
assert fractions.Fraction(0.9999999) == pytest.approx(1) # True
assert fractions.Fraction(0.6666666) == pytest.approx(2/3) # True
assert fractions.Fraction(2, 3) + fractions.Fraction(1, 3) == pytest.approx(1) # True
assert decimal.Decimal('0.9999999') == pytest.approx(1) # TypeError
assert decimal.Decimal('0.9999999') == pytest.approx(0.9999999) # TypeError
As @RonnyPfannschmidt mentioned, I think warning when comparing a decimal to a non-decimal is perfectly valid and probably even expected behavior. Additionally, if any type coercion is required, it could also be expressed in approx's __repr__ for more complete information as to what's being coerced.
For example, I think an approach where the expected, rel, and abs args are coerced to Decimals, with an associated warning and a __repr__ that shows the coercion, would make sense. Additionally, if so desired, an optional kwarg such as strict=True could be added that would maintain the current TypeError unless explicitly relaxed. This has the added benefit of maintaining the current rigorous definition, to avoid breaking any existing usages, while allowing an opt-in loss of precision to keep approx an approachable tool for testing.
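The coercion idea sketched above might look something like this (a hypothetical helper, not an actual pytest API; the name and warning text are made up):

```python
import decimal
import warnings


def coerce_pair(expected, actual):
    # Hypothetical: when exactly one side is a Decimal, convert the float
    # side to Decimal (exactly, from its binary value) and warn about it.
    exp_dec = isinstance(expected, decimal.Decimal)
    act_dec = isinstance(actual, decimal.Decimal)
    if exp_dec != act_dec:
        warnings.warn("approx: coercing float operand to Decimal")
        if not exp_dec:
            expected = decimal.Decimal(expected)
        else:
            actual = decimal.Decimal(actual)
    return expected, actual


# After coercion, the subtraction no longer raises TypeError:
e, a = coerce_pair(decimal.Decimal(2), 0.4)
assert isinstance(a, decimal.Decimal)
assert e - a == decimal.Decimal(2) - decimal.Decimal(0.4)
```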
@asottile That's a good example, but it's not specific to comparisons between decimals and floats. You can get the same kind of errors with the current implementation of approx() using just floats:
>>> 1e17 + 2 == pytest.approx(1e17, abs=1)
True
And just decimals:
>>> Decimal('1e28') + 2 == pytest.approx(Decimal('1e28'), abs=1)
True
This problem boils down to the fact that if the tolerance is below the precision of the expected value, the approximate comparison becomes a direct equality comparison, with all the pitfalls that entails. If this really bothers you, we could eliminate this behavior by checking that expected + tol != expected. But my view is that it doesn't make sense to use approx() with tolerances at the limits of floating point precision, so worrying about such cases is a waste of time.
@labrys There's no need for any warning. Python allows comparisons between floats and decimal without warning or error, so pytest should do the same.
description

When testing the approximate value of a decimal.Decimal, the following TypeError occurs:

TypeError: unsupported operand type(s) for -: 'float' and 'decimal.Decimal'

I would expect pytest.approx to test for approximate equality with a floating-point value without having to either convert the decimal to a float or test explicitly against another decimal.

pip list

pytest and os versions

pytest: 6.2.2
os: Windows 10

minimal example