FeldrinH opened 5 months ago
PS: There are a number of similar issues with Decimal decoding that are marked as resolved (https://github.com/pydantic/pydantic/issues/6807, https://github.com/pydantic/pydantic/issues/6295). As far as I can tell, they are distinct from this issue: most importantly, those issues are fixed, whereas this one is still present in the latest Pydantic version.
@FeldrinH, thanks for reporting. This definitely looks like a bug, and I'm guessing it will have to be fixed in pydantic-core. Adding this to our 2.8 milestone and marking it as a good first issue for anyone interested!
I can take this one.
As an aside, the following test cases pass:
```python
import json
from decimal import Decimal

import pytest
from pydantic import BaseModel


@pytest.mark.parametrize(
    'value',
    [
        Decimal(1.234567890123456789012345678901234567890),
        Decimal(12345678901234567890123456789012345678.9),
        Decimal(1) / Decimal(7),
    ],
)
def test_long_decimal_decoding(value: Decimal) -> None:
    """
    Really large decimal values should not be lost when encoding or
    decoding from JSON (or other input formats).
    """
    class Obj(BaseModel):
        value: Decimal

    m = Obj.model_validate_json(json.dumps({"value": value.real}, default=str))
    assert m.value.real == value
```
But if the values are provided as raw floats rather than Decimals, they fail. Something to note.
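The raw-float failure can be reproduced with the standard library alone, independent of pydantic (a minimal sketch): `json.loads` parses every JSON number with a fractional part into a Python float, so the extra digits are gone before any Decimal is ever constructed, whereas the same value encoded as a JSON string survives intact.

```python
import json
from decimal import Decimal

# A JSON number with more digits than a double can hold.
raw_number = '{"value": 1.234567890123456789012345678901234567890}'

# json.loads parses JSON numbers into Python floats by default,
# so precision is lost at parse time, before Decimal is involved.
as_float = json.loads(raw_number)["value"]
print(type(as_float).__name__)  # float

# The same value transported as a JSON string keeps every digit.
raw_string = '{"value": "1.234567890123456789012345678901234567890"}'
as_str = json.loads(raw_string)["value"]
print(Decimal(as_str))
```

This matches the test above: `default=str` serializes the Decimal as a JSON string, which is why those cases pass.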
Found where the behavior is caused:

```python
primitive_schema = core_schema.union_schema(
    [
        # if it's an int keep it like that and pass it straight to Decimal
        # but if it's not make it a string
        # we don't use JSON -> float because parsing to any float will cause
        # loss of precision
        core_schema.int_schema(strict=True),  # <--------------------- HERE
        core_schema.str_schema(strict=True, strip_whitespace=True),
        core_schema.no_info_plain_validator_function(str),
    ],
)
```
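In other words, a JSON integer flows through `int_schema` losslessly, but a JSON float has already been parsed into a Python double by the time the `str` fallback runs, so stringifying can only reproduce the rounded value. The effect of that fallback can be sketched with a hypothetical helper (not pydantic's actual code):

```python
from decimal import Decimal


def fallback_to_str(value: object) -> Decimal:
    """Hypothetical sketch of the union's fallback path described above."""
    if isinstance(value, int):
        # Ints pass straight through to Decimal without loss.
        return Decimal(value)
    # A JSON float was already rounded to ~17 significant digits when it
    # became a Python float, so str() cannot recover the lost digits.
    return Decimal(str(value))


print(fallback_to_str(12345678901234567890))        # exact
print(fallback_to_str(1.234567890123456789012345))  # truncated to float precision
```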
Also, this test shows the lossiness in reverse, on JSON encoding too (model definition added here so the snippet is runnable):

```python
from decimal import Decimal

from pydantic import BaseModel


class Test(BaseModel):
    value: Decimal


m = Test(value=Decimal(1.234567890123456789012345678901234567890))
print(m.model_dump_json())
# expected output: {"value":"1.234567890123456789012345678901234567890"}
# actual output: {"value":"1.2345678901234566904321354741114191710948944091796875"}
```
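For what it's worth, the standard library can already parse JSON floats into Decimal losslessly via the `parse_float` hook. This is only a possible direction, not necessarily what the linked PR does:

```python
import json
from decimal import Decimal

# parse_float receives the raw digit string from the JSON source,
# so Decimal is constructed directly, skipping the lossy float step.
data = json.loads(
    '{"value": 1.234567890123456789012345678901234567890}',
    parse_float=Decimal,
)
print(data["value"])  # prints 1.234567890123456789012345678901234567890
```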
Alright, got a solution going. Need some help with the deserialization component. https://github.com/pydantic/pydantic/pull/9291
Welp! That was on a previous release of pydantic. Back to square one, working through it.
At PyCon 2024, looking at this now. Looks like PR #9292 stalled with comments, so I'll see if I can get it finished.
Initial Checks
Description
When decoding a JSON number into a Python Decimal, the precision seems to be limited: after a certain number of digits, the value is cut off. This is something I would expect for a fixed-precision float, but not for an arbitrary-precision Decimal. It only happens with numbers that contain a decimal point. I assume the number is internally converted to a float and then to a Decimal.
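That hypothesized float round-trip is easy to confirm with the standard library alone (a minimal sketch): a Decimal constructed from a float inherits the float's binary rounding, while one constructed from the original string is exact.

```python
from decimal import Decimal

# A Decimal built from a float exposes the float's binary representation...
print(Decimal(1.1))    # 1.100000000000000088817841970012523233890533447265625
# ...while a Decimal built from the original string is exact.
print(Decimal("1.1"))  # 1.1
```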
As a user, this lossy internal conversion is an unexpected and unwelcome surprise. Both the initial JSON string and the final Decimal can contain the full precision of the value without loss, so I would have expected the conversion to be lossless as well.
PS: I'm not sure if this is a bug per se because I could not find any documentation that explicitly states what the expected behavior is. However, based on what was written in the documentation it certainly was unexpected to me.
Example Code
Python, Pydantic & OS Version