jcrist / msgspec

A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML
https://jcristharif.com/msgspec/
BSD 3-Clause "New" or "Revised" License

decode's strict=False does not cast floats to ints #613

Closed: thomasjpfan closed this issue 6 months ago

thomasjpfan commented 6 months ago

Description

On msgspec==0.18.4, strict=False does not cast floats to ints:

import msgspec

msgspec.json.decode(b'[1, 2, 3.0]', type=list[int], strict=False)
---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
Cell In[38], line 3
      1 import msgspec
----> 3 msgspec.json.decode(b'[1, 2, 3.0]', type=list[int], strict=False)

ValidationError: Expected `int`, got `float` - at `$[2]`

Is this expected behavior with strict=False?

jcrist commented 6 months ago

Currently, yeah. I could see adding support for coercing floats without a decimal component to integers though if needed (so 3.0 would be valid, but 3.5 would error). Can you say more about your use case here?
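
To make the proposed rule concrete, here is a minimal pure-Python sketch of "accept floats without a decimal component, reject the rest". The helper name `coerce_integral_float` is hypothetical and is not part of msgspec; the sketch only illustrates the semantics described above and does not show how msgspec implements them.

```python
def coerce_integral_float(value):
    """Hypothetical helper: return an int for ints and whole-number floats."""
    # bool is a subclass of int, so reject it explicitly to avoid
    # treating True/False as numbers.
    if isinstance(value, bool):
        raise TypeError("Expected `int`, got `bool`")
    if isinstance(value, int):
        return value
    # float.is_integer() is True only when there is no fractional part,
    # so 3.0 passes and 3.5 does not.
    if isinstance(value, float) and value.is_integer():
        return int(value)
    raise TypeError(f"Expected `int`, got `{type(value).__name__}`")


print(coerce_integral_float(3.0))  # 3

try:
    coerce_integral_float(3.5)
except TypeError as exc:
    print(exc)  # Expected `int`, got `float`
```

Raising on 3.5 rather than truncating it keeps the coercion lossless: a value is only converted when no information is dropped.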

thomasjpfan commented 6 months ago

I'm ingesting data in a format that deserializes all numerical values to floats, even though the original type was an int. Since the original data were ints, the floats end up being whole numbers: 1.0, 6.0, etc.

At a high level, the data serialization is:

  1. Original data in Python that contain ints
  2. Serialize Python data into protobuf (the ints are preserved in this format)
  3. Deserialize the custom format into JSON (where the ints get converted into floats)
  4. Use msgspec to convert the JSON back into a Python type

I suspect there is a bug in step 3, so I likely do not need msgspec to cast floats into ints.
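
If step 3 cannot be fixed upstream, one possible stopgap is to normalize whole-number floats while the JSON text is parsed, before the data reaches a typed decoder. The sketch below uses only the standard library's `json.loads` and its `parse_float` hook; the function name `parse_whole_float` and the sample payload are assumptions made for illustration, not part of the pipeline described above.

```python
import json


def parse_whole_float(text):
    """Hypothetical hook: turn JSON float literals like "3.0" into ints."""
    number = float(text)
    # Only convert when there is no fractional part; 4.5 stays a float.
    return int(number) if number.is_integer() else number


# Assumed sample payload resembling the output of step 3.
raw = '[1, 2, 3.0, 4.5]'

# json.loads calls parse_float with the source text of every JSON float.
decoded = json.loads(raw, parse_float=parse_whole_float)
print(decoded)  # [1, 2, 3, 4.5]
```

The hook applies to every float literal in the document, so this only suits payloads where all whole-number floats are meant to be ints.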

jcrist commented 6 months ago

I've added support for this in #619. Even though you said this likely wasn't necessary, it felt like something we could/should support (and supporting it was easy to do). Thanks for raising the issue!