danielgtaylor / python-betterproto

Clean, modern, Python 3.6+ code generator & library for Protobuf 3 and async gRPC
MIT License
1.5k stars 211 forks source link

to_dict() method converts integer field values to strings #36

Closed ahobsonsayers closed 1 month ago

ahobsonsayers commented 4 years ago

When using the to_dict method of a message, integer fields are converted to strings. This can be see on this line in the source code.

Why are integers converted to strings, while all other types are left as is? Is this a bug that i can open a PR for, or it deliberate? Any help is much appreciated as this seems like a bit odd behaviour

nat-n commented 4 years ago

That seems odd, any insight? @danielgtaylor

jameslan commented 4 years ago

For 64 bits int, spec requires it to be a string.

There was a discussion on protobuf repo.

But I do not quite understand the list handling of int64 types. @danielgtaylor

boukeversteegh commented 4 years ago

But I do not quite understand the list handling of int64 types

As far as I can see, the list implementation is equivalent to the single int64 type. It just converts all ints in the list to a string. Or is something else unclear about it?

jameslan commented 4 years ago

Oh that's for repeated. I forgot about it. Then there's no question about the code.

samschlegel commented 4 years ago

One thing I know we found confusing when looking into betterproto is that to_dict converts everything to the JSON representation, but that isn't really clearly documented. It makes sense with how to_json just json.dumps the output of to_dict, since without this conversion you'd have no way to ensure it matches the spec, but it's confusing when you're expected to_dict to return pythonic values like datetime or in this case an int and leads this behavior being interpreted as a bug

Maybe adding a separate to_python that would fairly simply convert to pythonic types, and then possibly rename to_dict to to_json_dict or something of the like would be a good feature request to clear up this confusion?

boukeversteegh commented 4 years ago

One thing I know we found confusing when looking into betterproto is that to_dict converts everything to the JSON representation, but that isn't really clearly documented. It makes sense with how to_json just json.dumps the output of to_dict, since without this conversion you'd have no way to ensure it matches the spec, but it's confusing when you're expected to_dict to return pythonic values like datetime or in this case an int and leads this behavior being interpreted as a bug

That makes a lot of sense!

I'm not entirely sure what the use-case is of to_dict, other than as a stepping stone for json output, but perhaps people use it for their own serialization methods? any insight into this @ahobsonsayers?

depending on how users use to_dict, it could either be useful to convert everything to low-level values, or actually keep it pythonic.

jhowarth commented 4 years ago

@boukeversteegh One use case is to convert protobuf messages into domain objects with an object mapper. Most object mappers take dictionaries as arguments.

v1b1 commented 3 years ago

Any updates for this? Currently on 2.0.0b2.

My particular use case imports YAML, validates each property based on the dataclass field types, does some processing based on the field name/type, and passes the output of to_dict(casing=...) to a third-party module. Right now I have to perform a post-processing step on the output of to_dict() by traversing the tree and converting specific values but I'd rather not have to maintain that code.

jonmather commented 3 years ago

Would also +1 this issue. There's a related issue on how integers should be parsed as keys for map types in from_dict() - currently all treated as strings #235

mbrancato commented 6 months ago

I know it was mentioned that the spec calls this out, but that either isn't true or was changed. The spec clear says that protobuf int64 values should be int/long in Python.

[4] 64-bit or unsigned 32-bit integers are always represented as long when decoded, but can be an int if an int is given when setting the field. In all cases, the value must fit in the type represented when set. See [2].
Gobot1234 commented 1 month ago

This has been fixed by adding to_pydict which will have the expected behaviour