davidhewitt / pythonize

MIT License
198 stars 28 forks source link

Relax input requirements in deserialize_str #40

Open ShaneMurphy2 opened 1 year ago

ShaneMurphy2 commented 1 year ago

Instead of attempting to downcast the input to a PyString, rely on the fact that for str objects, calling __str__ is a no-op, while for other objects it may allow us to support neat use cases.

An example of one such use case is turning Python's uuid.UUID into Rust's uuid::Uuid. If you call Uuid::depythonize you end up in \<Uuid as Deserialize>::deserialize (if you enabled its serde feature). Since Depythonizer doesn't override Deserialize::is_human_readable, the implementation tries to deserialize with deserialize_str. At that point, the Deserializer has a value of type uuid.UUID (actually PyAny, but it really is uuid.UUID), and the visitor expects a value of type &str. We can make that happen by using str.

ShaneMurphy2 commented 1 year ago

Opened this PR but I'm not sure this is the right way to do this, or if it's even broadly useful. Wanted to get some feedback.

Generally, I think this approach is sound, as it mirrors the way deserialize_bool relies on is_true as opposed to a strict cast.

An alternate idea I had would be to maintain a list of "known types" that we want to stringify, but that could get expensive.

ShaneMurphy2 commented 1 year ago

I just realized there's no way to implement the serialization half of this, since there is nothing in the serde data model that tells us the original Rust type was a Uuid. This might still be a good change, but I'll wait for feedback.

davidhewitt commented 1 year ago

Hello, thanks for the PR!

So I think this use-case is reasonable but I'm a bit nervous about introducing this coercion globally. Instead would you be prepared to amend this PR to add a depythonize_custom API which takes a setting to enable this behaviour? That way we can have the default be unchanged.

The is_true comparison you draw above is incorrect IIRC - I believe PyAny::is_true() only checks for the True singleton rather than truthy objects.

ShaneMurphy2 commented 1 year ago

Hello, thanks for the PR!

:)

So I think this use-case is reasonable but I'm a bit nervous about introducing this coercion globally. Instead would you be prepared to amend this PR to add a depythonize_custom API which takes a setting to enable this behaviour? That way we can have the default be unchanged.

Totally, I was considering feature flags/custom for this.

The is_true comparison you draw above is incorrect IIRC - I believe PyAny::is_true() only checks for the True singleton rather than truthy objects.

The docs say "This is equivalent to the Python expression bool(self)".

davidhewitt commented 1 year ago

The docs say "This is equivalent to the Python expression bool(self)".

Right you are; I had peeked at the implementation and mixed up PyObject_IsTrue (which is equivalent to the above), versus Py_IsTrue (which would be equivalent to self is True).

ShaneMurphy2 commented 1 year ago

I did some further pondering about the usefulness of this change with regards to UUIDs. I got wrapped around the axle with serde, pythonize, and {To/From}PyObject. Preserving non serde data model types through serialization isn't really in scope for pythonize I think. Really what I was looking for was a custom implementation of FromPyObject.

That being said, I think that being more permissive with what is considered a str coming from Python is sensible.