davidhewitt / pythonize

MIT License

IntoPy and FromPyObject integration #1

Open davidhewitt opened 4 years ago

davidhewitt commented 4 years ago

As currently designed, pythonize and depythonize always go through serde: a struct that implements both Serialize and IntoPy is converted using its Serialize implementation, and likewise Deserialize is used in preference to FromPyObject.

I'm not sure if that's surprising - it's certainly predictable though.

Open point of discussion as to whether pythonize should attempt to re-use existing IntoPy implementations, and if so, how could this be implemented?

1tgr commented 3 years ago

I had a crate (closed source unfortunately) that does the same as pythonize. The idea is to opt in to custom serialization/deserialization.

I exposed a choice as there are pros and cons with each:

  1. A module that can be used at the field level with #[serde(with = "as_py_object")]
  2. A SerdePyObject<T> wrapper which has its own Serialize and Deserialize, implemented in terms of the same as_py_object module

(1) is transparent on the struct as it requires no changes to field types, whereas (2) is finer grained and can be applied to things like HashMap<String, SerdePyObject<MyThing>>.
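
To illustrate why the wrapper in (2) composes inside containers, here is a minimal std-only sketch. It does not use serde or PyO3: `Value`, `ToValue`, `ToNative`, and `NativeWrapper` are hypothetical stand-ins for a Python value, `Serialize`, `IntoPy`, and `SerdePyObject<T>` respectively, so treat it as a toy model of the idea rather than the closed-source implementation.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a converted Python value (assumption, not a PyO3 type).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    /// Generic data-model output, e.g. an enum flattened to its variant name.
    Str(String),
    /// A value that kept its native, class-backed identity.
    Native(String),
    Dict(BTreeMap<String, Value>),
}

/// Stand-in for `serde::Serialize`: the generic conversion path.
trait ToValue {
    fn to_value(&self) -> Value;
}

/// Stand-in for PyO3's `IntoPy`: the type-specific conversion path.
trait ToNative {
    fn to_native(&self) -> Value;
}

/// Hypothetical wrapper: its generic conversion delegates to the native one.
struct NativeWrapper<T>(T);

impl<T: ToNative> ToValue for NativeWrapper<T> {
    fn to_value(&self) -> Value {
        self.0.to_native()
    }
}

/// Containers only ever call the generic path on their elements.
impl<T: ToValue> ToValue for BTreeMap<String, T> {
    fn to_value(&self) -> Value {
        Value::Dict(self.iter().map(|(k, v)| (k.clone(), v.to_value())).collect())
    }
}

enum MyThing {
    Foo,
}

impl ToValue for MyThing {
    fn to_value(&self) -> Value {
        match self {
            MyThing::Foo => Value::Str("Foo".to_owned()),
        }
    }
}

impl ToNative for MyThing {
    fn to_native(&self) -> Value {
        match self {
            MyThing::Foo => Value::Native("MyThing.Foo".to_owned()),
        }
    }
}
```

In this model, a `BTreeMap<String, MyThing>` converts its values to `Value::Str`, while a `BTreeMap<String, NativeWrapper<MyThing>>` yields `Value::Native`, with no change to the container's own conversion code. The opt-in is visible in the field type, which is the trade-off against the attribute-based option.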

In terms of implementation, serialization works as follows:

  1. Borrow a pointer from the PyObject
  2. Write a struct with a specially named [u8] field like { "__pyobject_ptr": [ /* 8 pointer bytes go here */ ] }

Deserialization works by looking for the specially named field and treating it as a pointer.
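
The pointer round trip described above can be sketched with the standard library alone. This is an assumption-laden model, not the original crate: `Rc<T>` stands in for a reference-counted Python object, `Encoded` stands in for the serialized struct, and `encode_ptr` / `decode_ptr` are hypothetical names. The byte array is `size_of::<usize>()` long, which is the 8 bytes mentioned above on 64-bit targets.

```rust
use std::rc::Rc;

/// The specially named field from the description above.
const PTR_FIELD: &str = "__pyobject_ptr";

/// Toy stand-in for the serialized form: one named byte field.
#[derive(Debug, PartialEq)]
struct Encoded {
    field: &'static str,
    bytes: [u8; std::mem::size_of::<usize>()],
}

/// Serialization: borrow the pointer and write its bytes under the marker name.
/// The reference count is left untouched, so the pointer is only borrowed.
fn encode_ptr<T>(obj: &Rc<T>) -> Encoded {
    let addr = Rc::as_ptr(obj) as usize;
    Encoded {
        field: PTR_FIELD,
        bytes: addr.to_ne_bytes(),
    }
}

/// Deserialization: look for the marker field and treat its bytes as a pointer.
///
/// # Safety
/// `encoded` must have been produced by `encode_ptr::<T>` in this process,
/// and the original `Rc<T>` must still be alive.
unsafe fn decode_ptr<T>(encoded: &Encoded) -> Option<Rc<T>> {
    if encoded.field != PTR_FIELD {
        return None;
    }
    let ptr = usize::from_ne_bytes(encoded.bytes) as *const T;
    unsafe {
        // Take a new strong reference instead of stealing the borrowed one.
        Rc::increment_strong_count(ptr);
        Some(Rc::from_raw(ptr))
    }
}
```

The sketch makes the main hazard explicit: the encoded bytes are only meaningful inside the same process and only while the original object is alive, which is why decoding has to be `unsafe`. Such a format works when the serializer and deserializer are both the in-process Python converter, and would be unsound if the bytes were written to disk or sent elsewhere.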

jmrgibson commented 2 years ago

I would certainly appreciate this. My use case is as follows:

  1. Define a rust enum that can be exposed to python using the pyclass attribute
  2. Include that enum in a rust struct
  3. Use pythonize to convert struct to python dict

Example:

use pyo3::prelude::*;
use serde::Serialize;

#[pyclass]
#[derive(Serialize)] // needed so that S can derive Serialize
enum E {
    Foo,
    Bar,
}

#[derive(Serialize)]
struct S {
    other_field: String,
    enum_field: E,
}

let s = S {
    other_field: "potato".to_owned(),
    enum_field: E::Foo,
};
pythonize::pythonize(py, &s)

Then in python, this results in a str for the enum field, which can't be compared with the strongly-typed E.Foo.

out = {'other_field': 'potato', 'enum_field': 'Foo'}

from rust_module import E
E.Foo == out['enum_field'] # False :(

davidhewitt commented 2 years ago

At the moment I'm too busy to look into this myself; contributions are welcome. The proposal by @1tgr sounds like it could work as an opt-in mechanism!

(I think to have something work automatically would require specialization, though I haven't thought too hard!)