matsadler / magnus

Ruby bindings for Rust. Write Ruby extension gems in Rust, or call Ruby from Rust.
https://docs.rs/magnus/latest/magnus/
MIT License

Working with nested structs #83

Closed: LukeMathWalker closed this issue 1 year ago

LukeMathWalker commented 1 year ago

Hi! I'm exploring the idea of offloading the serialization work in a Ruby application to a Rust native extension. Everything works fine for structs whose fields are primitive values and/or "standard" types (e.g. String). I start struggling when it comes to nested structs, e.g.:

#[magnus::wrap(class = "WireRepresentation")]
#[derive(serde::Serialize)]
struct WireRepresentation {
    address: Address,
    name: String
}

impl WireRepresentation {
    pub fn new(address: &Address, name: String) -> Self {
        Self {
            address: address.to_owned(),
            name
        }
    }
    pub fn to_json_str(&self) -> String {
        serde_json::to_string(&self).unwrap()
    }
}

#[magnus::wrap(class = "Address")]
#[derive(serde::Serialize, Clone)]
struct Address {
    street: String,
    number: u64,
}

// [...]

The above compiles, but it performs an address.to_owned() invocation that I'd like to avoid. The compiler nudged me in this direction: my first draft took Address by value in WireRepresentation::new, but that resulted in the following error message:

error[E0277]: the trait bound `Address: TryConvert` is not satisfied
    --> src/lib.rs:42:41
     |
42   |     wire.define_singleton_method("new", function!(WireRepresentation::new, 2))?;
     |                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `TryConvert` is not implemented for `Address`
     |
     = help: the following other types implement trait `TryConvert`:
               &T
               (T0, T1)
               (T0, T1, T2)
               (T0, T1, T2, T3)
               (T0, T1, T2, T3, T4)
               (T0, T1, T2, T3, T4, T5)
               (T0, T1, T2, T3, T4, T5, T6)
               (T0, T1, T2, T3, T4, T5, T6, T7)

Am I going in the right direction or is there a better way to handle this situation that doesn't require cloning the entire nested struct?

adampetro commented 1 year ago

You can avoid the address.to_owned() invocation by storing a magnus::typed_data::Obj<Address> wrapper instead of an Address. This means you'll need to implement magnus::DataTypeFunctions::mark and mark the Obj<Address>, so that the wrapped value isn't garbage collected while WireRepresentation still refers to it. Also, because Obj does not implement serde::Serialize, you can use #[serde(serialize_with)] and a helper function to serialize the data wrapped by the Obj.

Would something like this work?

#[derive(serde::Serialize, magnus::TypedData)]
#[magnus(class = "WireRepresentation", mark)]
struct WireRepresentation {
    #[serde(serialize_with = "serialize_obj")]
    address: magnus::typed_data::Obj<Address>,
    name: String,
}

impl magnus::DataTypeFunctions for WireRepresentation {
    fn mark(&self) {
        magnus::gc::mark(self.address);
    }
}

impl WireRepresentation {
    pub fn new(address: magnus::typed_data::Obj<Address>, name: String) -> Self {
        Self { address, name }
    }
    pub fn to_json_str(&self) -> String {
        serde_json::to_string(&self).unwrap()
    }
}

#[magnus::wrap(class = "Address")]
#[derive(serde::Serialize, Clone)]
struct Address {
    street: String,
    number: u64,
}

impl Address {
    fn new(street: String, number: u64) -> Self {
        Self { street, number }
    }
}

fn serialize_obj<T: serde::Serialize + magnus::TypedData, S: serde::Serializer>(
    obj: &magnus::typed_data::Obj<T>,
    serializer: S,
) -> Result<S::Ok, S::Error> {
    obj.get().serialize(serializer)
}

fn init_ruby() -> Result<(), magnus::Error> {
    let wire = magnus::define_class("WireRepresentation", Default::default())?;
    wire.define_singleton_method("new", magnus::function!(WireRepresentation::new, 2))?;
    wire.define_method(
        "to_json",
        magnus::method!(WireRepresentation::to_json_str, 0),
    )?;

    let address = magnus::define_class("Address", Default::default())?;
    address.define_singleton_method("new", magnus::function!(Address::new, 2))?;

    Ok(())
}

matsadler commented 1 year ago

Adam's example is probably your best path forward.

The wrap macro generates code so that, when a struct is returned to Ruby, Magnus uses Ruby's rb_data_typed_object_wrap API to create a Ruby object that points to that struct. In terms of Rust's ownership semantics, this effectively makes Ruby the owner of the struct. There's no good way to get the data back, because that would leave the Ruby object that owned the struct hollowed out, with nothing to point to. So Magnus only lets you get back a reference to a wrapped struct.

typed_data::Obj<T> is that pointer object, from which you can get a &T. Like a reference, it's Copy (as it's just a pointer), but you can treat Obj like an owned value as long as you make sure it's reachable by the GC. That makes it a bit more flexible, as you can see in Adam's example.
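
As a rough illustration of the ownership point above, here is a toy model in plain Rust. It does not use Magnus or Ruby at all: Heap, Handle, wrap, get, and describe are hypothetical names made up for this sketch, with Heap standing in for Ruby's object space and Handle standing in for typed_data::Obj<T>. It only aims to show why a Copy handle can be stored in several structs while the wrapped data itself can only be borrowed.

```rust
// Toy model only: this is NOT how Magnus or Ruby's GC is implemented.
// `Heap` plays the role of Ruby's object space, `Handle` the role of `Obj<T>`.

struct Address {
    street: String,
    number: u64,
}

/// Stand-in for `Obj<Address>`: just an index, so copying it is cheap.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Handle(usize);

/// Stand-in for Ruby's heap: once a value is wrapped, the heap owns it.
struct Heap {
    slots: Vec<Address>,
}

impl Heap {
    fn new() -> Self {
        Heap { slots: Vec::new() }
    }

    /// Moving a value in hands ownership to the heap, much like returning
    /// a wrapped struct to Ruby. The caller keeps only a handle.
    fn wrap(&mut self, value: Address) -> Handle {
        self.slots.push(value);
        Handle(self.slots.len() - 1)
    }

    /// Only a borrow comes back out; the heap stays the owner.
    fn get(&self, handle: Handle) -> &Address {
        &self.slots[handle.0]
    }
}

/// Holds a handle rather than an owned `Address`, so no clone is needed.
struct WireRepresentation {
    address: Handle,
    name: String,
}

/// Reads the nested data through the handle without taking ownership.
fn describe(heap: &Heap, wire: &WireRepresentation) -> String {
    let address = heap.get(wire.address);
    format!("{}: {} {}", wire.name, address.street, address.number)
}
```

In this model, wrapping a single Address and copying its Handle into two WireRepresentation values lets both read the same data without cloning it. The real Obj<T> additionally has to be marked so the GC keeps its target alive, which this sketch ignores entirely.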

The above example would look a little different if you were using Magnus from the main branch. So that you don't get tripped up in the future, here are the changes you would need to make with the next release of Magnus:

use magnus::{
    function, gc, method, prelude::*, typed_data::Obj, value::Opaque, DataTypeFunctions, Error,
    Ruby, TypedData,
};
use serde::{Serialize, Serializer};

#[derive(Serialize, TypedData)]
#[magnus(class = "WireRepresentation", mark)]
struct WireRepresentation {
    #[serde(serialize_with = "serialize_obj")]
    address: Opaque<Obj<Address>>,
    name: String,
}

impl DataTypeFunctions for WireRepresentation {
    fn mark(&self, marker: &gc::Marker) {
        marker.mark(self.address)
    }
}

impl WireRepresentation {
    pub fn new(address: Obj<Address>, name: String) -> Self {
        Self {
            address: address.into(),
            name,
        }
    }
    pub fn to_json_str(&self) -> String {
        serde_json::to_string(&self).unwrap()
    }
}

#[magnus::wrap(class = "Address")]
#[derive(Serialize, Clone)]
struct Address {
    street: String,
    number: u64,
}

impl Address {
    fn new(street: String, number: u64) -> Self {
        Self { street, number }
    }
}

fn serialize_obj<T: Serialize + TypedData, S: Serializer>(
    obj: &Opaque<Obj<T>>,
    serializer: S,
) -> Result<S::Ok, S::Error> {
    // we promise we're only ever going to call this on a Ruby thread
    let ruby = unsafe { Ruby::get_unchecked() };
    ruby.get_inner(*obj).serialize(serializer)
}

#[magnus::init]
fn init(ruby: &Ruby) -> Result<(), Error> {
    let wire = ruby.define_class("WireRepresentation", ruby.class_object())?;
    wire.define_singleton_method("new", function!(WireRepresentation::new, 2))?;
    wire.define_method("to_json", method!(WireRepresentation::to_json_str, 0))?;

    let address = ruby.define_class("Address", ruby.class_object())?;
    address.define_singleton_method("new", function!(Address::new, 2))?;

    Ok(())
}

LukeMathWalker commented 1 year ago

Thanks a lot to you both. I have a path forward now and a much better understanding of what is going on ♥️