matsadler / magnus

Ruby bindings for Rust. Write Ruby extension gems in Rust, or call Ruby from Rust.
https://docs.rs/magnus/latest/magnus/
MIT License

Add value (de)serialization #47

Closed: georgeclaghorn closed this 1 year ago

georgeclaghorn commented 1 year ago

napi is a library that provides Rust-Node.js bindings. It provides the function to_js_value, enabled by a feature flag, which uses Serde to “serialize” any Rust type into a JS value. A complementary function, from_js_value, deserializes properly-formatted JS values back into Rust types. We can do the same thing for Ruby.

Here’s why I’m interested in this feature. I’m working on Classmate, a library of Ruby bindings to the Lightning CSS Rust crate. On Node.js, Lightning CSS offers a powerful custom transforms feature. It traverses a parse tree for a stylesheet and calls user-provided functions for each node. The provided functions can return replacement nodes. Under the hood, the syntax nodes are converted from Rust structs to JS objects using napi to_js_value. The returned substitute nodes are converted from JS objects back to Rust structs using from_js_value.

I would like to implement an equivalent feature in Classmate. I’d very much prefer not to have to create a Typed Data wrapper or build my own Hash for every type of syntax node. There are quite a lot, because CSS is a complex language, and they change frequently as Lightning CSS gains support for new CSS features.

This PR introduces a new Magnus feature flag named serde. When the serde feature is enabled, two new functions are exposed: magnus::value::serialize and magnus::value::deserialize. serialize takes any Rust type that implements serde::Serialize and converts it into a Ruby approximation. deserialize turns a Ruby value back into a Rust type that implements serde::Deserialize. See the included docs for serialize and deserialize to understand how they translate between Rust and Ruby.

georgeclaghorn commented 1 year ago

Thanks @matsadler! I started on this over at georgeclaghorn/serde-magnus, incorporating your feedback.

However, I’m fleshing out the tests (georgeclaghorn/serde-magnus@d0bc6ba), and I’m getting some sporadic failures that suggest Serde is moving Values to the heap. I haven’t looked more closely yet, but I do see some suspicious Box and Vec usage in Serde (example), so this experiment might be over.

matsadler commented 1 year ago

The errors you're seeing might be down to Rust's test runner. It runs tests in parallel in threads, which Ruby really doesn't like. It might mostly work if you put the cleanup object in a mutex, so the tests run one at a time, but the rules Ruby lays out for calling init (once, and in a stack frame above whatever is calling Ruby) are basically impossible to follow in the Rust unit tests. I'm not sure how much Ruby really cares about you following those rules.
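A rough sketch of the mutex idea, using only the standard library. `Cleanup` here is a hypothetical stand-in for the cleanup object returned by init, and `lock_vm`, `init_calls`, and `peak_concurrency` are made-up helper names; nothing in it calls Ruby or Magnus. It only shows that tests taking the lock run one at a time and that the stand-in init happens once. It does not address the stack-frame rule.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, MutexGuard};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the cleanup object returned by Ruby's init.
pub struct Cleanup;

// One shared slot for the whole test binary.
static VM: Mutex<Option<Cleanup>> = Mutex::new(None);
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Take the lock, running the (stand-in) init on first use only.
pub fn lock_vm() -> MutexGuard<'static, Option<Cleanup>> {
    // A panicking test poisons the mutex; recover so later tests still run.
    let mut guard = VM.lock().unwrap_or_else(|e| e.into_inner());
    if guard.is_none() {
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
        *guard = Some(Cleanup);
    }
    guard
}

/// How many times the stand-in init has run so far.
pub fn init_calls() -> usize {
    INIT_CALLS.load(Ordering::SeqCst)
}

/// Run `n` fake tests on separate threads and return the largest number
/// that were ever inside the locked section at the same time.
pub fn peak_concurrency(n: usize) -> usize {
    static ACTIVE: AtomicUsize = AtomicUsize::new(0);
    static PEAK: AtomicUsize = AtomicUsize::new(0);
    let handles: Vec<_> = (0..n)
        .map(|_| {
            thread::spawn(|| {
                let _vm = lock_vm(); // held until the end of this closure
                let now = ACTIVE.fetch_add(1, Ordering::SeqCst) + 1;
                PEAK.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(5)); // pretend to use Ruby
                ACTIVE.fetch_sub(1, Ordering::SeqCst);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    PEAK.load(Ordering::SeqCst)
}
```

In a real test suite each `#[test]` function would call something like `lock_vm()` as its first line and hold the guard for the whole test. Note that the tests still run on different threads, just not at the same time, which is part of why this only "might mostly work".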

In Magnus I've worked around this using Rust's 'integration' style tests, with one test method per file. Each file will be run as a separate process, so if you only have one test per file, then Ruby is happy.

Magnus also makes extensive use of doctests, as each doctest is compiled and run separately.

I think the nextest runner might run each test as a separate process, so that might be worth a look too (I've not seriously investigated it for Magnus, as it didn't support doctests when I last looked).


It might be possible to work around Serde putting stuff on the heap.

You could do something like create a new Ruby Array at the start of the serialize function, and make sure this is passed through to all the serialisation functions, then every Ruby object you create you also add to the array. That way even if Serde moves that object to the heap there's still a route through the array for the GC to find it. Then at the end of the serialize function hopefully all the Boxes and Vecs should have gone away, the result is a full tree of Ruby objects, and you just discard the array.
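To make the idea concrete, here is a toy model in plain Rust. It is not Magnus code and not how Ruby's GC is implemented: `Heap`, `Handle`, `Object`, and `serialize` are all invented for illustration, and the "GC" is a tiny mark-and-sweep that only sees the roots it is handed. The point it demonstrates is that objects whose handles live only in a Rust `Vec` survive a collection as long as they were also pushed onto a guard array the GC can reach.

```rust
use std::collections::HashSet;

/// Index of an object in the toy heap (stands in for a Ruby VALUE).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Handle(usize);

#[allow(dead_code)]
pub enum Object {
    Str(String),
    Array(Vec<Handle>),
}

/// Toy heap: a freed slot becomes None and is never reused.
pub struct Heap {
    slots: Vec<Option<Object>>,
}

impl Heap {
    pub fn new() -> Self {
        Heap { slots: Vec::new() }
    }

    pub fn alloc(&mut self, obj: Object) -> Handle {
        self.slots.push(Some(obj));
        Handle(self.slots.len() - 1)
    }

    pub fn push(&mut self, array: Handle, item: Handle) {
        if let Some(Object::Array(items)) = &mut self.slots[array.0] {
            items.push(item);
        }
    }

    pub fn is_live(&self, h: Handle) -> bool {
        self.slots[h.0].is_some()
    }

    /// Mark everything reachable from `roots`, then free the rest.
    pub fn collect(&mut self, roots: &[Handle]) {
        let mut marked = HashSet::new();
        let mut pending: Vec<Handle> = roots.to_vec();
        while let Some(h) = pending.pop() {
            if !marked.insert(h) {
                continue; // already visited
            }
            if let Some(Object::Array(children)) = &self.slots[h.0] {
                pending.extend(children.iter().copied());
            }
        }
        for (i, slot) in self.slots.iter_mut().enumerate() {
            if !marked.contains(&Handle(i)) {
                *slot = None;
            }
        }
    }
}

/// Build a toy array of strings, with a GC pass in the middle.
/// Returns None if any child was collected before the result was built.
pub fn serialize(heap: &mut Heap, items: &[&str], track: bool) -> Option<Handle> {
    // The guard array is the only thing the GC is told about.
    let guard = heap.alloc(Object::Array(Vec::new()));
    // Handles parked in a Rust Vec: invisible to the toy GC.
    let mut parked: Vec<Handle> = Vec::new();
    for item in items {
        let h = heap.alloc(Object::Str(item.to_string()));
        if track {
            heap.push(guard, h); // keep a route for the GC to find it
        }
        parked.push(h);
    }
    heap.collect(&[guard]);
    if parked.iter().any(|h| !heap.is_live(*h)) {
        return None;
    }
    // The finished tree now owns the children; the guard can be discarded.
    Some(heap.alloc(Object::Array(parked)))
}
```

With `track` set to `true` the children survive the collection and a result is returned; with `false` they are swept and `serialize` returns `None`. In a real serializer the guard would be a Ruby Array threaded through every serialisation function, as described above.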

georgeclaghorn commented 1 year ago

Switching to integration tests resolved my issues.

Regarding Serde possibly moving values to the heap: I think this may not be an issue in practice, because each value Serde might temporarily Box ends up back on the stack before we call into Ruby again. Let me know if I’m wrong about that. Tracking intermediate values in a Ruby Array is a great idea if it’s necessary.

I published v0.1.0 of serde_magnus. Thanks again for your help.