Closed stephenlf closed 9 months ago
Current workaround I'm using. I'll write this up in a pull request later.
/// A markdown::Options wrapper that can be safely passed between threads.
/// Note that this struct cannot wrap Options with MDX parser functions
/// specified. In other words, `options.parse.mdx_ems_parse` and
/// `options.parse.mdx_expression_parse` must both be `None`.
///
/// # Examples
///
/// ```rust
/// use docgen::state::ThreadsafeOptions;
///
/// // Equivalent to `let options = ThreadsafeOptions::new(markdown::Options::default());`
/// let options = ThreadsafeOptions::default();
/// let app_state = Arc::new(options);
/// // `app_state` may now be used in a multi-threaded web server.
/// ```
#[derive(Debug, Default)]
pub struct ThreadsafeOptions(Options);
unsafe impl Send for ThreadsafeOptions {}
unsafe impl Sync for ThreadsafeOptions {}
impl ThreadsafeOptions {
/// Create a new `ThreadsafeOptions` wrapper around the provided `Options`
/// object.
///
/// **PANICS**: This function will panic if you attempt to instantiate a
/// new object with an MDX parser specified. In other words,
/// `options.parse.mdx_ems_parse` and `options.parse.mdx_expression_parse`
/// must both be `None`.
pub fn new(options: Options) -> Self {
if options.parse.mdx_esm_parse.is_some() || options.parse.mdx_expression_parse.is_some() {
panic!("Cannot instantiate markdown::Options with MDX parsers specified")
}
Self(options)
}
}
impl From<Options> for ThreadsafeOptions {
fn from(value: Options) -> Self {
Self::new(value)
}
}
impl Borrow<Options> for ThreadsafeOptions {
fn borrow(&self) -> &Options {
&self.0
}
}
Welcome @stephenlf! 👋
The Options struct is not Sync.
This is an XY Problem.
This is an issue in web servers because it means we must reallocate an Options struct every time we parse something.
Memory is allocated all over the place to do parsing and create a syntax tree. Allocation feels like a non issue here.
This library aims to be 100% pure safe rust, sync
is unsafe
https://doc.rust-lang.org/nomicon/send-and-sync.html
Marking a core part of the safe API unsafe
is a non-starter here.
Reading between the lines a bit you seem to be trying to reuse options across multiple threads in Rust? Is that accurate? If so why is allocating Options an issue for you?
I am incorporating a Markdown parser into an Axum web server I am writing. Rather than reallocate an markdown::Options
object every time the parser endpoint is called, I had hoped to save on memory by instantiating a single, immutable markdown::Options
object, wrapping it in Arc
, and passing it in to the router's axum::State
. To do this, the markdown::Options
object would have to be Send
/Sync
, since application state may be passed between threads.
I understand not wanting any unsafe
blocks in the codebase. "100% safe Rust" is a remarkable claim. My workaround is sufficient for my use case. And it sounds like the amount of allocation going into an Options
struct is trivial compared to the work being done by the parser, so there's probably no need for me to worry about it. However, if you do want to create a thread-safe, "100% safe Rust" Options
implementation, I believe there are ways to do it without unsafe
code blocks.
I realize now it was presumptuous of me to say that I'd "set up a pull request later." I forgot this wasn't one of my projects! Let me know if you'd like me to explore this issue more.
Other than safe, this project is also no_std + alloc. It’s incredibly basic, Box and Vec but that’s it. Send/Sync/Arc etc is all much higher level. And indeed, the bottle neck here is parsing documents. I’d assume all energy is better spent there. I’d also guess that Rust could figure this out itself if there were no mdx functions? Not sure if a thread safe version should just omit those?
I’d also guess that Rust could figure this out itself if there were no mdx functions? Not sure if a thread safe version should just omit those?
That's the approach I took in my codebase. However, I suspect that annotating the MDX functions something like Sync + Send
would also solve the problem without limiting the MDX functionality too much. I haven't explored it very much, but something like this might work:
// markdown-rs/src/util/mdx.rs
// ...
pub type EsmParse = dyn Fn(&str) -> Signal + Sync + Send;
// ...
pub type ExpressionParse = dyn Fn(&str, &ExpressionKind) -> Signal + Sync + Send;
// ...
This would of course mean that users couldn't dynamically create their MDX parsers, though I can't see this being a practical issue for most use cases. This code compiles and passes all checks, though I'm not sure what other downstream effects something like this might have.
One can imagine a ThreadsafeOptions
struct that is the same as the Options
struct with these additional constraints.
without limiting the MDX functionality too much
I have some doubts (but unfounded). If it works with https://github.com/wooorm/mdxjs-rs I’d be open to it.
Anyway, closing this for now. The safe part of this project is very important. It also seems like there is a tiny fraction of time spent on this, we can probably spend our time better on parsing memory/performance. If all of that is addressed and it works with mdxjs-rs, I’d be open to a PR and rediscussing it!
Best! :)
The
Options
struct is notSync
. This is an issue in web servers because it means we must reallocate an Options struct every time we parse something.I will work on this problem later.
Full compiler diagnostic: