Munksgaard / session-types

MIT License
550 stars 21 forks source link

Independent feedback while working on the same problem #60

Open WildCryptoFox opened 5 years ago

WildCryptoFox commented 5 years ago

Hello. I am currently experimenting with BSTs in Rust and hopefully soon MPSTs. I'll be publishing the code soon. The following thread covers different design decisions relating to the internal implementation without changing the interface for consumers. When I move on to MPST, I'll experiment with a DSL using macros.

  1. Use a union over all message variants and rely on the state machine to select the desired type to send/recv. Vs. Transmuting a Box-pointer on the crossbeam-channel.

Note: For safety, the union will can trivially be replaced with an enum. Currently rust's enum representations aren't quite as nice as they could be, but this should improve in the future - further reducing the cost of tagging each variant. Importantly we do not need unsafe to implement session types. The space inefficient enum tag won't be much more than the optimized choice-based tag, and it is already significantly cheaper than boxing all messages.

  1. Backend agnostic. I.e. A #[no_std] context can use a core::cell::Cell<Option<UnionOfMessages>> as the underlying channel without worrying about allocations.

  2. Choices can send a single message, and if followed with a send; the two messages may be bundled. Vs. Sending each boolean choice over the channel.

  3. Although Rust doesn't have linear types, it does have private constructors - which I conject in combination with affine types is sufficient to force the protocol to progress through at least one valid path. Replacing the unit in the E type parameter of a Chan with a public contextual type, with a private constructor instantiating it, the relevant function can be forced to return the closed channel to its caller.

Do we need anything stronger than requiring the program progress through at least one valid path?

Proof of concept:

mod private {
    pub struct Start(());
    pub struct End(());

    impl Start {
        pub fn finish(self) -> End {
            End(())
        }
    }

    pub fn require_completion(f: impl Fn(Start) -> End) {
        f(Start(()));
    }

    pub fn require_completion_mut(mut f: impl FnMut(Start) -> End) {
        f(Start(()));
    }
}
use self::private::{require_completion, require_completion_mut};

fn main() {
    // The closure is given a Start, but it cannot create another.
    // It is required to return an End, but it can only acquire one by consuming
    // the Start with a call to Start::finish.
    require_completion(|s| s.finish());

    // Good! These fail as expected.
    //require_completion(|_| private::Start(()).finish());
    //require_completion(|_| private::End(()));

    // Caveat: This misuse of channels is prevented by Fn, but not FnMut.
    // However, an actor defined without the visibility of its caller removes
    // this particular risk too.
    require_completion(|mut s| {
        require_completion_mut(|s2|
            core::mem::replace(&mut s, s2).finish()
        );
        s.finish()
    });
}
Munksgaard commented 5 years ago

Hi @james-darkfox, thanks for sharing your thoughts with us!

1. Use a union over all message variants and rely on the state machine to select the desired type to send/recv. Vs. Transmuting a Box-pointer on the crossbeam-channel.

That's an interesting idea, getting rid of unsafe is definitely desirable. However, I'm unsure how that differs from just having normal crossbeam channels with that union type as the type argument? @wenkokke has an alternative approach in their library that you might want to take a look at as well: https://github.com/wenkokke/rusty-variation. In short, she's sending a new transmitter/receiver every time she does something. I believe she also avoids using unsafe.

2. Backend agnostic. I.e. A `#[no_std]` context can use a `core::cell::Cell<Option<UnionOfMessages>>` as the underlying channel without worrying about allocations.

That's cool, and something we haven't considered at all. I wonder what it would take to make our library work in a #[no_std] environment.

3. Choices can send a single message, and if followed with a send; the two messages may be bundled. Vs. Sending each boolean choice over the channel.

Interesting thought! Perhaps we could do something similar with specialized trait implementations.

4. Although Rust doesn't have linear types, it does have private constructors - which I conject in combination with affine types is sufficient to force the protocol to progress through at least one valid path. Replacing the unit in the `E` type parameter of a `Chan` with a public contextual type, with a private constructor instantiating it, the relevant function can be forced to return the closed channel to its caller.

This is something that we spent a long time thinking about. When we were originally implementing this library, I believe that was one of the things we tried out, but we couldn't quite reach a good compromise between safety and usability. Lots have changed in Rust in the meantime though, so perhaps it's time to revisit such an approach. I remember some of the difficulties were: sending channels over channels, intermixing channels, and just general user inconvenience. @laumann perhaps your memory is better than mine on this point?

WildCryptoFox commented 5 years ago

Per (2) using the union with crossbeam's channels is an option. The important detail there is that we can use an enum without questioning safety and that Rust's enum representations are ripe for reducing its costs; and that we don't need to use boxes.

Here is a dump of my current code.

The Atm example resolves to use Cons<&'static str, Cons<usize, ()>>, flattening out the tree and removing duplicates to enable the type checker to infer its index in the union-set. This version of the code uses union, not enum; but it is trivial to swap out to remove the unsafety. Flattening and reducing the variants would keep the enum discriminant at its minimal global state.

I wouldn't use the union variant in real code as even with a large protocol the number of unique types won't be too great and the enum will likely be padded anyway.

I'm not yet sending the choices as numbers, but they are prepared at the type level in the form S<S<S<Z>>> for 3, and S<SS<Z>> for (0,1). This representation favors choices to-the-right instead of trees. If trees are hot (I doubt) then you may use L<N> / R<N> to represent the route instead.

Before implementing the union-set I had actually used the left/right routing to index the types directly. This yields less code but its non-unique set is not optimal in the enum form.

No specialization needed; although the unique-set does require #![feature(untagged_unions, optin_builtin_traits)]. No unstable features were necessary for the left/right routing.

If you're dealing with multiple channels, you'll probably want to use MPSTs instead of BSTs.

After playing with the trait approach for a couple of days, I've decided to leave that code as-is and try another approach. Using procedural macros to both specify MPSTs and to implement actors. I'll keep you posted. :-)

WildCryptoFox commented 5 years ago

Heh. After adding the reference to the rust issue for enum representations. I see someone else has done similar work (and probably know the names of these types better). If you go this route, consider frunk's Coproducts. AFAICT there isn't a tree -> unique cons conversion there but that'd be easy to add (at the cost of the unstable feature optin_builtin_traits .. or possibly without but with noisier implementations)

WildCryptoFox commented 5 years ago

Feedback ext.

  1. Promote newtypes to avoid accidentally using one BST as another with the same topology but distinct semantics.

  2. Use type Dual<T> = <T as HasDual<T>>::Dual; to lower the associated type noise; using as Dual<MyProto>.