talaria-im / talaria

<insert buzzwords> instant messaging application
Mozilla Public License 2.0

Language interop #7

Open expectocode opened 3 years ago

expectocode commented 3 years ago

We've been discussing this a lot!

A few things we've been thinking about revolve around how the UI and the async network "backend" should speak to each other. One way would be to expose a lot of FFI methods that the (Kotlin) UI can call into Rust (still on the UI side, maybe in a short-lived thread), each of which would then push a message onto a queue for the backend to process.

Another way would be to have the UI side serialise the message in Kotlin and send the serialised message to one single Rust method (still on the UI side), which deserialises it and pushes a message to the backend as before.
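To make the second option a bit more concrete, here is a minimal sketch of what the single Rust entry point could do once the FFI wrapper has handed it the raw bytes. Everything in it is an assumption for illustration: the `UiMessage` variants, the one-byte-tag wire format, and the `decode`/`dispatch` names are made up, and a real version would probably use a proper serialisation library instead of hand-rolled decoding.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical messages the UI might send to the backend.
#[derive(Debug, PartialEq)]
pub enum UiMessage {
    SendText { chat_id: u32, text: String },
    OpenProfile { user_id: u32 },
}

/// Decode one message from a made-up wire format:
/// byte 0 is a tag, bytes 1..5 are a little-endian u32, the rest is UTF-8 text.
pub fn decode(bytes: &[u8]) -> Option<UiMessage> {
    let (&tag, rest) = bytes.split_first()?;
    if rest.len() < 4 {
        return None;
    }
    let (id_bytes, payload) = rest.split_at(4);
    let id = u32::from_le_bytes([id_bytes[0], id_bytes[1], id_bytes[2], id_bytes[3]]);
    match tag {
        0 => Some(UiMessage::SendText {
            chat_id: id,
            text: String::from_utf8(payload.to_vec()).ok()?,
        }),
        1 => Some(UiMessage::OpenProfile { user_id: id }),
        _ => None,
    }
}

/// The single entry point: decode the bytes and push the result onto the
/// backend queue. Returns false if decoding fails or the backend is gone.
pub fn dispatch(queue: &Sender<UiMessage>, bytes: &[u8]) -> bool {
    match decode(bytes) {
        Some(msg) => queue.send(msg).is_ok(),
        None => false,
    }
}
```

The trade-off compared to the first option is that the FFI surface stays at one function, while the set of messages grows on both sides of the serialisation format instead.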

Hand-drawn diagram, legibility not guaranteed. image

expectocode commented 3 years ago

@9ary @Lonami please feel free to weigh in, too

expectocode commented 3 years ago

We've also been discussing FFI for the message passing, or using sockets. I experimented with both and found a lot of annoyances with unix sockets:

expectocode commented 3 years ago

Also, sockets are really byte streams and that's not a super pleasant interface for just trying to pass messages in.
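For context on why byte streams are awkward here: to get discrete messages out of a socket, something like length-prefixed framing has to be layered on top. A minimal sketch, assuming a 4-byte big-endian length prefix (the `write_frame`/`read_frame` names and the framing choice are illustrative, not what the proof of concept does):

```rust
use std::io::{self, Read, Write};

/// Write one message as a 4-byte big-endian length followed by the payload.
pub fn write_frame<W: Write>(out: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "payload too large for a u32 length prefix",
        ));
    }
    out.write_all(&(payload.len() as u32).to_be_bytes())?;
    out.write_all(payload)
}

/// Read exactly one message written by `write_frame`.
/// Fails with an I/O error if the stream ends before a full frame arrives.
pub fn read_frame<R: Read>(input: &mut R) -> io::Result<Vec<u8>> {
    let mut len_bytes = [0u8; 4];
    input.read_exact(&mut len_bytes)?;
    let len = u32::from_be_bytes(len_bytes) as usize;
    let mut payload = vec![0u8; len];
    input.read_exact(&mut payload)?;
    Ok(payload)
}
```

A channel gives the same message boundaries for free, which is a large part of why it feels nicer.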

I enjoyed using channels a lot more, once I actually got them working (with lots of input from @Lonami). Here's a guided tour of my initial socket message passing proof of concept (available at https://github.com/expectocode/android-rust-experiments/commit/a2ff41e534a2d7ebab9fa564c844debad96bde33):

Android startup code (yes, I know we should use a Service, not an Activity): image

This calls the Rust listen() function in a dedicated thread. listen() looks like this: image

We can see evidence of this running in the logs: image

The next relevant thing that happens is the button press handler: image

This is the key part - UI trying to send a message to the async backend.

So what does sendMsg do? image

Yep, it ignores the argument because I haven't yet understood JNI enough to get a byte array from Java into Rust.

But we can see that we do send a message - "heyyoooo". And this message is received by the listen() loop!

image

You can actually tell by the pattern of numbers that it's the "heyyoooo" string, too. Success.

expectocode commented 3 years ago

Using OnceCell for this is also a bit questionable. @Lonami suggests maybe we actually want a Mutex<Option<...>>.
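A rough sketch of what the Mutex<Option<...>> variant could look like, using only the standard library. The names (`BACKEND_TX`, `start_backend`, `stop_backend`, `send_msg`) are hypothetical and this is not the code in the linked commit. One possible reason to prefer it, which is my assumption rather than something settled above, is that the slot can be cleared and refilled if the backend restarts, whereas a OnceCell can only be set once.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Global slot for the sending half; None until the backend has started.
static BACKEND_TX: Mutex<Option<Sender<Vec<u8>>>> = Mutex::new(None);

/// Called when the backend starts: installs a fresh channel and
/// returns the receiving half for the backend loop to consume.
pub fn start_backend() -> Receiver<Vec<u8>> {
    let (tx, rx) = channel();
    *BACKEND_TX.lock().unwrap() = Some(tx);
    rx
}

/// Called when the backend stops: clears the slot so later sends fail cleanly.
pub fn stop_backend() {
    *BACKEND_TX.lock().unwrap() = None;
}

/// What a sendMsg-style FFI function might call.
/// Returns false if no backend is currently running.
pub fn send_msg(bytes: Vec<u8>) -> bool {
    match BACKEND_TX.lock().unwrap().as_ref() {
        Some(tx) => tx.send(bytes).is_ok(),
        None => false,
    }
}
```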

expectocode commented 3 years ago

@expectocode While typing this i realised i should really be looking at the primitive types, array of byte

expectocode commented 3 years ago

@expectocode https://docs.rs/jni/0.19.0/jni/struct.JNIEnv.html#method.get_array_elements

Lonami commented 3 years ago

WRT ignoring arguments, isn't the input: JObject the string that you passed? Surely there should be a way to interpret it as a JString? I'm not sure if you need the environment for that.

expectocode commented 3 years ago

@Lonami The way to take input: JString as a Rust String would be `let input: String = env.get_string(input).expect("Couldn't get Java string!").into();`

But in any case, I think we should actually pass a byte array from the Kotlin side since we'll want to serialise arbitrary data into these messages

expectocode commented 3 years ago

Alright, I've now figured out how to actually use the Kotlin message contents in sendMsg. Resources that were helpful here were the jni-rs chat room (gitter/matrix), where I found a reference to a project using serialisation a lot: https://github.com/exonum/exonum-java-binding/blob/a6627aa00d10bc8175d68392e9156fbf99f48bfe/exonum-java-binding/core/rust/src/testkit/mod.rs#L79.

This led me to jbyteArray and env.convert_byte_array, which made the rest of the work pretty simple. It's up at https://github.com/expectocode/android-rust-experiments/commit/e272e29c9427b83874b1e4783c3d062aaa10b31c.

expectocode commented 3 years ago

Next up is communication going in the other direction. This could work using long polling for receiving: the UI continually makes a blocking recv() call in a short-lived thread. Sending should be pretty trivial, as Rust just needs to put something into a channel.
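A minimal sketch of the Rust half of that long poll, assuming events travel as byte vectors over a std channel. The `Poll` enum and `poll_event` are hypothetical names, and the timeout is an addition of mine so that a polling thread cannot block forever; a plain blocking recv() would work the same way minus the `TimedOut` case.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Outcome of one poll, in a shape that would be easy to map to a JNI return value.
#[derive(Debug, PartialEq)]
pub enum Poll {
    Event(Vec<u8>),
    TimedOut,
    BackendGone,
}

/// Block until the backend emits an event, the timeout elapses,
/// or every sender has been dropped.
pub fn poll_event(events: &Receiver<Vec<u8>>, timeout: Duration) -> Poll {
    match events.recv_timeout(timeout) {
        Ok(bytes) => Poll::Event(bytes),
        Err(RecvTimeoutError::Timeout) => Poll::TimedOut,
        Err(RecvTimeoutError::Disconnected) => Poll::BackendGone,
    }
}
```

The UI thread would call this in a loop, handle `Event`, poll again on `TimedOut`, and stop on `BackendGone`.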

expectocode commented 3 years ago

credit due, none of this is my design. i'm just implementing it

expectocode commented 3 years ago

We also still need to think about higher-level problems, like how UI makes backend calls that will return data needed for UI updates. example:

User taps on a chat participant to see their profile. This needs to change the UI to display the profile, and needs to make some backend calls to get profile information.

our solution here is going to have to be heavily informed by the way Jetpack Compose works.

9ary commented 3 years ago

@expectocode single activity navigation in jetpack, compose-specific information about the above

for showing a user profile you'd navigate() to the profile route, which pulls the profile composable into view, and attaches a viewmodel to it

9ary commented 3 years ago

see also https://developer.android.com/jetpack/compose/state on how compose and viewmodels tie together

9ary commented 3 years ago

I've been thinking about this long and hard, and after studying a few systems this is what I've come up with. We need an IDL to describe the boundary between both languages. Syntax should be fairly straightforward (it can take inspiration from other IDLs and also programming languages in general).

Type system, mostly stolen from Thrift

? means inclusion is TBD/not an initial goal

Traits and trait objects

This is the primary mechanism for languages to call into each other. Call it a "Foreign Object Interface" if you will. This is what really sets this idea apart from other RPC systems, where you define singleton services with "static" methods, and references to "remote" objects are not first class. Traits are basically Rust traits, aka interfaces in Kotlin/Java, though a bit simplified. A trait can define a number of methods, and a language-native struct/class can implement an IDL trait for the other side to own instances of it. Trait object life-cycle:

Other details:

Memory layout of encoded types

These rules apply to data that's passed over the pipe. Languages that can make direct use of these representations should go ahead and do that; otherwise conversion is required, which is not a big deal. Hopefully this at least avoids serializing/marshaling data only to deserialize it immediately: only one conversion should be necessary. This is designed with single-process interop or shared-memory IPC in mind, rather than sending data over the network. We also drop deterministic data layout to take advantage of the C ABI instead, because the intended use is for co-developed programs running on the same machine to interface together. Maintaining ABI stability is still feasible but out of scope for now, and we don't have to worry about differences between platforms.
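As a small illustration of "take advantage of the C ABI", here is a sketch of a struct with a C-compatible layout being read through a raw pointer, with no serialization step in between. The `MessageHeader` type, its fields, and `read_header` are hypothetical; nothing here is a proposed wire format.

```rust
/// Hypothetical message header. #[repr(C)] makes field order and padding
/// follow the platform's C rules instead of Rust's unspecified layout.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MessageHeader {
    pub kind: u8,
    pub chat_id: u32,
    pub timestamp: u64,
}

/// Read a header through a raw pointer, as the receiving side of an
/// FFI call or a shared-memory region might.
///
/// # Safety
/// `ptr` must be non-null, properly aligned, and point to an
/// initialized `MessageHeader`.
pub unsafe fn read_header(ptr: *const MessageHeader) -> MessageHeader {
    unsafe { *ptr }
}
```

A side that cannot address such memory directly would need the one conversion step mentioned above.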

Why roll our own solution?

Anything else?

I've probably forgotten something important, feel free to comment and criticize.

bb010g commented 3 years ago

[My responses are in quotes, to deal with GFM not supporting lifting quote levels while maintaining list levels.]

WIP response: not yet finished. Wanted to get this posted in its current state to start.

I've been thinking about this long and hard, and after studying a few systems this is what I've come up with. We need an IDL to describe the boundary between both languages. Syntax should be fairly straightforward (it can take inspiration from other IDLs and also programming languages in general).

I don't think this needs to be our own IDL (serializing to bytestrings), and will be comparing a few existing IDLs that serialize to bytestrings: Protocol Buffers (Protobufs) (proto3), Apache Thrift, External Data Representation (XDR) (RFC 4506), and Cap'n Proto.

Type system, mostly stolen from Thrift

? means inclusion is TBD/not an initial goal

Notation:

  • x y = xy. (juxtaposition)
  • {x, y} z = x z, y z; x {y, z} = x y, x z. (juxtaposition)
  • ¤(¤ ¤) = (λx. x x); ¤(¤2 ¤1) = (λx. λy. y x).

Traits and trait objects

This is the primary mechanism for languages to call into each other. Call it a "Foreign Object Interface" if you will. This is what really sets this idea apart from other RPC systems, where you define singleton services with "static" methods, and references to "remote" objects are not first class. Traits are basically Rust traits, aka interfaces in Kotlin/Java, though a bit simplified. A trait can define a number of methods, and a language-native struct/class can implement an IDL trait for the other side to own instances of it. Trait object life-cycle:

Other details:

Memory layout of encoded types

These rules apply to data that's passed over the pipe. Languages that can make direct use of these representations should go ahead and do that; otherwise conversion is required, which is not a big deal. Hopefully this at least avoids serializing/marshaling data only to deserialize it immediately: only one conversion should be necessary. This is designed with single-process interop or shared-memory IPC in mind, rather than sending data over the network. We also drop deterministic data layout to take advantage of the C ABI instead, because the intended use is for co-developed programs running on the same machine to interface together. Maintaining ABI stability is still feasible but out of scope for now, and we don't have to worry about differences between platforms.

Why roll our own solution?

| Feature | proto3 | Thrift | XDR | Cap'n Proto |
| --- | --- | --- | --- | --- |
| Zero-copy | No | Currently, no, but protocol-dependent | Yes | Yes |
| First-class object references (RPC) | N/A | No | N/A | Yes |
| First-class peer heap object references (RPC) | N/A | No | N/A | Currently, no, but possible |

Cap'n Proto RPC viably supporting inproc, shared memory vat networks was pretty unexpected. rpc.capnp defines the official RPC system.

Cap'n Proto RPC takes place between "vats". A vat hosts some set of objects and talks to other vats through direct bilateral connections. Typically, there is a 1:1 correspondence between vats and processes (in the unix sense of the word), although this is not strictly always true (one process could run multiple vats, or a distributed virtual vat might live across many processes).

An example VatNetwork interface sketch is given at the end of the file, and nothing in that relies upon traditional connection streams actually existing either. All that's necessary basically is a way to identify vats, ways to connect to & accept connections from vats, and then ways to send & receive messages on connections. Magically fast, zero-copy, shared memory message-passing is perfectly legal. Plus, I think we could get away with a decently simple implementation.
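To give a feel for how little is needed, here is a toy in-process version of those four ingredients (vat identity, connect, accept, send/receive) built on std channels. None of it comes from rpc.capnp or the capnp crates: `VatId`, `Connection`, and `InProcNetwork` are hypothetical names, and a real VatNetwork would carry Cap'n Proto messages rather than plain byte vectors.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical vat identifier.
pub type VatId = u32;

/// One end of a bidirectional, in-process connection.
pub struct Connection {
    tx: Sender<Vec<u8>>,
    rx: Receiver<Vec<u8>>,
}

impl Connection {
    /// Send a message to the peer; the Vec's heap buffer is moved, not copied.
    pub fn send(&self, message: Vec<u8>) -> bool {
        self.tx.send(message).is_ok()
    }

    /// Block until the peer sends a message; None once the peer is gone.
    pub fn receive(&self) -> Option<Vec<u8>> {
        self.rx.recv().ok()
    }
}

/// In-process "network": each listening vat has an inbox of new connections.
#[derive(Default)]
pub struct InProcNetwork {
    listeners: HashMap<VatId, Sender<Connection>>,
}

impl InProcNetwork {
    /// Register a vat and return the receiver it accepts connections from.
    pub fn listen(&mut self, id: VatId) -> Receiver<Connection> {
        let (tx, rx) = channel();
        self.listeners.insert(id, tx);
        rx
    }

    /// Connect to a listening vat; returns our end, or None if nobody listens.
    pub fn connect(&self, id: VatId) -> Option<Connection> {
        let listener = self.listeners.get(&id)?;
        let (a_tx, b_rx) = channel();
        let (b_tx, a_rx) = channel();
        listener.send(Connection { tx: b_tx, rx: b_rx }).ok()?;
        Some(Connection { tx: a_tx, rx: a_rx })
    }
}
```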

9ary commented 3 years ago

@bb010g has motivated me to look further into it, and it looks like capnproto pretty much does what we need! We may need to mess around with the Rust implementation a bit, and the Java implementation should work to start with (though it's missing any RPC support), and a Kotlin/Multiplatform port shouldn't be too difficult to implement.

9ary commented 3 years ago

Things I've been looking at: