Open expectocode opened 3 years ago
@9ary @Lonami please feel free to weigh in, too
We've also been discussing FFI for the message passing, or using sockets. I experimented with both and found a lot of annoyances with unix sockets:
nix
(unless there's a way to use nix's `UnixAddr` with stdlib sockets)

Also, sockets are really byte streams, and that's not a super pleasant interface for just trying to pass messages.
I enjoyed using channels a lot more, once I actually got them working (with lots of input from @Lonami). Here's a guided tour of my initial socket message passing proof of concept (available at https://github.com/expectocode/android-rust-experiments/commit/a2ff41e534a2d7ebab9fa564c844debad96bde33):
Android startup code (yes, I know we should use a Service, not an Activity).

This calls the Rust `listen()` function in a dedicated thread. `listen()` looks like this:
We can see evidence of this running in the logs:
The next relevant thing that happens is the button press handler:
This is the key part - UI trying to send a message to the async backend.
So what does sendMsg do? Yep, it ignores the argument because I haven't yet understood JNI enough to get a byte array from java into rust.
But we can see that we do send a message - "heyyoooo". And this message is received by the listen() loop!
You can actually tell by the pattern of numbers that it's the "heyyoooo" string, too. Success.
Using OnceCell for this is also a bit questionable. lonami suggests maybe we actually want mutex<option
@expectocode While typing this i realised i should really be looking at the primitive types, array of byte
WRT ignoring arguments, isn't the `input: JObject` the string that you passed? Surely there should be a way to interpret it as a `JString`? I'm not sure if you need the environment for that.
@Lonami The way to take `input: JString` as a Rust `String` would be `let input: String = env.get_string(input).expect("Couldn't get Java string!").into();`
But in any case, I think we should actually pass a byte array from the Kotlin side since we'll want to serialise arbitrary data into these messages
Alright, I've now figured out how to actually use the Kotlin message contents in sendMsg. Resources that were helpful here were the jni-rs chat room (gitter/matrix), where I found a reference to a project using serialisation a lot: https://github.com/exonum/exonum-java-binding/blob/a6627aa00d10bc8175d68392e9156fbf99f48bfe/exonum-java-binding/core/rust/src/testkit/mod.rs#L79.
This led me to jbyteArray and env.convert_byte_array, which made the rest of the work pretty simple. It's up at https://github.com/expectocode/android-rust-experiments/commit/e272e29c9427b83874b1e4783c3d062aaa10b31c.
Next up is communication going in the other direction. This could work using long polling for receiving - UI continually calls blocking recv() call in a short-lived thread. Sending should be pretty trivial as Rust just needs to put something into a channel.
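The long-polling idea can be sketched with nothing but the standard library. This is a minimal sketch, not the code from the linked commits: `UiMailbox`, `recv_blocking`, and `poll_once` are hypothetical names, and the JNI glue that would expose `recv_blocking` to Kotlin is left out.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

/// Backend-to-UI mailbox (hypothetical). `recv_blocking` stands in for the
/// function the UI would call over FFI from a short-lived thread.
struct UiMailbox {
    rx: Mutex<Receiver<Vec<u8>>>,
}

impl UiMailbox {
    fn new() -> (Sender<Vec<u8>>, Arc<UiMailbox>) {
        let (tx, rx) = mpsc::channel();
        (tx, Arc::new(UiMailbox { rx: Mutex::new(rx) }))
    }

    /// Blocks until the backend sends something. Returns `None` once every
    /// sender has been dropped.
    fn recv_blocking(&self) -> Option<Vec<u8>> {
        self.rx.lock().unwrap().recv().ok()
    }
}

/// One "long poll": a short-lived thread that blocks on `recv_blocking`.
fn poll_once(mailbox: Arc<UiMailbox>) -> Option<Vec<u8>> {
    thread::spawn(move || mailbox.recv_blocking()).join().unwrap()
}

fn main() {
    let (backend_tx, mailbox) = UiMailbox::new();

    // Sending from the Rust side is just a channel push.
    backend_tx.send(b"heyyoooo".to_vec()).unwrap();

    let msg = poll_once(Arc::clone(&mailbox));
    println!("{:?}", msg.map(|m| String::from_utf8_lossy(&m).into_owned()));

    // With the backend gone, the next poll returns None instead of hanging.
    drop(backend_tx);
    assert!(poll_once(mailbox).is_none());
}
```

On the real UI side, each poll would presumably be re-issued as soon as the previous one returns.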
credit due, none of this is my design. i'm just implementing it
We also still need to think about higher-level problems, like how UI makes backend calls that will return data needed for UI updates. example:
User taps on a chat participant to see their profile. This needs to change the UI to display the profile, and needs to make some backend calls to get profile information.
our solution here is going to have to be heavily informed by the way Jetpack Compose works.
@expectocode single activity navigation in jetpack, compose-specific information about the above
for showing a user profile you'd `navigate()` to the profile route, which pulls the profile composable into view, and attaches a viewmodel to it
see also https://developer.android.com/jetpack/compose/state on how compose and viewmodels tie together
I've been thinking about this long and hard, and after studying a few systems this is what I've come up with. We need an IDL to describe the boundary between both languages. Syntax should be fairly straightforward (it can take inspiration from other IDLs and also programming languages in general).
? means inclusion is TBD/not an initial goal
This is the primary mechanism for languages to call into each other. Call it a "Foreign Object Interface" if you will. This is what really sets this idea apart from other RPC systems, where you define singleton services with "static" methods, and references to "remote" objects are not first class. Traits are basically Rust traits, aka interfaces in Kotlin/Java, though a bit simplified. A trait can define a number of methods, and a language-native struct/class can implement an IDL trait for the other side to own instances of it. Trait object life-cycle:
Other details:
These rules apply for data that's passed over the pipe. Languages that can make direct use of these representations should go ahead and do that, otherwise conversion is required, not a big deal. Hopefully this should at least avoid serializing/marshaling data only to deserialize it immediately, only one conversion should be necessary. This is designed with single process interop or shared memory IPC in mind, rather than sending data over the network. Also we drop deterministic data layout to take advantage of the C ABI instead, because the intended use is for co-developed programs running on the same machine to interface together. Maintaining ABI stability is still feasible but out of scope for now, and we don't have to worry about differences between platforms.
I've probably forgotten something important, feel free to comment and criticize.
[My responses are in quotes, to deal with GFM not supporting lifting quote levels while maintaining list levels.]
WIP response: not yet finished. Wanted to get this posted in its current state to start.
I've been thinking about this long and hard, and after studying a few systems this is what I've come up with. We need an IDL to describe the boundary between both languages. Syntax should be fairly straightforward (it can take inspiration from other IDLs and also programming languages in general).
I don't think this needs to be our own IDL (serializing to bytestrings), and will be comparing a few existing IDLs that serialize to bytestrings: Protocol Buffers (Protobufs) (proto3), Apache Thrift, External Data Representation (XDR) (RFC 4506), and Cap'n Proto.
? means inclusion is TBD/not an initial goal
Notation:

- `x` `y` = `xy`. (juxtaposition)
- `{x, y} z` = `x z, y z`; `x {y, z}` = `x y, x z`. (juxtaposition)
- `¤(¤ ¤)` = `(λx. x x)`; `¤(¤2 ¤1)` = `(λx. λy. y x)`.
Primitive types:
| Rust | proto3 | Thrift | XDR | Cap'n Proto |
| --- | --- | --- | --- | --- |
| `bool` | `bool` | `bool` | `bool` | `Bool` |
| `i{8,16,32,64}` | (Void, Void, `¤(int¤ + sint¤ + sfixed¤){32,64}`) | `i{8,16,32,64}` | (Void, Void, `integer`, `hyper integer`) | `Int{8,16,32,64}` |
| `u{8,16,32,64}` | (Void, Void, `¤(uint¤ + fixed¤){32,64}`) | (`byte`, Void, Void, Void) | (Void, Void, `unsigned {, hyper} integer`) | `UInt{8,16,32,64}` |
| `f{32,64,128}` | (`float`, `double`, Void) | (Void, `double`, Void) | (`float`, `double`, `quadruple`) | (`Float{32,64}`, Void) |
Containers:
| Rust | proto3 | Thrift | XDR | Cap'n Proto |
| --- | --- | --- | --- | --- |
| `String` | `string` | `string` | `string identifier<m>` (\*) (\*\*) | `Text` |
| (`Vec<u8>`, `Vec<T>`) | (`bytes`, `repeated T`) | (`binary` + `list<byte>`, `list<T>`) | (`opaque identifier<m>`, `type-name identifier<m>`) (\*) | (`Data` + `List(UInt8)`, `List(T)`) |
| (`[u8; N]`, `[T; N]`) | (Void, Void) | (Void, Void) | (`opaque identifier[N]`, `T identifier[N]`) | (Void, Void) (?) |
| `HashMap<K, V>` | `map<K, V>` (\*\*\*) | `map<K, V>` | Void | (Void, Void) (\*\*\*\*) (?) |
| `HashSet<T>` | Void | `set<T>` | Void | |

Note that `Set(k)` is isomorphic to `Map(k, Bool)`. Note that `Set(k)` is isomorphic to `List(k)` where each `k` is unique and list item order is forgotten. Note that `Map(k, v)` is isomorphic to `Set((k, v))` where each tuple is unique by `k`.

(\*): `m` is an optional maximum byte count, defaulting to `(2**32) - 1`.

(\*\*): Officially, an ASCII string, but `string identifier<m>` is encoded exactly the same as `opaque identifier<m>`, permitting non-compliance via UTF-8.

(\*\*\*): `map<K, V> my_map = N;` is equivalent to the following, where each entry is unique by `key` and entry repetition order is unstable: `message MyMapEntry { K key = 1; V value = 2; } repeated MyMapEntry my_map = N;`

(\*\*\*\*): No `Map(K, V)` is provided by default, but the documentation recommends the generic declaration: `struct Map(Key, Value) { entries @0 :List(Entry); struct Entry { key @0 :Key; value @1 :Value; } }`
Composite types:
Rust:

```rust
struct Foo {
    bar: bool,
    baz: i32,
}
```

proto3:

```proto
message Foo {
  bool bar = 1;
  int32 baz = 2;
}
```

Thrift:

```thrift
struct Foo {
  1: bool bar,
  2: i32 baz,
}
```

XDR:

```xdr
struct {
    bool bar;
    integer baz;
} foo;
```

Cap'n Proto:

```capnp
struct Foo {
  bar @0 :Bool;
  baz @1 :Int32;
}
```
- structs
- optional fields (Option
- default values?
- exceptions? (special structs for error handling)
- enums
- tagged unions? (can be mapped to both Rust and Kotlin constructs, but should probably be avoided)

Others:

- constants
- traits and trait objects (see below)
This is the primary mechanism for languages to call into each other. Call it a "Foreign Object Interface" if you will. This is what really sets this idea apart from other RPC systems, where you define singleton services with "static" methods, and references to "remote" objects are not first class. Traits are basically Rust traits, aka interfaces in Kotlin/Java, though a bit simplified. A trait can define a number of methods, and a language-native struct/class can implement an IDL trait for the other side to own instances of it. Trait object life-cycle:
1. methods can return trait objects
2. the runtime assigns the object an ID, and stashes it in a hashmap
3. the ID is passed to the other side of the channel
4. a wrapper object is created
5. now the other side can call methods on the wrapper and the runtime will transparently forward those calls to the real object
6. the caller language is responsible for requesting object destruction when it's done
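The life-cycle above could look roughly like this on the owning side. It is only a sketch under assumptions: `Profile`, `RustProfile`, and `ObjectTable` are hypothetical names, the channel itself is omitted, and a real runtime would hold objects of many different IDL traits rather than just one.

```rust
use std::collections::HashMap;

/// Stand-in for a trait declared in the IDL (hypothetical).
trait Profile {
    fn display_name(&self) -> String;
}

struct RustProfile {
    name: String,
}

impl Profile for RustProfile {
    fn display_name(&self) -> String {
        self.name.clone()
    }
}

/// Runtime-side table of live trait objects, keyed by the ID that crosses
/// the channel.
#[derive(Default)]
struct ObjectTable {
    next_id: u64,
    objects: HashMap<u64, Box<dyn Profile>>,
}

impl ObjectTable {
    /// Stash the object and return the ID to pass to the other side.
    fn register(&mut self, obj: Box<dyn Profile>) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.objects.insert(id, obj);
        id
    }

    /// A call forwarded from the remote wrapper, resolved by ID.
    fn call_display_name(&self, id: u64) -> Option<String> {
        self.objects.get(&id).map(|obj| obj.display_name())
    }

    /// The other side requested destruction; returns false for unknown IDs.
    fn destroy(&mut self, id: u64) -> bool {
        self.objects.remove(&id).is_some()
    }
}

fn main() {
    let mut table = ObjectTable::default();
    let id = table.register(Box::new(RustProfile { name: "Lonami".to_string() }));
    println!("{:?}", table.call_display_name(id));

    assert!(table.destroy(id));
    // After destruction the ID no longer resolves.
    assert_eq!(table.call_display_name(id), None);
}
```

The wrapper on the other side would then only need to hold the `u64` and ask for `destroy` when it is dropped or finalized.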
Other details:
- Chicken and egg problem, how to obtain an object to call methods on at startup?
  - static methods could allow constructors to be called without any pre-existing instance
  - how are we even supposed to have static methods on a trait and not an implementor?
- Method calls should be represented just like structs internally, but that should be hidden as an implementation detail
- Method calls should be async, implemented with coroutines on both sides of the channel
  - assign a serial number to the call
  - put it in the pipe, await result
  - other side picks it up, calls async implementation of the method
  - return value is put into the pipe with the serial number
  - the promise/future is fulfilled, caller gets the return value
- Everything is pass by value, use move semantics where possible to avoid unnecessary copies
- Needless to say we're aiming for memory and thread safety, hopefully this design makes that easy to implement
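The serial-number bookkeeping for method calls can be sketched as follows. To stay self-contained this uses plain threads and `std::sync::mpsc` channels where the proposal says coroutines, so a `Receiver` plays the role of the promise/future. `Request`, `Response`, `Caller`, and `start` are hypothetical names, and the "method" is just upper-casing a string.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

/// What travels through the pipe: a serial number plus an opaque payload.
struct Request {
    serial: u64,
    payload: String,
}

struct Response {
    serial: u64,
    payload: String,
}

/// Calls that are still waiting for their return value, keyed by serial.
type Pending = Arc<Mutex<HashMap<u64, Sender<String>>>>;

struct Caller {
    next_serial: u64,
    pending: Pending,
    pipe: Sender<Request>,
}

impl Caller {
    /// Assign a serial, put the call in the pipe, and return a handle that
    /// yields the result once it arrives.
    fn call(&mut self, payload: &str) -> Receiver<String> {
        let serial = self.next_serial;
        self.next_serial += 1;
        let (done_tx, done_rx) = mpsc::channel();
        // Register before sending so the response can never race the insert.
        self.pending.lock().unwrap().insert(serial, done_tx);
        self.pipe
            .send(Request { serial, payload: payload.to_string() })
            .unwrap();
        done_rx
    }
}

fn start() -> Caller {
    let (req_tx, req_rx) = mpsc::channel::<Request>();
    let (resp_tx, resp_rx) = mpsc::channel::<Response>();
    let pending: Pending = Arc::default();

    // Other side: picks requests up and runs the method implementation.
    thread::spawn(move || {
        for req in req_rx {
            let result = req.payload.to_uppercase();
            if resp_tx.send(Response { serial: req.serial, payload: result }).is_err() {
                break;
            }
        }
    });

    // Dispatcher: matches each response to its waiting call by serial.
    let table = Arc::clone(&pending);
    thread::spawn(move || {
        for resp in resp_rx {
            if let Some(waiter) = table.lock().unwrap().remove(&resp.serial) {
                let _ = waiter.send(resp.payload);
            }
        }
    });

    Caller { next_serial: 0, pending, pipe: req_tx }
}

fn main() {
    let mut caller = start();
    let first = caller.call("get_profile");
    let second = caller.call("get_chats");
    // Results can be awaited in any order; the serial keeps them matched.
    println!("{}", second.recv().unwrap());
    println!("{}", first.recv().unwrap());
}
```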
These rules apply for data that's passed over the pipe. Languages that can make direct use of these representations should go ahead and do that, otherwise conversion is required, not a big deal. Hopefully this should at least avoid serializing/marshaling data only to deserialize it immediately, only one conversion should be necessary. This is designed with single process interop or shared memory IPC in mind, rather than sending data over the network. Also we drop deterministic data layout to take advantage of the C ABI instead, because the intended use is for co-developed programs running on the same machine to interface together. Maintaining ABI stability is still feasible but out of scope for now, and we don't have to worry about differences between platforms.
- strings should be UTF-8
- strings and containers:
  - in the pipe, pointers to/IDs of runtime objects
  - the implementations should see native objects
- structs should use the C layout
- enums should be encoded as a 32 bit integer, explicit values are allowed, automatic assignment when none is set
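On the Rust side, the struct and enum rules map onto `#[repr(C)]` and a 32-bit `repr`. A small sketch, assuming a signed 32-bit encoding (the proposal does not say signed or unsigned); `Foo`, `Status`, and `status_from_wire` are made-up names for illustration.

```rust
use std::mem::size_of;

/// Struct laid out according to the C ABI.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Foo {
    bar: bool,
    baz: i32,
}

/// Enum encoded as a 32-bit integer. Explicit values are allowed, and a
/// variant without one gets the previous value plus one.
#[repr(i32)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Offline = 0,
    Online = 10,
    Away, // automatically assigned 11
}

/// Decode a raw value from the pipe, rejecting unknown discriminants
/// instead of transmuting blindly.
fn status_from_wire(raw: i32) -> Option<Status> {
    [Status::Offline, Status::Online, Status::Away]
        .iter()
        .copied()
        .find(|s| *s as i32 == raw)
}

fn main() {
    // 1 byte for `bar`, 3 bytes of padding, 4 bytes for `baz`.
    println!("size_of::<Foo>() = {}", size_of::<Foo>());
    println!("size_of::<Status>() = {}", size_of::<Status>());
    println!("Away = {}", Status::Away as i32);
    println!("{:?}", status_from_wire(10));
}
```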
No existing solution for FFI bindings handles Rust/Java very well, Rust/Kotlin pretty much unheard of
Kotlin/JVM and Kotlin/Native have very different FFI systems (JNI vs bespoke traditional FFI design)
Existing RPC systems are not designed for language interop in a single process (at least not specifically), they include features that are completely superfluous for us (like schema/encoding stability) and neglect some of our needs (zero-copy data movement, first class references to objects on the other side of the interface)
| Feature | proto3 | Thrift | XDR | Cap'n Proto |
| --- | --- | --- | --- | --- |
| Zero-copy | No | Currently, no, but protocol-dependent | Yes | Yes |
| First-class object references (RPC) | N/A | No | N/A | Yes |
| First-class peer heap object references (RPC) | N/A | No | N/A | Currently, no, but possible |

Cap'n Proto RPC viably supporting inproc, shared memory vat networks was pretty unexpected.

`rpc.capnp` defines the official RPC system.

Cap'n Proto RPC takes place between "vats". A vat hosts some set of objects and talks to other vats through direct bilateral connections. Typically, there is a 1:1 correspondence between vats and processes (in the unix sense of the word), although this is not strictly always true (one process could run multiple vats, or a distributed virtual vat might live across many processes).
An example `VatNetwork` interface sketch is given at the end of the file, and nothing in that relies upon traditional connection streams actually existing either. All that's necessary basically is a way to identify vats, ways to connect to & accept connections from vats, and then ways to send & receive messages on connections. Magically fast, zero-copy, shared memory message-passing is perfectly legal. Plus, I think we could get away with a decently simple implementation.
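To make those three requirements concrete, here is a rough in-process sketch built on standard-library channels. It is not the `VatNetwork` interface from `rpc.capnp` and not the capnproto-rust API: `VatId`, `Connection`, and `InProcNetwork` are invented for illustration, and messages are plain byte vectors whose buffers are moved, not copied, between ends.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Receiver, Sender};

type VatId = u32;
type Message = Vec<u8>;

/// One end of a bilateral connection between two vats.
struct Connection {
    tx: Sender<Message>,
    rx: Receiver<Message>,
}

impl Connection {
    /// Moves the buffer to the peer; returns false if the peer is gone.
    fn send(&self, msg: Message) -> bool {
        self.tx.send(msg).is_ok()
    }

    /// Blocks for the next message; returns None if the peer is gone.
    fn receive(&self) -> Option<Message> {
        self.rx.recv().ok()
    }
}

/// In-process "network": vats register to accept, peers connect by ID.
#[derive(Default)]
struct InProcNetwork {
    listeners: HashMap<VatId, Sender<Connection>>,
}

impl InProcNetwork {
    /// Register a vat. The returned receiver yields incoming connections.
    fn listen(&mut self, id: VatId) -> Receiver<Connection> {
        let (tx, rx) = mpsc::channel();
        self.listeners.insert(id, tx);
        rx
    }

    /// Connect to a vat by ID, handing the peer end to its accept queue.
    fn connect(&self, id: VatId) -> Option<Connection> {
        let listener = self.listeners.get(&id)?;
        let (a_tx, b_rx) = mpsc::channel();
        let (b_tx, a_rx) = mpsc::channel();
        listener.send(Connection { tx: b_tx, rx: b_rx }).ok()?;
        Some(Connection { tx: a_tx, rx: a_rx })
    }
}

fn main() {
    let mut net = InProcNetwork::default();
    let accept = net.listen(1);

    let client = net.connect(1).expect("vat 1 is listening");
    let server = accept.recv().expect("one pending connection");

    assert!(client.send(b"hello".to_vec()));
    println!("{:?}", server.receive());
    assert!(server.send(b"world".to_vec()));
    println!("{:?}", client.receive());
}
```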
@bb010g has motivated me to look further into it, and it looks like capnproto pretty much does what we need! We may need to mess around with the Rust implementation a bit, and the Java implementation should work to start with (though it's missing any RPC support), and a Kotlin/Multiplatform port shouldn't be too difficult to implement.
Things I've been looking at:
We've been discussing this a lot!
A few things we've been thinking about revolve around how the UI and the async network "backend" should speak to each other. One way would be to have a lot of FFI methods that the (Kotlin) UI can call into Rust (still on the UI side, maybe in a short-lived thread), each of which would then push a message onto a queue for the backend to process.
Another way would be to have the UI side serialize the message in Kotlin and send the serialized message to one single Rust method (still on the UI side) that deserializes it and pushes a message to the backend as before.
Hand-drawn diagram, legibility not guaranteed.
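The second approach (one Rust entry point that deserializes and enqueues) might look something like this. The wire format here is made up purely for illustration (one tag byte followed by the payload), and `BackendMsg`, `decode`, and `send_msg` are hypothetical names; the JNI wrapper that would hand the byte array to `send_msg` is omitted.

```rust
use std::sync::mpsc::{self, Sender};

/// Messages the backend understands (hypothetical).
#[derive(Debug, PartialEq)]
enum BackendMsg {
    Ping,
    SendText(String),
}

/// Made-up wire format: first byte is a tag, the rest is the payload.
fn decode(bytes: &[u8]) -> Option<BackendMsg> {
    let (&tag, payload) = bytes.split_first()?;
    match tag {
        0 if payload.is_empty() => Some(BackendMsg::Ping),
        1 => String::from_utf8(payload.to_vec())
            .ok()
            .map(BackendMsg::SendText),
        _ => None,
    }
}

/// The single entry point the UI would call with serialized bytes.
/// Returns false if the bytes are malformed or the backend is gone.
fn send_msg(bytes: &[u8], queue: &Sender<BackendMsg>) -> bool {
    match decode(bytes) {
        Some(msg) => queue.send(msg).is_ok(),
        None => false,
    }
}

fn main() {
    let (queue, backend) = mpsc::channel();

    // Tag 1 followed by UTF-8 text.
    let mut wire = vec![1u8];
    wire.extend_from_slice("heyyoooo".as_bytes());
    assert!(send_msg(&wire, &queue));

    // Unknown tags are rejected before anything reaches the backend.
    assert!(!send_msg(&[42], &queue));

    println!("{:?}", backend.recv().unwrap());
}
```

Compared with one FFI method per operation, this keeps the JNI surface to a single function, at the cost of maintaining the serialization format on both sides.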