ta0kira / zeolite

Zeolite is a statically-typed, general-purpose programming language.
Apache License 2.0

Support passing of C++ data between C++ extensions without requiring marshaling. #202

Closed by ta0kira 2 years ago

ta0kira commented 2 years ago

It seems like the main issue is going to be allowing Zeolite code to construct an object that then needs to be passed to the framework.


A good example use-case might be gRPC+protobuf.

  1. protoc+plugins generate C++ from .proto files that are meant to be used with a gRPC RTL.
  2. To make a gRPC call on the client side, you need to construct a request protobuf and pass it to the gRPC framework.
    • If the Zeolite value for the request is constructed independently of the gRPC call, then the request somehow needs to be unwrapped/transposed so that it can be passed to the gRPC framework. Three possibilities:
      1. Construct a gRPC "transaction" object that contains both request and response and transpose the existing request.
      2. Just require the request to be owned by a transaction object that contains both request and response.
      3. Add a virtual function to TypeValue to return the value as a void*, e.g., virtual void* AsPointer() const.
    • Should be able to read response without marshaling, via wrapper functions.
  3. To handle a gRPC call on the server side, you get references to both the request and a default response.
    • Should be able to read request without marshaling, via wrapper functions.
    • Same for constructing the response, since the framework constructs it ahead of time.
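As a rough illustration of the server-side case, here is a minimal C++ sketch of non-owning wrappers that read the request and fill in the response in place, so nothing is copied or marshaled. RequestProto, ResponseProto, RequestView, ResponseBuilder, and HandleCall are all hypothetical stand-ins, not protoc output or Zeolite runtime classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for protoc-generated message classes.
struct RequestProto  { std::string name; };
struct ResponseProto { std::string greeting; };

// Non-owning wrapper: reads go straight to the framework-owned request.
class RequestView {
 public:
  explicit RequestView(const RequestProto* proto) : proto_(proto) {
    assert(proto_ != nullptr);
  }
  const std::string& name() const { return proto_->name; }

 private:
  const RequestProto* proto_;
};

// Non-owning wrapper: writes go straight to the response that the framework
// constructed ahead of time.
class ResponseBuilder {
 public:
  explicit ResponseBuilder(ResponseProto* proto) : proto_(proto) {
    assert(proto_ != nullptr);
  }
  void set_greeting(const std::string& value) { proto_->greeting = value; }

 private:
  ResponseProto* proto_;
};

// Hypothetical handler: the framework passes pointers to both messages, and
// the wrappers are what the Zeolite-facing code would see.
inline void HandleCall(const RequestProto* request, ResponseProto* response) {
  RequestView view(request);
  ResponseBuilder builder(response);
  builder.set_greeting("hello, " + view.name());
}
```

The wrappers never own the messages, so their lifetime would have to be limited to the duration of the call.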

Libraries with general matrix operations are also a good example, e.g., glm and Eigen. In those cases, it's unrealistic to expect the Zeolite code to anticipate all required matrices ahead of time in order to instantiate them all in a single object. This means that we probably do need something like AsPointer() to avoid marshaling.


The AsPointer() approach should actually be fairly safe as long as:

  1. AsPointer() is only called on TypeValue instances that were passed by generated C++, to ensure that the compiler has done type checking.
  2. AsPointer() implementations return a pointer of the exact same type every time and explicitly use static_cast. Specifically in the case of protobuf, we need to be careful of the difference between MyMessage* and the common base class Message*.
  3. AsPointer() calls cast to the exact same type every time, and that type matches the implementations that are allowable in Zeolite code.
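The three conditions above can be sketched in C++ roughly as follows. This is only an illustration: TypeValue, Message, and MyMessage here are simplified stand-ins for the real classes, and Extra, Value_MyMessage, and GetMyMessage are hypothetical names. Extra exists only to show why the exact type matters: with more than one base class, the Message subobject typically sits at a different address than the MyMessage object, so a void* produced from MyMessage* must not be cast to Message*.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the protobuf base class.
struct Message {
  virtual ~Message() = default;
  virtual std::string TypeName() const = 0;
};

// Hypothetical extra base; it typically pushes the Message subobject of
// MyMessage to a non-zero offset.
struct Extra {
  virtual ~Extra() = default;
  int padding = 0;
};

// Simplified stand-in for a generated message class.
struct MyMessage : Extra, Message {
  std::string TypeName() const override { return "MyMessage"; }
  int id = 0;
};

// Simplified stand-in for the runtime base class with the proposed function.
struct TypeValue {
  virtual ~TypeValue() = default;
  virtual void* AsPointer() const { return nullptr; }
};

// Implementation side: always returns MyMessage*, never Message*, and says so
// with an explicit static_cast.
class Value_MyMessage : public TypeValue {
 public:
  explicit Value_MyMessage(int id) { message_.id = id; }

  void* AsPointer() const override {
    return static_cast<MyMessage*>(&message_);
  }

 private:
  mutable MyMessage message_;
};

// Caller side: casts back to exactly the type the implementation returned.
inline MyMessage* GetMyMessage(const TypeValue& value) {
  void* const raw = value.AsPointer();
  assert(raw != nullptr);
  return static_cast<MyMessage*>(raw);
}
```

Keeping the cast in a single helper like GetMyMessage is one way to keep the implementation and its callers in agreement about the pointer type.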

This means that it's very important for implementations and callers to essentially be written together.

  1. If an implementation returns MyMessage* then callers must only cast the pointer back to MyMessage*, which means that implementations that return Message* must be strictly disallowed by the Zeolite compiler.
  2. If an implementation returns Message* then callers must only cast the pointer back to Message*, which means that implementations that return MyMessage* must be strictly disallowed by the Zeolite compiler.

I guess another alternative here is to just have a builtin Pointer<#x> (no variance!) whose only purpose is to allow pointer passing. Then an implementation could provide multiple functions to get the pointer as different types.

// Add to builtin.0rp and make it unboxed in BoxedValue.
@value interface Pointer<#x> {
  // Maybe just skip inheriting anything and disallow nullptr.
}

@value interface Message {
  asMessage () -> (Pointer<Message>)
  // other stuff
}

concrete MyRequest {
  refines Message

  @value asMyRequest () -> (Pointer<MyRequest>)
}

concrete MyResponse {
  refines Message

  @value asMyResponse () -> (Pointer<MyResponse>)
}

concrete GeneratedMyGrpcServiceClient {
  callSomeMethod (MyRequest) -> (ErrorOr<MyResponse>)
}

  1. The type substituted in for #x in Pointer is really only for Zeolite type checking. It seems easier to just reuse the category type as the param value.
  2. All concrete categories above would be implemented in C++, maybe automatically generated at some point.
  3. GeneratedMyGrpcServiceClient.callSomeMethod would call MyRequest.asMyRequest in C++ and then AsPointer() to get a void* that can be cast to the correct type. It can also construct an empty MyResponse and use asMyResponse+AsPointer() to get a pointer to pass to the C++ gRPC implementation.
  4. Not sure if Message would be that useful, since the C++ Message base class actually contains a lot of function implementations (e.g., reflection) that couldn't be implemented in a @value interface.
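A minimal C++ sketch of how item 3 might look, under several assumptions: MyRequestProto and MyResponseProto stand in for the protoc-generated classes, PointerValue stands in for the unboxed Pointer<#x>, FakeStubCall stands in for the gRPC stub, and a null return stands in for the error case of ErrorOr<MyResponse>. None of these names come from Zeolite or gRPC.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for protoc-generated message classes.
struct MyRequestProto  { std::string query; };
struct MyResponseProto { std::string answer; };

// Stand-in for the unboxed Pointer<#x>. In C++ it is only a void*; the #x
// param exists solely for Zeolite type checking.
class PointerValue {
 public:
  explicit PointerValue(void* pointer) : pointer_(pointer) {
    assert(pointer_ != nullptr);  // nullptr is disallowed.
  }
  void* AsPointer() const { return pointer_; }

 private:
  void* pointer_;
};

// Stand-in for the C++ implementation of concrete MyRequest.
class MyRequest {
 public:
  explicit MyRequest(std::string query) { proto_.query = std::move(query); }

  // Corresponds to asMyRequest () -> (Pointer<MyRequest>).
  PointerValue asMyRequest() {
    return PointerValue(static_cast<MyRequestProto*>(&proto_));
  }

 private:
  MyRequestProto proto_;
};

// Stand-in for the C++ implementation of concrete MyResponse.
class MyResponse {
 public:
  // Corresponds to asMyResponse () -> (Pointer<MyResponse>).
  PointerValue asMyResponse() {
    return PointerValue(static_cast<MyResponseProto*>(&proto_));
  }
  const std::string& answer() const { return proto_.answer; }

 private:
  MyResponseProto proto_;
};

// Stand-in for the gRPC stub call; a real stub would go over the network.
inline bool FakeStubCall(const MyRequestProto* request,
                         MyResponseProto* response) {
  response->answer = "echo: " + request->query;
  return true;
}

// Stand-in for callSomeMethod: no field of either message is copied.
inline std::unique_ptr<MyResponse> CallSomeMethod(MyRequest& request) {
  auto* const request_proto =
      static_cast<MyRequestProto*>(request.asMyRequest().AsPointer());
  auto response = std::make_unique<MyResponse>();
  auto* const response_proto =
      static_cast<MyResponseProto*>(response->asMyResponse().AsPointer());
  if (!FakeStubCall(request_proto, response_proto)) return nullptr;
  return response;
}
```

Each cast in CallSomeMethod names the same type that the matching as* function passed to PointerValue, which is the pairing that the Zeolite-side Pointer<#x> param is meant to enforce.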