gleam-lang / suggestions

📙 A place for ideas and feedback

An approach to Distributed Gleam #108

Open CrowdHailer opened 3 years ago

CrowdHailer commented 3 years ago

Distributed Gleam

By distributed I mean running in multiple places, potentially wider than an Erlang cluster, e.g. browser and server. Maybe we should call it Gleam Web Scale.

Here are the things I think we need to make something like this work:

FFI for multiple backends

To do any useful work a program must be able to talk to the outside world, and at some point this will be limited by the platform it runs on. My favourite approach would be something that allows a module to define external functions for more than one platform, and then uses tree shaking to check that no calls are made to external functions that lack an FFI implementation for the target platform.

e.g. let's have a very odd utility library:

// utils.gleam
pub external fn system_time() -> Int
  js = "Date" "now"
  erl = "os" "system_time"

pub external fn get_host() -> String
  js = "location" "host"

pub external fn make_ref() -> Ref
  erl = "erlang" "make_ref"

The program below could be compiled for js but not erl, because tree shaking would find that every external function used has a js implementation, but get_host has no erl one.

import utils

pub fn main() {
  utils.system_time()
  utils.get_host()
}

Note on tree shaking

I think tree shaking might not be the simplest thing to implement. A simpler version could check at the module level, by splitting code across multiple files and marking each with some kind of target annotation.

// utils_shared.gleam
targets = js erl

pub external fn system_time() -> Int
  js = "Date" "now"
  erl = "os" "system_time"

// utils_browser.gleam
targets = js
pub external fn get_host() -> String
  js = "location" "host"

Discovery and naming

To find processes on another node, a process must be able to start looking for them somewhere. At least one process must be discoverable from the connection process.

// my_app/server.gleam
import gleam/node.{Interface}
import gleam/node/distributed_erlang
import gleam/process.{From}

pub const interface = Interface()

pub type Operation {
  Operation(From, fn(Int) -> Int)
}

fn loop(receive, current) {
  let Operation(from, operation) = receive()
  let current = operation(current)
  process.reply(from, current)
  loop(receive, current)
}

pub fn main(receive) {
  assert Ok(pid) = process.spawn_link(loop(_, 0))
  assert Ok(_) = distributed_erlang.run(interface, fn() {
    pid
  })
}
// my_app/client.gleam
import gleam/io
import gleam/node/distributed_erlang
import gleam/process
import my_app/server
pub fn inc(x) { x + 1 }
pub fn double(x) { 2 * x }

fn main(_) {
  assert Ok(server_pid) = distributed_erlang.connect(server.interface, "server@0.0.0.0")
  try value = process.call(server_pid, server.Operation(_, double))
  try value = process.call(server_pid, server.Operation(_, inc))
  io.debug(value)
}

The program above is incredibly simple, but it shows a high level of abstraction between client and server without hiding the distributed nature of the system: there are still explicit sends and receives (or in this case calls) which require error handling.

The interface constant has no value but its type is parameterised by the values that will be made available at connection time. This makes it a compilation error for node.run to return a different type than the one node.connect expects as a result. This ensures type safety across the nodes.

This is the same design as process.spawn_link, but enhanced so that run and connect don't have to happen at the same time.
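To make that concrete, here is a hypothetical sketch. None of this is a real API; every name comes from this proposal, and the Pid parameter is just one possible choice of exposed value.

// The parameter of Interface(Pid) records what the server exposes.
pub const interface: Interface(Pid) = Interface

// Server: the start function must return a Pid, or this fails to compile.
assert Ok(_) = distributed_erlang.run(interface, fn() { pid })

// Client: connect with the same interface is therefore typed Result(Pid, Nil).
assert Ok(server_pid) = distributed_erlang.connect(interface)

Because both calls mention the same interface constant, the compiler unifies the two type parameters, so a mismatch between what the server provides and what the client expects is caught at build time.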

Note: In the distributed erlang case node.run must be called only once. However it is possible that other transports could be defined, e.g. gleam/node/http. In such a case more than one server could be started.

The node functions are defined as follows, where the todo values stand in for transport implementations.

pub type Interface(public) {
  Interface
}

pub fn run(_i: Interface(public), start: fn() -> public) -> Result(Nil, Nil) {
  todo
}

pub fn connect(_i: Interface(public)) -> Result(public, Nil) {
  todo
}

Pids and addresses

When sent across boundaries, pids need to be enhanced with information about the node they reside on. There are two ways to do this.

  1. Always deal with absolute pids, so the pid object has a reference to its node in all places:
    let pid = process.spawn_link(node, fn() { ... })
  2. The infrastructure that handles serialising pid information can add node information when sending the pid reference outside the node.

I opted for option 1 in GenBrowser, because I think it's simplest.

An absolute address type can handle named processes, ports, and pids. The address type needs to contain information about how to send a message to its target from any node, and would be part of the distributed runtime. This would allow you to send a message from anywhere to anywhere, which is great for the programming model: your browser can send a message directly to an Erlang port on the server.

There are probably many worms in this can around security and performance, but I won't tackle them here.
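A minimal sketch of what such an address type might look like. This is hypothetical: Node, Target, and send are assumptions made up for this sketch, not a real API.

// An address is a target plus enough routing information to
// reach its node from anywhere.
pub type Address(message) {
  Address(node: Node, target: Target)
}

pub type Target {
  Named(String)
  Pid(process.Pid)
  Port(process.Port)
}

// Sending goes through the distributed runtime, which routes on
// the node field, so delivery can fail and returns a Result.
pub fn send(address: Address(message), message: message) -> Result(Nil, Nil) {
  todo
}

Parameterising Address by its message type would also extend the type safety story down to individual processes, not just the connection interface.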

Client Server

In the example above there is a clear client and server. There will always be one process/node initiating the connection, but once connected messages should be equally easy to send in either direction.

// my_app/server.gleam
import gleam/node.{Interface}
import gleam/node/distributed_erlang
import gleam/process

pub const interface = Interface()

fn loop(receive) {
  let pid = receive()
  process.send(pid, 1)
  process.send(pid, 2)
  process.send(pid, 3)
  loop(receive)
}

pub fn main(receive) {
  assert Ok(pid) = process.spawn_link(loop(_))
  assert Ok(_) = distributed_erlang.run(interface, fn() {
    pid
  })
}
// my_app/client.gleam
import gleam/io
import gleam/node/distributed_erlang
import gleam/process
import my_app/server

fn loop(receive) {
  let message = receive()
  io.debug(message)
  loop(receive)
}

fn main(receive) {
  assert Ok(server_pid) = distributed_erlang.connect(server.interface, "server@0.0.0.0")
  assert Ok(_) = process.send(server_pid, process.self())
  loop(receive)
}

In fact it should be as easy to send messages client to client, by having the server send messages to clients containing the pids of other clients. In such a case the node connection library would take care of routing.
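For example, the server loop might remember connected clients and forward each newcomer's pid to the others. Again a hypothetical sketch in the same imagined API; list.each from gleam/list is assumed to be imported.

// Each message received is the pid of a newly connected client.
fn loop(receive, clients) {
  let new_client = receive()
  // Tell every existing client about the newcomer, then remember it.
  list.each(clients, process.send(_, new_client))
  loop(receive, [new_client, ..clients])
}

From then on the clients hold each other's absolute pids, so their messages route directly without going back through this loop.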

Building

In the end I want a command that would build more than one artifact at the same time.

gleam build server --platform erlang client --platform js

The structure of applications on the BEAM and in OTP complicates this somewhat.

Upgrading types

This system allows us to check types across boundaries because we are building everything at the same time. I think this could have use cases: there are applications where the client doesn't have long-lived state and where the client js is served by the server.

The simplest check would be to have a UUID generated at compile time and sent at connection, with the connection denied if it was not the same. This would be a cracking-a-nut-with-a-hammer approach, because a new deploy that changed nothing about the shape of the messages sent would still result in the connection being denied.
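A sketch of that check, assuming the compiler injected a build_id constant into every artifact of a single build. Both the constant and the function are hypothetical.

// build_id is a UUID generated at compile time, identical across
// the server and client artifacts of one build.
fn accept(client_build_id: String) -> Bool {
  // Deny the connection unless the client came from the same build.
  client_build_id == build_id
}

The client would send its build_id as the first message of the handshake, before any typed traffic is exchanged.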

Interesting ideas here: https://www.youtube.com/watch?v=4T6nZffnfzg

Running multiple endpoints is possible when using transports that allow you to call node.run more than once, so you could have the old and current versions of the API running at different endpoints.

lpil commented 3 years ago

Thanks for sharing this! There are some very cool and exciting ideas here.

The conditional compilation for different runtimes is something that I think we will need for sure. Let's split that into another issue.

The interface constant has no value but it's type is parameterised by the values that will be made available at connection time.

How does this type parameter get filled? Is it done automatically by the compiler in some fashion? It would be nice if changing whitespace or the implementation in an inert way did not alter this.

In the end I want a command that would build more than one artifact at the same time.

Could we run the compiler twice? Is it better to run it once?

CrowdHailer commented 3 years ago

How does this type parameter get filled? Is done automatically by the compiler in some fashion?

Gleam magic does this, I don't know how it works. It just gets filled by usage.

e.g. it checks this connect call https://github.com/midas-framework/distributed/blob/master/examples/ping_pong/src/ping_pong/application.gleam#L15 matches the start call with the same interface.

Could we run the compiler twice? Is it better to run it once?

It could be a double pass. It's just about avoiding the case where you run the compiler once for js, then go "oops, I'd better correct that typo" and generate the erlang, but forget to regenerate the js.

lpil commented 3 years ago

Oh I see, it is based upon the version of the release, so it checks that the versions are the same rather than that two different versions are compatible in some fashion. This is actually a much easier problem, good idea!

CrowdHailer commented 3 years ago

Yeah, I can't even explain in words what "compatible but different" would mean. But a check that says a set of nodes is safe if running the same version seems a useful step.