spy16 / sabre

Sabre is a highly customisable, embeddable LISP engine for Go. :computer:
GNU General Public License v3.0

Sabre Experience Report #28

Closed lthibault closed 3 years ago

lthibault commented 4 years ago

Hi @spy16, I hope you're well and that things are returning to normal on your end.

As mentioned over in https://github.com/spy16/sabre/pull/26 and https://github.com/spy16/sabre/pull/27, I've been making heavy use of Sabre over the past few weeks in the context of Wetware, so I thought I'd share my thoughts on what works and what can be improved.

I'm well aware that the current runtime design has shown some limitations, and I'm comfortable with the fact that parts of Wetware will have to be rewritten once we iron out the creases. My goal in publishing this is to:

This report is structured as follows:

  1. Context: A detailed description of Wetware, and how Sabre fits into the picture. It's a bit lengthy because I wanted to err on the side of completeness. Please forgive me as I slip into pitch-mode from time to time :sweat_smile:!
  2. The Good Parts of working with Sabre. This section highlights where Sabre provides a net gain in productivity, and was generally a delight to use.
  3. Pain Points, Papercuts and Suggestions. This section highlights where Sabre could be improved. I've tried to be as specific as possible, and to link to Wetware source code where appropriate. Possible solutions to these issues are also discussed.
  4. Miscellanea that are generally on my mind. These are basically issues that I expect to encounter in the near-to-mid term. They are important, but not urgent.

Context

Wetware is a distributed programming language for the cloud. Think Mesos + Lisp. Or Kubernetes + Lisp, if you prefer.

Wetware abstracts your datacenters and cloud environments into a single virtual cloud, and provides you with a simple yet powerful API for building fault-tolerant and elastic systems at scale.

It achieves its goals by layering three core technologies:

1. A Cloud Management Protocol

At its core, Wetware is powered by a simple peer-to-peer protocol that allows hosts to discover each other over the network, and assembles them into a fully-featured virtual cloud.

This virtual cloud is self-healing (i.e. antifragile), truly distributed (with no single point of failure), and comes with out-of-the box support for essential cloud services, including:

Wetware's Cloud Management Protocol works out-of-the-box, requires zero configuration, and features first-class support for hybrid and multicloud architectures.

2. A Distributed Data Plane

Unifying data across applications is a major challenge for current cloud architectures. Developers have to deal with dozens (sometimes even hundreds) of independent applications, each producing, encoding and serializing data in its own way. In traditional clouds, ETL and other data operations are time-consuming, error-prone and often require specialized stacks.

Wetware solves this problem by providing

  1. High-performance, immutable datastructures for representing data across applications,
  2. High-throughput protocols for efficiently sharing large datastructures across the network, and
  3. An intuitive API for working concurrently with data in a cluster-wide, shared-memory space.

With Wetware's dataplane, you can coordinate millions of concurrent processes to work on terabyte-sized maps, sets, lists, etc. These immutable and wire-native datastructures protect you from concurrency bugs while avoiding overhead due to (de)serialization.

Lastly, Wetware's location-aware caching means you're always fetching data from the nearest source, avoiding egress costs in hybrid and multicloud environments.

3. A Dynamic Programming Language

The Wetware REPL is the primary means through which users interact with their virtual cloud, and the applications running on top of it. Unsurprisingly, this REPL is a Lisp dialect built with Sabre.

Let's walk through a few examples.

We can simulate a datacenter from the comfort of our laptop by starting any number of Wetware host processes:

# Start a host daemon.
#
# In production, you would run this command once on each
# datacenter host or cloud instance.
#
# In development, you can simulate a datacenter/cloud of
# n hosts by running this command n times.
$ ww start

Next, we start the Wetware REPL and instruct it to dial into the cluster we created above.

# The -dial flag auto-discovers and connects to an
# arbitrary host process.  By default, the shell runs
# locally, without connecting to a virtual cloud.
$ ww shell -dial

We're greeted with an interactive shell that looks like this:

Wetware v0.0.0
Copyright 2020 The Wetware Project
Compiled with go1.15 for darwin

ww »

From here, we can list the hosts in the cloud. If new hosts appear, or if existing hosts fail, these changes to the cluster will be reflected in subsequent calls to ls.


ww » (ls /) ;; list all hosts in the cluster
[
    /SV4e8BwRMPmShMPRcfTmpfTZQN7JQFaqzwt9g2wrF5bj
    /cie5uM1dAuQcbTEpHi4GKsghNVBk6H4orjVs6fmd16vV
]

The ls command returns a core.Vector, which contains a special, Wetware-specific data type: core.Path. These paths point to special locations called ww.Anchor. Anchors are cluster-wide, shared-memory locations. Any Wetware process can read or write to an Anchor, and the Wetware language provides synchronization primitives to deal with the hazards of concurrency and shared memory.

Anchors are organized hierarchically. The root anchor / represents the whole cluster, and its children represent physical hosts. Children of hosts are created dynamically upon access, and can contain any Wetware datatype.

;; Anchors are created transparently on access.
;; You can retrieve the value stored in an Anchor
;; by invoking its Path without arguments.
;; Anchors are empty by default.
ww » (/SV4e8BwRMPmShMPRcfTmpfTZQN7JQFaqzwt9g2wrF5bj/foo)
nil

;; Invoking a Path with a single argument stores
;; the value in the corresponding Anchor.
ww » (/SV4e8BwRMPmShMPRcfTmpfTZQN7JQFaqzwt9g2wrF5bj/foo
   ›   {:foo "foo value"
   ›    :bar 42
   ›    :baz ["hello" "world"] })
nil

;; The stored value is accessible by _any_ Wetware
;; process in the cluster.
;;
;; Let's fetch it from a goroutine running in the
;; remote host `cie5uM1...`
ww » (go /cie5uM1dAuQcbTEpHi4GKsghNVBk6H4orjVs6fmd16vV
   ›   (print (/SV4e8BwRMPmShMPRcfTmpfTZQN7JQFaqzwt9g2wrF5bj/foo)))
nil

Why did this print nil? Because the form (print (/SV4e8.../foo)) was executed on the remote host cie5uM...! That is, the following things happened:

  1. A network connection to cie5uM... was opened.
  2. The list corresponding to the print function call was sent over the wire.
  3. On the other side, cie5uM... received the list and evaluated it.
  4. During evaluation, cie5uM... fetched the value from the SV4e8.../foo Anchor and printed it.

If we were to check cie5uM...'s logs, we would see the corresponding output.

Important Technical Note: Wetware's datastructures are implemented using the Cap'n Proto schema language, meaning their in-memory representations do not need to be serialized in order to be sent across the network.

Our heavy reliance on capnp has implications for the design of various Sabre interfaces, as discussed in part 2.

This concludes the general introduction to Wetware.

While Wetware is very much in a pre-alpha stage, the foundational code for features 1 - 3 is in place, and the overall design has been validated. Now that we are leaving the proof-of-concept stage, developing the language (and its standard library) will be the focus of the next few months. For this reason, Sabre will continue to play a central role in near-term development, and I expect to split my development time roughly equally between Wetware and Sabre. As such, I'm hoping the following feedback can serve as a synchronization point between us, and motivate the next few PRs.

The Good Parts

(N.B.: I am exclusively developing on the reader branch, which is itself a branch of runtime.)

Overall, Sabre succeeds in its mission to be an "80% Lisp". The pieces fit together quite well, and most things are easily configurable. This last bit is particularly true of the runtime branch where I was able to write custom implementations for each atom/collection, as well as create some new, specialized datatypes. I have not encountered any fundamental design flaws, which is great!!

The REPL is a breeze to use, requiring little effort to set up and configure. This is in large part thanks to your decision to make REPL (and Reader, for that matter) concrete structs that hold interfaces internally, as opposed to declaring them as interface types. Doing so allows us to inject dependencies via functional arguments rather than writing a whole new implementation just to make minor changes to behavior. The result is a REPL that took me less time to set up than to write this paragraph, so this is a pattern we should definitely continue to exploit.

Relatedly, I think these few lines of code really showcase the ergonomics of functional options. They compose well, are discoverable & extensible, and visually cue the reader to the fact that the repl.New constructor is holding everything in the package together. I'm disproportionately pleased with the outcome.
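For readers unfamiliar with the pattern, here is a minimal sketch of what such functional options look like. The REPL fields and option names below are invented for illustration, not Sabre's actual API:

```go
package main

import "fmt"

// REPL is a hypothetical stand-in for a concrete REPL struct that holds
// its dependencies internally.
type REPL struct {
	prompt string
	banner string
}

// Option mutates a REPL during construction.
type Option func(*REPL)

// WithPrompt overrides the default prompt.
func WithPrompt(p string) Option {
	return func(r *REPL) { r.prompt = p }
}

// WithBanner sets a banner printed at startup.
func WithBanner(b string) Option {
	return func(r *REPL) { r.banner = b }
}

// New applies sensible defaults first, then any user-supplied options.
// The constructor is the single place where everything comes together.
func New(opts ...Option) *REPL {
	r := &REPL{prompt: ">>"}
	for _, opt := range opts {
		opt(r)
	}
	return r
}

func main() {
	r := New(WithPrompt("ww »"), WithBanner("Wetware v0.0.0"))
	fmt.Println(r.banner)
	fmt.Println(r.prompt)
}
```

Because each option is just a function, callers can define their own options in their own packages without the library anticipating them, which is what makes the pattern both discoverable and extensible.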

Lastly, the built-in datatypes are very useful when developing one's own language because they serve as simple stubs until custom datastructures have been developed. In practice, this means I was able to develop other parts of the language in spite of the fact that e.g. Vectors had not yet been implemented in Wetware. It's hard to overstate how useful this is, and how much of that usefulness stems from the fact that Sabre uses native Go datastructures under the hood. Designing one's own language is quite hard, so every ounce of simplicity and familiarity is a godsend. For this reason, I am strongly in favor of maintaining the existing implementations and not adding persistent datatypes. An exception might be made for LinkedList, since the current implementation is dead-simple and shoe-horning a linked-list into a []runtime.Value is a bit ... backwards. In any case, Sabre really came through for me here.

Pain Points, Papercuts & Suggestions

I want to stress that this section is longer than its predecessor not because there are more downsides than upsides in Sabre, but because there's always more to say about problems than non-problems! With that said, I've sorted the pain-points I've encountered into a few broad buckets:

  1. Error handling
  2. Design of container types
  3. Reader design

Error Handling

By far the biggest issue I encountered was the handling of errors inside datastructure methods. Throughout our design discussion in #25, our thinking was (understandably) anchored to the existing implementations for Map, Vector, etc. Specifically, we assumed that certain operations (e.g. Count() int) could not result in errors. This turns out to have been an incorrect assumption.

As mentioned in the Context section above, Wetware's core datastructures are generated from a Cap'n Proto schema. As such, simple things such as calling an accessor function often return errors, including for methods like core.Vector.Count(). The result is that my code is quite panicky: Count, Conj, First and Next all panic.

While there are (quite convoluted) ways of avoiding these panics, I think there's a strong argument for changing the method signatures to return errors. Sabre is intended as a general-purpose build-your-own-lisp toolkit, and predicting what users will do with it is nigh impossible. For example, they may write datastructures implemented by SQL tables, which make RPC calls, or which interact with all manner of exotic code. As such, I think we should take the most general approach, which means returning errors almost everywhere.
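To make the suggestion concrete, here is one possible shape for error-returning container contracts. This is purely a sketch: the interface and the slice-backed implementation are hypothetical, not Sabre's current API:

```go
package main

import "fmt"

// Value stands in for runtime.Value in this sketch.
type Value interface{}

// Vector is a hypothetical revision of the container contract in which
// every operation can fail. This accommodates implementations backed by
// capnp, SQL tables, RPC calls, and other fallible sources.
type Vector interface {
	Count() (int, error)
	Conj(items ...Value) (Vector, error)
	EntryAt(i int) (Value, error)
}

// sliceVector is a trivial in-memory implementation used only to show
// that the error-returning contract costs little for simple backends.
type sliceVector []Value

func (v sliceVector) Count() (int, error) { return len(v), nil }

func (v sliceVector) Conj(items ...Value) (Vector, error) {
	out := make(sliceVector, len(v), len(v)+len(items))
	copy(out, v)
	return append(out, items...), nil
}

func (v sliceVector) EntryAt(i int) (Value, error) {
	if i < 0 || i >= len(v) {
		return nil, fmt.Errorf("index out of bounds: %d", i)
	}
	return v[i], nil
}

func main() {
	var vec Vector = sliceVector{}
	vec, _ = vec.Conj(1, 2, 3)
	n, _ := vec.Count()
	fmt.Println(n)
}
```

A capnp-backed implementation would simply propagate the accessor errors instead of panicking, which is the whole point of the change.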

Design of Container Types

This issue is pretty straightforward. I'd like to implement an analog to Clojure's conj that works on arbitrary containers. Currently, runtime.Vector.Conj returns a Vector, so I'm wondering how this might work. Do you think it's best to resort to reflection in such cases? Might it not be better to return runtime.Value from all Conj methods?
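One possible direction, sketched below under the assumption that Conj returns the general value type: a single conj builtin can then dispatch through a small interface, without reflection. All names here are illustrative, not Sabre's actual API:

```go
package main

import "fmt"

// Value stands in for runtime.Value.
type Value interface{}

// Conjer is a hypothetical interface satisfied by any container that
// knows how to conjoin values. Returning Value (rather than a concrete
// type like Vector) is what lets one builtin serve vectors, lists,
// sets, and user-defined containers alike.
type Conjer interface {
	Conj(items ...Value) (Value, error)
}

// conj is the generic builtin: it needs nothing but the interface.
func conj(c Conjer, items ...Value) (Value, error) {
	return c.Conj(items...)
}

// vector is one illustrative implementation.
type vector []Value

func (v vector) Conj(items ...Value) (Value, error) {
	out := make(vector, len(v), len(v)+len(items))
	copy(out, v)
	return append(out, items...), nil
}

func main() {
	res, err := conj(vector{1}, 2, 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(res.(vector))
}
```

The trade-off is that callers who need the concrete type back must type-assert, but that is arguably more honest than reflection.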

Reader Design

Despite being generally well-designed, there is room for improvement in reader.Reader.

Firstly, https://github.com/spy16/sabre/pull/27 adds the ability to modify the table of predefined symbols, which was essential in my case as I have custom implementations for Nil and Bool.

Secondly, relying on Reader.Container to build containers is not appropriate for all situations. The Container method reads a stream of values into a []runtime.Value, and returns it for further processing. In the case of Wetware's core.Vector, this is quite inefficient since:

  1. I need to allocate a []runtime.Value.
  2. I might need to grow the []runtime.Value, causing additional allocs, but I can't predict the size of the container ahead of time.
  3. Once the []runtime.Value is instantiated, I have to loop through it and call core.VectorBuilder.Conj, which also allocates.

In order to avoid the penalty of double-allocation, I wrote readContainerStream, which applies a function to each value as it is decoded by the reader. The performance improvement is significant for large vectors, so I think we should add it as a public method to reader.Reader.
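The core idea behind readContainerStream can be sketched as a callback-driven loop. The stream type below is a simplified stand-in for the reader's token stream, not the actual Sabre implementation:

```go
package main

import "fmt"

// Value stands in for runtime.Value.
type Value interface{}

// stream is a toy stand-in for the reader: readOne yields the next
// decoded value and reports when the container's closing delimiter
// has been reached.
type stream struct {
	vals []Value
	pos  int
}

func (s *stream) readOne() (v Value, done bool) {
	if s.pos >= len(s.vals) {
		return nil, true
	}
	v = s.vals[s.pos]
	s.pos++
	return v, false
}

// readContainerStream applies f to each value as it is decoded, so the
// caller can feed values directly into its own builder (e.g. a
// core.VectorBuilder) without an intermediate []runtime.Value slice
// that would have to be allocated, grown, and then iterated again.
func readContainerStream(s *stream, f func(Value) error) error {
	for {
		v, done := s.readOne()
		if done {
			return nil
		}
		if err := f(v); err != nil {
			return err
		}
	}
}

func main() {
	s := &stream{vals: []Value{"hello", "world"}}
	var count int
	_ = readContainerStream(s, func(Value) error { count++; return nil })
	fmt.Println(count)
}
```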

Thirdly, Wetware's reliance on Cap'n Proto means that I must implement custom numeric types. To make matters more complicated, I would like to add additional numeric types analogous to Go's big.Int, big.Float, and big.Rat. As such, I will need the ability to configure the reader's parsing logic for numerical values.

Currently, numerical parsing is hard-coded into the Reader. I suggest adding a reader option called WithNumReader (or perhaps WithNumMacro?) that allows users to configure this bit of logic. I expect this will also have repercussions on sabre.ValueOf, but it should be noted that this function is already outdated with respect to the new runtime datastructure interfaces.
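Here is a rough sketch of what a WithNumReader option might look like, following the same functional-option style the reader already uses. The NumReader signature and the token-based parsing are assumptions for illustration; the real reader operates on a rune stream:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Value stands in for runtime.Value.
type Value interface{}

// NumReader parses a numeric literal token into a Value. This signature
// is a guess at what WithNumReader might accept.
type NumReader func(token string) (Value, error)

// Reader is a toy stand-in for reader.Reader.
type Reader struct {
	readNum NumReader
}

// Option mirrors the reader's existing functional-option style.
type Option func(*Reader)

// WithNumReader swaps in custom numeric parsing (e.g. for capnp-backed
// numbers, or big.Int/big.Float/big.Rat analogs).
func WithNumReader(f NumReader) Option {
	return func(rd *Reader) { rd.readNum = f }
}

func New(opts ...Option) *Reader {
	rd := &Reader{
		// Default: parse everything as float64.
		readNum: func(tok string) (Value, error) {
			return strconv.ParseFloat(tok, 64)
		},
	}
	for _, opt := range opts {
		opt(rd)
	}
	return rd
}

func main() {
	// A custom reader that additionally understands hex literals.
	rd := New(WithNumReader(func(tok string) (Value, error) {
		if strings.HasPrefix(tok, "0x") {
			return strconv.ParseInt(tok[2:], 16, 64)
		}
		return strconv.ParseFloat(tok, 64)
	}))
	v, _ := rd.readNum("0xff")
	fmt.Println(v)
}
```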

Miscellanea

Lastly, a few notes/questions that are on my mind, but not particularly urgent:

  1. The Position type seems very useful, but I'm not sure how it's meant to be used. Who is responsible for keeping it up-to-date, exactly? Any "use it this way" notes you might have would be helpful.
  2. I don't quite understand the distinction between GoFunc, Fn and MultiFn. Best I can figure, GoFunc is used to call a native Go function from Sabre, while (Multi)Fn is meant to be dynamically instantiated by defn? From there, I assume MultiFn is used for multi-arity defn forms? (I think I might have answered my own question :smile:)
  3. I expect to start thinking about Macros in 6-8 weeks or so. Are there any major changes planned for the macro system, or can I rely on what already exists?
  4. I'm going to tackle goroutine invocation within the next 2-3 weeks and will keep you apprised of my progress in #15. If you have any thoughts on the subject, I'm very much interested.

Conclusion

I hope you find it as useful to read this experience report as I have found it useful to write. I'm eager to discuss all of this at your earliest convenience, and standing by to help with implementation! :slightly_smiling_face:

spy16 commented 4 years ago

Oh wetware sounds amazing! 🔥 I am really happy sabre has turned out to be useful in such a cool project.

Most of the concerns you have mentioned are very valid and actually aligned with the concerns I have had myself.

I'll get back to you over this weekend and we can figure out the next steps here based on this great report and the issues I have been facing with the new runtime model.

lthibault commented 4 years ago

Oh wetware sounds amazing! 🔥 I am really happy sabre has turned out to be useful in such a cool project.

That's great to hear! 😊 I'll be sure to keep you in the loop!

Most of the concerns you have mentioned are very valid and actually aligned with the concerns I have had myself.

Brilliant - synchronization achieved!

I'll get back to you over this weekend and we can figure out the next steps here based on this great report and the issues I have been facing with the new runtime model.

Sounds like a plan. Looking forward to it!

spy16 commented 4 years ago

Some of the issues I have been seeing with the runtime model and the previous model:

  1. Evaluation was handled by the Value types themselves via an Eval() method. This seemed to provide a lot of flexibility, but it actually compromised simplicity (e.g., even values with no evaluation semantics had to implement Eval()) as well as extensibility (e.g., I tried to add max-stack-depth as a safety feature for a sandboxed environment, but because the eval and invocation logic lived in List.Eval(), the runtime implementations would have needed to expose stack-manipulation functions as well).
  2. Analysis & macro expansion were especially hard, since some types were defined in terms of interfaces and the runtime needed to know how to construct values of these types during analysis (e.g., when a vector is analysed & evaled, the result needs to be another vector). This again made things complicated.

At a high level, the problem turned out to be too many moving parts due to interfaces. The approach I now have in mind combines ideas from Clojure, Joker, zygomys and, to some extent, Rob Pike's lisp. I have set up an experimental repository, Zulu, which implements these ideas. (I just created a separate repository to avoid any confusion, but it will be moved to Sabre once finalised - on the other hand, I kinda like the name 😅.)

Highlights:

Pros:

  1. Highly flexible model. (For example, if the Vector model doesn't match the use-case, it would be possible to build a custom analyser and a VectorExpr to extend the VM.)
  2. Some parts remain concrete (i.e., the VM), which reduces the permutations possible with different implementations, and hence the edge cases. It also leaves room for some optimisations (I think).
  3. Ability to maintain a stack which also allows controlling the stack depth and creating stack-traces.
  4. GoExpr can be added to implement goroutine support. (Need to think about this more, but I think it should be easily doable this way.)

Cons:

  1. No more Value type; Any is an interface{} type. (Although I don't think this is exactly a con, since with the Value-type model all Go values had to be converted to a matching Value type by reflection. This is entirely avoided.)

Do take a look at Zulu and let me know your thoughts.

Addressing Issues in the Report:

  1. Error Handling: I have definitely had this in mind from the start. Another example to add to your list is an implementation of HashSet/Map: not all values are hashable, so errors are definitely possible here as well. The redesign proposed above will be able to handle this issue, and it is definitely a requirement to introduce an error return value into these contracts.
  2. Design of container types: I think with the proposed model, this issue can be resolved easily. (Since the design/contracts of Vector itself remains outside of the core sabre model).
  3. Reader Design: Agreed to all 3 issues here.
lthibault commented 4 years ago

Just created a separate repository to avoid any confusion, but will be moved to Sabre once finalised - on the other hand, I kinda like the name πŸ˜…

Personally, I thought Parens was the best name. I was sad to see it go! 😉

Anyway, I re-read the relevant chapters of SICP over the weekend and then had a look at Zulu on Sunday evening. In a word: 👍 👍 👍

In fact, you forgot one of the more significant "pros" in your list: a major gain in efficiency, stemming from the fact that we only perform syntactic analysis once for each expression! As you've noticed in previous threads, I'm always a bit performance-conscious, so this is a big win in my book.

My only major thoughts/concerns are as follows:

GoExpr can be added to implement goroutine support. (Need to think about this more, but I think it should be easily doable this way.)

I noticed that VM is not thread-safe. What are the implications for GoExpr and https://github.com/spy16/sabre/issues/15?

In terms of design, I see two possibilities:

  1. Some kind of lock on the interpreter (🤢)
  2. Some kind of per-goroutine stack

I'd prefer to avoid global locking, since it undermines much of the power of Go's M:N thread model.

Similar to deriving context.Contexts, could we somehow "derive" a new VM? The procedure would look something like:

  1. Allocate new VM
  2. Set the top-most stack frame from the parent as the first (i.e. "global") stack frame in the child

I haven't thought through all the implications, so maybe this approach is flawed. In any case, this is by far my biggest concern right now.
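To make the proposal concrete, the derivation might look roughly like this. The VM below is a toy stand-in carrying only a stack; all names are hypothetical:

```go
package main

import "fmt"

type stackFrame struct {
	name string
}

// VM is a toy stand-in holding only a stack, enough to illustrate the
// derivation idea; the real VM would also carry an analyzer, expander,
// and so on.
type VM struct {
	stack []stackFrame
}

// Derive allocates a fresh VM whose first (i.e. "global") frame is the
// parent's top-most frame, analogous to deriving a context.Context.
// Each goroutine then evaluates against its own VM, so no locking is
// needed between them.
func (vm *VM) Derive() *VM {
	child := &VM{}
	if n := len(vm.stack); n > 0 {
		child.stack = []stackFrame{vm.stack[n-1]}
	}
	return child
}

func main() {
	parent := &VM{stack: []stackFrame{{"global"}, {"fn"}}}
	child := parent.Derive()
	fmt.Println(len(child.stack), child.stack[0].name)
}
```

Whether sharing the frame itself (rather than a copy) is safe depends on whether bindings in that frame are mutable, which is exactly the kind of implication that still needs thinking through.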

https://github.com/spy16/zulu/blob/master/zulu.go#L63 https://github.com/spy16/zulu/blob/master/zulu.go#L70

I noticed that VM.Eval uses value.Nil{} directly. What happens if I want to use my own implementation of Nil?

Wetware currently aliases runtime.Nil, and I don't necessarily expect this to change. However, I would be a bit more comfortable if atoms were fully swappable.

Some parts remain concrete (i.e., the VM), which reduces the permutations possible with different implementations, and hence the edge cases. It also leaves room for some optimisations (I think).

Although not a huge priority right now, I'm curious as to what optimizations you have in mind.

Addressing Issues in the Report: [...]

πŸ‘ on all points.

spy16 commented 4 years ago

Regarding the GoExpr, I do not have any concrete ideas at the moment, to be honest. But I have the same high level idea in mind as you. i.e., to copy the VM instance and launch the goroutine with that as the context. (That's why I'm not sure if it would be more appropriate to rename VM to Context, perhaps - but I do like VM 😅.)

For not depending on value.Nil, I guess a way would be to return native nil and expect the user to handle it.

On the name, we could go and do the refactor on parens and archive sabre. But this needs to be the last time we do something like this. 🤣

lthibault commented 4 years ago

I have the same high level idea in mind as you. i.e., to copy the VM instance and launch the goroutine with that as the context.

πŸ‘ SGTM.

That's why I'm not sure if it would be more appropriate to rename VM to Context, perhaps - but I do like VM 😅

Yeah, I do too. It gives me some idea of what it's actually doing, under the hood. In particular, I can infer that it's probably some sort of stack machine. In general, I don't like the word Context because it doesn't really tell us anything ... it's like Data in that respect. Everything is data, and everything is context.

OTOH, I don't expect there to be more than one VM in a given process... 🤔

SICP would call this an "Evaluator". The -or suffix suggests an interface, but in this case I think it's still okay.

One thing we could do to resolve both the naming convention and the concurrency question is to factor the stack out of the VM, making it stateless. Then, each goroutine would be responsible for passing its stack to Eval explicitly. The trade-off is that Analyzer and Expander must now be stateless, or at the very least thread-safe, but I don't think that's actually a problem.

To illustrate, we might do something like this:

// Context is the execution context for an Expr.  Must be exported since
// it's part of VM.Eval's call signature.
//
// N.B.:  consider making this an interface, in case users want to supply their
// own implementation, which might be optimized for a specific use-case.
// e.g.:  maybe I want an immutable stack based on a linked-list.  Or maybe
// I want to bind a `context.Context`, or something.
type Context struct {
    stack    []stackFrame
    maxDepth int
}

func (c *Context) push(f stackFrame) { ... }
func (c *Context) pop() stackFrame   { ... }

type Expr interface {
    Eval(*Context) (value.Any, error)
}

// Only one of these per process.
type VM struct {
    analyzer Analyzer
    expander Expander
}

func (vm *VM) Eval(c *Context, form value.Any) (value.Any, error) { ... }

The areas of responsibility are delimited as follows:

For not depending on value.Nil, I guess a way would be to return native nil and expect the user to handle it.

I kind of like this. It keeps the two layers of abstraction (form evaluation vs datatype) clearly separated. I'm inclined to say it's worth the extra if v == nil check.

On the name, we could go and do the refactor on parens and archive sabre. But this needs to be the last time we do something like this. 🤣

Haha, I'm in favor. We can always claim it's to maintain backwards compatibility in Sabre 😆 (But yes, let's make it the last time!)

lthibault commented 4 years ago

Another thought just popped into my head: how would you feel about moving the parens repo to its own organization? Having an org signals a couple of positive things for a project:

  1. More than one maintainer
  2. "This is not a weekend project that will be abandoned"
  3. The project is expecting to grow into a thriving community

I think it would be a good move, but totally understand if you'd prefer to keep it under spy16/.

spy16 commented 4 years ago

Both names parens and sabre are taken 😐. I wouldn't mind moving it to its own organisation if we come up with a better name that is available.

lthibault commented 4 years ago

How about we just call the org go-parens? The full repo would then be github.com/go-parens/parens.

Otherwise, does e.g. the Hindi word for parentheses have a nice ring to it? We could always use its phonetic spelling in the Latin alphabet.

spy16 commented 4 years ago

I think for now, we can keep it under spy16/parens... (Not solid reasoning - more of a feeling, I guess. But I see it as a small library that does one small thing very nicely. With a dedicated org, I feel it becomes this big project that needs to have a lot of features 😅)

If this works for you, I will delete the git history of the current parens repo and bootstrap it (I know that's not the ideal way to do it, but technically the project is entirely different - we could start with a new repo as well - not sure which would be better, though). I will also create small and independent issues on the different tasks that need to be done (an issue on the VM itself, an issue on the reader, an issue on the analyzer, etc.)

lthibault commented 4 years ago

I think for now, we can keep it under spy16/parens

Sure, makes perfect sense 👍

If this works for you, I will delete the git history of the current parens repo and bootstrap it

Sounds good! (And no worries -- I'm an occasional, clandestine user of git push -f, so I totally get it 😆)

I will also create small and independent issues on different tasks that need to be done (an issue on the VM itself, an issue on reader, an issue on analyzer etc.)

πŸ‘ BTW, I'm keen to tackle the concurrency issue we discussed yesterday once the fundamentals are in place. I'm starting to get a pretty good picture of how a stateless VM could work.