TomasMikula / libretto

Declarative concurrency and stream processing library for Scala
Mozilla Public License 2.0
196 stars 6 forks source link
concurrency declarative-programming expressiveness functional-programming linear-types resource-safety session-types stream-processing type-safety

Libretto

Maven Central

Declarative concurrency and stream processing library for Scala.

Motivation

... or why another concurrency and stream processing library.

Libretto grew out of frustration with existing libraries. Here is an attempt to summarize the pain points.

Features

Libretto in a Nutshell

Libretto takes inspiration from linear logic, with a twist of interpreting the multiplicative conjunction, ⊗, in a concurrent fashion.

The correspondence of linear logic to certain monoidal categories led to a point-free notation that enforces linearity statically, despite the lack of linear types in the host language.

There is also lambda syntax, whose linearity checking is deferred until assembly time (i.e. when constructing the Libretto blueprint, before execution).

Primitives for racing and recursion were added.

Documentation

Scaladoc (Found something undocumented or not documented clearly? We should do better. Do not hesitate to submit a documentation request.)

Tutorial

Presentations

Typed Interaction with Session Types (using Scala and Libretto). Functional Scala 2022 [video] [slides]\ Shows how Libretto expresses session types.

Custom Stream Operators Made Safe And Simple with Libretto. Scalar 2023 [video] [slides]\ Shows the expressiveness of Libretto in implementing stream operators. Presents examples that are hard to express in other stream libraries.

Concurrent All The Way Down (Functional Concurrency with Libretto). LambdaDays 2023 [video] [slides]\ Concurrency without threads or side-effects.

When Your DSL Needs to Support User-Defined Domain Functions. ScalaDays 2023 [video] [slides]\ Presents the DSL embedding technique behind Libretto: delambdification followed by "compiling to categories" (i.e. to point-free representation).

Monads Are Not About Sequencing. Functional Scala 2023 [video] [slides]\ One of the presented examples, the concurrent, non-deterministic Writer, is in Libretto.

Caveats

It is all too common for software projects to highlight only the good parts or to overstate their features. We value your time and want to be absolutely honest about the limitations. However, it is hard to know what we are missing. That's why we need your help.

Do you find that libretto does not hold up to some of its promises? Do you find that the project description omits some important limitation? Do you feel that you have spent too much time to find out something that could have been presented more readily? Please, let us know.

Batteries not included. The streams library is under-developed. I/O library (file, network) is non-existent at the moment. There are currently no connectors or wrappers for things like HTTP servers, DB connectors, message brokers, ...

Not yet a good story for supervision/error recovery.

Flawed proof-of-concept implementation. The current implementation is leaky. It is not a design flaw, just an implementation flaw.

Best practices not established. Programming in Libretto is quite different from conventional concurrency libraries. It is not clear what the best practices should be.

Not used to this level of concurrency. In conventional programming languages and libraries, concurrency is a scarce special thing. In Libretto, concurrency is the default and sequentiality is a special thing. The consequences might be surprising.

Q&A

Did not find an answer to your question? Do not hesitate to ask us.

What is Libretto for?

Libretto is a library for implementing concurrent systems and stream processing in Scala. Its focus is on the programming model, i.e. the generic glue to structure concurrent programs, rather than any specific use cases.

Who is Libretto for?

You are more likely to appreciate Libretto if you:

Is Libretto production-ready?

No. See also Caveats.

The goal for now is to present a different approach to concurrent programming and to excite a small number of enthusiasts to play with it, explore it and push it further.

The expectation is that people will find the Libretto approach worthwhile regardless of unstable API, non-existent ecosystem of libraries, flawed proof-of-concept implementation, or unknown performance characteristics.

What do you mean by declarative, anyway?

Declarative programming is a programming paradigm in which we describe the computation without giving instructions to the underlying execution model. Examples of such instructions are updating a shared mutable variable or spawning or joining a thread/fiber/actor.

Instructions to the underlying execution model are side-effects, so we can say that declarative programs are programs without side-effects. In particular, all expressions in a declarative program must be referentially transparent.

Aren't IO monad-style programs referentially transparent as well?

Not really.

They are referentially transparent from the point of view of the host language (Scala). The expression IO { println("Hi!") } is referentially transparent in that given

val x = IO { println("Hi!") }

we can freely replace occurences of x by IO { println("Hi!") } (and vice versa) without changing the meaning of the program. There is no denying that this already has great benefits for compositionality and reasoning about programs.

However, if we view the IO monad as a new semantic level and having to wrap some expressions in IO { ... } as just a syntactic annoyance for the sake of embedding IO programs in Scala, then referential transparency at this new IO level would require the following two programs to have the same meaning

for {
  x <- IO { println("Hi!") }
  y <- IO { println("Hi!") }
} yield y
for {
  x <- IO { println("Hi!") }
  y <- IO { x }
} yield y

but they do not.

Saying that (the expressions in) these programs are referentially transparent (even though at the Scala level they are) is somewhat like saying that any program is referentially transparent when viewed as a text file in a file system.

Note: The way Libretto circumvents the issue of programs like these having different behavior is linearity: If, by some stretch of the imagination, these were Libretto programs, the first one would not typecheck, because x is unused.

How can Libretto statically guarantee that each resource is consumed exactly once when Scala does not have linear type system features?

Libretto programs are not composed of Scala functions A => B. They are composed of Libretto's linear functions A -⚬ B, which are linear by construction.

What exactly are the primitives of Libretto, from which everything else is derived?

There is a hierarchy of DSLs, partially ordered by their power. At the bottom, i.e. the weakest, is currently CoreDSL.

An extension of CoreDSL that is of particular interest is Scaletto. It adds support for using Scala values and pure Scala functions, managing resources that are Scala objects, performing effects on those resources, and marking thread-blocking Scala calls.

Does libretto support dynamic stream topologies?

Yes.

By dynamic stream topologies we mean data flows that do not have a rigid structure predetermined at compile-time, but depend on runtime values or runtime load.

Do libretto streams support feedback loops?

Yes, loops, where a stream from system's output is fed back into the system's input can be implemented easily. Note, however, that sending data around the system might not be a great way to do feedback. As data in Libretto can flow in both directions, the feedback can flow backwards through the system.

Is there support for timers?

Yes, the basic operation is delay:

def delay(d: FiniteDuration): Done -⚬ Done

How do I communicate with the external world (I/O)?

CoreDSL does not have any means of I/O.

Its extension, Scaletto, has means to wrap Scala resources and perform effects on them.

Another approach, although much more involved, is to

Does libretto support supervision of a subsystem?

Not yet.

See #8.

Can I execute different parts of the system on different execution contexts/thread pools?

Not yet.

The current implementation uses one main thread pool and another one for blocking operations. This is sufficient for most applications.

See #6.

Does libretto have fibers, as known from ZIO or Cats Effect?

No.

Fibers are used to manually control concurrency. In Libretto, any two computations that do not causally depend on each other take place concurrently, without any explicit instructions like spawn-ing a fiber.

Are Libretto programs any more amenable to inspection than IO monad programs?

IO monad programs are hidden inside an impenetrable Scala function after the first flatMap, and thus not accessible to any kind of inspection. Libretto, too, allows to incorporate Scala functions into the program. One might therefore ask whether Libretto programs are any more amenable to inspection than IO monad programs.

The answer is yes.

The difference is that in Libretto, uses of opaque Scala functions are local. This is unlike IO monad programs, where a Scala function passed to flatMap produces all the rest of the program, at runtime, thus making it impossible to inspect the program before it is run.

Given a Libretto program

f > mapVal(g) > h

where g is a Scala function and f, h are Libretto arrows, the content of g is opaque, but both f and h can be inspected before the program is run.

Does libretto support session types?

Yes.

Why are there two different function-like symbols? What is the difference between -⚬ and =⚬?

-⚬ =⚬
Function as code Function as data
Describes a program. Lives in a running program.
Tangible: we create and manipulate Scala values of this type. Intangible: there are no Scala values of this type. A type like A =⚬ B can appear only to the left or right of -⚬.
Value in the meta language (Scala). As such, it can be used any number of times (including zero) by Scala code. Resource in a Libretto program. As such, it must be consumed (evaluated) exactly once, because it might have captured other resources that are consumed on evaluation.
Morphism Object, Internal Hom

In category theory, one does not look inside objects. Everything is expressed in terms of morphisms. In particular, objects in a category, which in our case are types like =⚬, |*|, |+|, |&|, One, Val, ..., are not viewed as collections of elements. Indeed, there are no values of these types. We express everything in terms of morphisms, -⚬. We manipulate morphisms as values in the underlying meta language (Scala).

Notice how the distinction between a morphism (-⚬), an internal hom (=⚬) and a mapping of the underlying meta theory (=>) avoids a lot of confusion. In the category of Scala functions, all three of these roles are played by =>.

In the signature of curry on Scala functions, => appears in all three of these roles. Can you tell which occurrences play which roles?

def curry[A, B, C]: ((A, B) => C) => (A => (B => C))

The Libretto version of curry makes the roles clear:

def curry[A, B, C]: ((A |*| B) -⚬ C) => (A -⚬ (B =⚬ C))

Is the Libretto function -⚬ an Arrow?

No.

By Arrow we mean the Arrow typeclass.

-⚬ is not an Arrow, because

How to type the symbol used in -⚬ and =⚬?

is Unicode code point U+26AC (medium small white circle).

You can use any input method that let's you enter a Unicode symbol via its code. For example, on macOS you can add Unicode Hex Input to your System Preferences > Keyboard > Input Sources. Using this input source, you can type by pressing Alt+26ac.

IntelliJ IDEA

In IntelilJ IDEA, we can define a Live Template that expands -o (where o is the Latin letter 'o') to -⚬:

  1. Go to Preferences > Editor > Live Templates > scala.
  2. Click the Add button and choose Live Template.
  3. Fill out the definition:
    • Abbreviation: -o (with the Latin letter 'o')
    • Description: e.g. lollipop
    • Template text: -⚬ (with the Unicode code point U+26AC)
    • Expand with: e.g. Space

Now, when you type -o␣ in a Scala file, the editor expands it to -⚬.

Visual Studio Code

In Visual Studio Code, we can add a user snippet triggered by -o (where o is the Latin letter 'o') which expands to -⚬:

  1. In Preferences > User Snippets click on New Global Snippets file...
  2. Enter file name, e.g. libretto.
  3. In the pre-generated JSON file, add the field
      "lollipop": {
        "prefix": "-o",
        "body": "-⚬",
        "description": "types the lollipop arrow (-⚬)"
      }

    and save.

Now, when you type -o in the editor, a popup should appear with a suggestion to insert -⚬, which you confirm by pressing Enter. (Sometimes you might need to press Ctrl+Space for the suggestion popup to appear.)

Does Libretto prevent deadlocks?

No. It is easy to write a Libretto program that gets stuck because of cyclic dependencies.

Some concurrency libraries that claim to prevent deadlocks use the word deadlock in a very specific sense, namely system threads waiting on each other in a cycle. Thread is not even a term from Libretto's vocabulary, but you can safely assume that any implementation of Libretto does not cause deadlocks on the level of threads, either.

Could deadlocks be prevented statically, in principle?

Yes.

Trivially, by restricting the DSL so that it is impossible to set up a loop in the data flow. However, the resulting loss of expressiveness would make it unusable for most applications.

A sufficiently expressive deadlock-free DSL would certainly require more sophisticated types. We are leaving that for further research and most likely for a different host language than Scala.

Why do we need negative values?

To avoid callbacks in bidirectional dataflows.

A negative value Neg[A] could alternatively be expressed as a callback Val[A] =⚬ One, but we want to avoid callbacks.

What's the problem with callbacks?

How does Libretto compare to lchannels?

lchannels is another Scala library with support for session types. Its encoding of session types is based on continuation-passing style classes and ordinary Scala functions. As a result, it is more lightweight in terms of the required machinery, but linearity is only checked at runtime. lchannel's heavy reliance on Scala functions preclude it from having Libretto's level of concurrency and analyzability, but those are not among stated goals of the project.

Copyright and License

Copyright 2020–2023 Tomas Mikula

This software is distributed under the Mozilla Public License 2.0, see the LICENSE file for details.