Open davidar opened 9 years ago
As for code that requires unbounded recursion, Morte code is able to generate any code. I’d be perfectly happy with a guaranteed-safe ipfs
command that gives me an unsafe executable I can choose to run.
That doesn’t necessarily complicate the programmer’s life. For example, Haskell’s IO
is an EDSL that looks like imperative code but actually retains referential transparency. The evaluation of an IO
expression has no side effects and generates an unsafe program you can choose to execute.
How does this improve on distributing regular (Safe)Haskell or pure Lisp source code (or BLC) over the internet?
The unique features of morte
are:
morte
is just waste CPU/RAMmorte
expressions, meaning that they are self-contained have no external references or dependenciesThe main disadvantages of morte
is:
morte
is syntactically verboseThis makes it significantly more difficult to program directly in morte
. It's more appropriate for a low-level intermediate language than a user-facing language. annah
somewhat alleviates the second and third problems but the first problem remains because I still haven't written a type inference engine, yet.
What limitations does the lack of recursion have in practice, and is this outweighed by the guarantees it allows you to make about the code?
Much fewer limitations than you think! The key is that recursive data structures and functions can be mechanically translated into non-recursive equivalents. There's an excellent paper which describes the algorithm behind this and the original post on morte
gives an example of translating the recursive list type and list operations to their non-recursive analogs. The annah
compiler automates the translation outlined in that paper.
What's really neat is that the static linking property I mentioned holds even in the presence of "recursive" data types. For example, if I have something like the following mutually recursive Haskell data types:
data Even = Zero | SuccE Odd
data Odd = SuccO Event
... annah
knows how to translate the Even
and Odd
types and all of their constructors to non-recursive encodings that have no interdependencies between each other. In other words, the SuccE
constructors' encoding does not refer to or depend on the Even
/Odd
types or any of the other constructors. This post goes into more detail about how static linking works in morte
.
Note that just because it is possible to transform recursive programs into non-recursive versions, doesn't mean that it is easy. In fact, it's actually a bit difficult for more complex algorithms because you have to handle every single possible corner case. If you want to see some really hairy example code, I have an implementation of binary numeral arithmetic in this directory and the implementation of (+)
for binary-encoded numerals is really long. In fact, that's actually the annah
code! The generated morte
code is even longer and harder to follow. Totality forces you to get a lot of things right and morte
doesn't let you opt out of totality. For example, in the encoding of binary numeral arithmetic I had to handle all sorts of corner cases that I would have never even realized existed otherwise. (Side note: if you use peano numerals then the implementation is dead simple, in comparison, but much slower to compile for large numbers).
The most important guarantee that the lack of recursion gives is that it's always safe to normalize expressions (i.e. you can safely inline everything) because the inlining process is guaranteed to terminate. This is why morte
gives better compile-time optimizations. If you try do this in a non-total language you run the risk of the compiler hanging indefinitely.
To make this concrete, here is an example of something that morte
will optimize away at compile time:
$ # First, some preliminary "imports" to make the code less verbose
$
$ # There is a Prelude of `morte` expressions which you can browse here:
$ # http://sigil.place/talk/0/
$
$ echo '#http://sigil.place/talk/0/Bool' > Bool
$ echo '#http://sigil.place/talk/0/Bool/False' > False
$ echo '#http://sigil.place/talk/0/List' > List
$ echo '#http://sigil.place/talk/0/List/Cons' > Cons
$ echo '#http://sigil.place/talk/0/Bool/and' > and
$
$ # This optimizes `\bs -> and (False:bs)` to `\bs -> False`
$
$ echo '\(bs : #List #Bool ) -> #and (#Cons #Bool #False bs)' | morte
∀(bs : ∀(List : *) → ∀(Cons : ∀(head : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) → ∀(tail : List) → List) → ∀(Nil : List) → List) → ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool
λ(bs : ∀(List : *) → ∀(Cons : ∀(head : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) → ∀(tail : List) → List) → ∀(Nil : List) → List) → λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False
However, there are some things that morte
will not optimize away at compile time. For example, morte
will optimize True && b
to b
, but morte
will not optimize b && True
to b. The latter optimization might work in the future, but it does not, yet.
Also, remember that hugely complex implementation of (+)
above? Despite the complexity, morte
and annah
can still perform optimizations like this:
$ echo "#http://sigil.place/talk/0/Bin/(+) 0" | annah | morte
∀(b2 : ∀(Bin : *) → ∀(Zero : Bin) → ∀(One : (∀(Bin_ : *) → ∀(Nil_ : Bin_) → ∀(Zero_ : Bin_ → Bin_) → ∀(One_ : Bin_ → Bin_) → Bin_) → Bin) → Bin) → ∀(Bin : *) → ∀(Zero : Bin) → ∀(One : (∀(Bin_ : *) → ∀(Nil_ : Bin_) → ∀(Zero_ : Bin_ → Bin_) → ∀(One_ : Bin_ → Bin_) → Bin_) → Bin) → Bin
λ(b2 : ∀(Bin : *) → ∀(Zero : Bin) → ∀(One : (∀(Bin_ : *) → ∀(Nil_ : Bin_) → ∀(Zero_ : Bin_ → Bin_) → ∀(One_ : Bin_ → Bin_) → Bin_) → Bin) → Bin) → b2
In other words, morte
detected at compile time that \n -> 0 + n
is the same thing as \n -> n
.
I think the easiest way to answer your general concern about recursion is if you give me a simple problem that you think is difficult to encode in a non-recursive language and I actually write up a solution for it in morte
. A lot of people think that they need unbounded recursion when they actually don't.
One of the goals of morte
was very similar to the goal of IPFS, to have a first-class way of storing computations. For example, combining annah
and morte
, I can partially apply the (+)
function to its first argument and store that as a computation stored within the inc
file:
$ echo "#http://sigil.place/talk/0/Bin/(+) 1" | annah | morte > inc
... and then use that within a future computation:
$ echo "#inc 3" | annah | morte
∀(Bin : *) → ∀(Zero : Bin) → ∀(One : (∀(Bin_ : *) → ∀(Nil_ : Bin_) → ∀(Zero_ : Bin_ → Bin_) → ∀(One_ : Bin_ → Bin_) → Bin_) → Bin) → Bin
λ(Bin : *) → λ(Zero : Bin) → λ(One : (∀(Bin_ : *) → ∀(Nil_ : Bin_) → ∀(Zero_ : Bin_ → Bin_) → ∀(One_ : Bin_ → Bin_) → Bin_) → Bin) → One (λ(Bin_ : *) → λ(Nil_ : Bin_) → λ(Zero_ : Bin_ → Bin_) → λ(One_ : Bin_ → Bin_) → Zero_ (Zero_ Nil_))
That result is how the number 4 is encoded in morte
using a binary numeral system.
@Gabriel439 Thanks for your detailed reply :) . I was hoping @jbenet would have responded by now, but he seems to be busy with things at the moment.
So, in the meantime, let's discuss getting a proof of concept running. The easiest way of doing this would probably be to get morte to be able to resolve ipfs addresses (#/ipfs/Qm.../Prelude/Bool/False
), which could simply be translated to (public) gateway URLs for a start. We can then write a small script to be able to execute computable data stored on ipfs like
$ ipfs-morte /ipfs/...
Compiling...
Result: ...
I can do this without making any changes to the morte
library. The part of the AST that holds remote references is polymorphic, so it can actually hold anything, including ipfs
addresses.
The only thing I'm missing is how to translate ipfs
references to gateway URLs. Is that documented somewhere?
@Gabriel439 You just prefix the paths with the address of the gateway server, for example:
https://ipfs.io/ipfs/Qm.../Prelude/Bool/False
ipfs daemon
is running on your machine): http://localhost:8080/ipfs/Qm.../Prelude/Bool/False
See https://ipfs.io/docs/getting-started/ for more information.
Sorry for the slow turnaround, but I finally wrote up a simple morte-ipfs
like you described:
https://github.com/Gabriel439/Haskell-Morte-Library/blob/ipfs/exec/Ipfs.hs
I was able to upload a Morte expression to a local IPFS daemon (temporarily hardcoding the gateway to localhost
), but when I tried to connect to the daemon using morte-ipfs
the TLS handshake failed with:
morte-ipfs: TlsExceptionHostPort (HandshakeFailed (Error_Packet_Parsing "Failed reading: invalid header type: 72\nFrom:\theader\n\n")) "127.0.0.1" 8080
The error message indicates that the server sent a TLS packet with an invalid contents type (72
, which is not a registered content type). So this is either a TLS protocol parsing error by the ipfs
server or the tls
library. I'm not sure which yet.
@Gabriel439
Alright, I got it working just by using http
instead of https
so that it didn't select TLS. I tried port 5001 for the API route but that resulted in a 403 forbidden error, but the public API on port 8080 worked for me.
What's the difference between the public and private interfaces?
the API route but that resulted in a 403 forbidden error, but the public API on port 8080 worked for me.
is the api client setting origin/referrer headers?
cc @diasdavid @dignifiedquire @davidar thoughts? this is a Haskell api client
What's the difference between the public and private interfaces?
- Public is meant to be exposed, ReadOnly mostly (writable part is opt-in and restricted). for example, https://ipfs.io runs over the global one we run.
- Private gives full access, and restricts the
/ipfs/...
http gateway part to routes explicitly allowed (to prevent exploits).
Alright, I added command-line options to parametrize the morte-ipfs
tool on the hostname and port, and here's an example of successfully building a composite expression and compiling it using morte-ipfs
:
$ # Uploading `(&&)`
$ echo "λ(x : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) → λ(y : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) → x (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) y (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)" | ./ipfs add
added QmaHXo2mYsdmDf2V35QHVTuUCAjKzwbtd419BWnbmCkUJB QmaHXo2mYsdmDf2V35QHVTuUCAjKzwbtd419BWnbmCkUJB
$ # Uploading `True`
$ echo "λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True" | ./ipfs add
added QmTBfdpJx6MLyxadZLRhG7LYcrmjxgCtWJGef4VxG1mo8v QmTBfdpJx6MLyxadZLRhG7LYcrmjxgCtWJGef4VxG1mo8v
$ # Uploading `False`
$ echo "λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False" | ./ipfs add
added QmTvtqwNJzNGWUVMnJJErYTKWPjhsVgeN2g9vkmtXu3R8a QmTvtqwNJzNGWUVMnJJErYTKWPjhsVgeN2g9vkmtXu3R8a
$ # Uploading "dynamically linked" `(&&) True False` expression
$ echo "
> #http://ipfs.io/ipfs/QmaHXo2mYsdmDf2V35QHVTuUCAjKzwbtd419BWnbmCkUJB
> #http://ipfs.io/ipfs/QmTBfdpJx6MLyxadZLRhG7LYcrmjxgCtWJGef4VxG1mo8v
> #http://ipfs.io/ipfs/QmTvtqwNJzNGWUVMnJJErYTKWPjhsVgeN2g9vkmtXu3R8a
> " | ./ipfs add
added QmR9XR1F15LQnqB12pcimzySkX9fWBp8Xc43RsFCoopbeZ QmR9XR1F15LQnqB12pcimzySkX9fWBp8Xc43RsFCoopbeZ
$ # Retrieve and compute `(&&) True False` from the public `ipfs.io` server
$ morte-ipfs /ipfs/QmR9XR1F15LQnqB12pcimzySkX9fWBp8Xc43RsFCoopbeZ
∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool
λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False
The morte-ipfs
command defaults to the ipfs.io
host and port 80
, but you can configure either of them if you want to point to localhost
or port 8080
.
It also just occurred to me that morte-ipfs
doesn't buy you that much. You can technically do this already with a little extra verbosity using the vanilla morte
executable:
$ echo "#http://ipfs.io/ipfs/QmR9XR1F15LQnqB12pcimzySkX9fWBp8Xc43RsFCoopbeZ" | morte
∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool
λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False
@Gabriel439 this is fantastic!! and very exciting!
I think the morte
executable by itself already is probably good enough for read-only interaction once I figured out how ipfs
addressing works.
For the write path I just need to know what APIs are available for uploading to ipfs
. I can always fall back on the command line API if I need to but I wanted to know what language bindings were available.
I also would like to know how to do web-based integration. That would be the more ideal API since it doesn't require any installation from end users.
Hooray!
@Gabriel439 yes, I agree this doesn't buy us much yet, but the next step is to get it interacting directly with the local daemon (or ipfs.js eventually) instead of going via the http gateway. My (currently half-finished) https://github.com/davidar/hs-ipfs-api might help here
So if you need a Javascript binding to morte
one approach is to compile morte
's Haskell API via ghc-js
, although I'm not sure how easy it is to invoke ghc-js
code from other Javascript code since I've never tried it.
If that doesn't work, an alternative approach is just to port the morte
code to Purescript which has cleaner Javascript integration. morte
's actually a pretty small library (~ 600 for the core logic + an ABNF spec for the grammar) so it should be pretty straightforward to port (especially considering how similar Haskell and Purescript are).
The markup editor @jbenet linked earlier uses ghcjs
Oh sweet, then this should be much easier than I thought!
You can technically do this already with a little extra verbosity using the vanilla morte executable:
yay unixy tools :)
cc @cleichner
(Tangential topic. I may break this out into own discussion.)
It would be really useful to make it easier to write ubiquitous distributed authenticated protocols (via authenticated data structures and say, js). We could do this with haskell libraries and/or on top of morte, and ghcjs. Though we could also see about doing a similar thing with OCaml, to use lambda-auth. A quick googling shows there's several ways of compiling OCaml to JS, but i've yet to look deep enough to understand the relevant security guarantees/constraints. (Or even OCaml -> Morte, if that even makes any sense?)
@amiller are you interested in this? (take a look at what @Gabriel439 has achieved with Morte).
cc @joeyh
I am rereading this thread and Morte seem like an awesome language for the job but until we try it we won't know if it is the right tool for a transformations in IPFS.
Example use cases that I can see IPFS transformation engine to perform:
There are more of them but those are once that I heard people rise interest in them.
I still don't fully understand limitations of Morte but it seems that it shouldn't be a problem and I am just making sure it is so.
Also how hard would it be to re-implement Morte's execution environment in other language (Go, JS)?
morte
itself cannot perform high-efficiency operations but you can embed a syntax tree with placeholders for high-efficiency primitive operations and literals. As a very simple example, you could do:
λ(Int : *)
→ λ(plus : Int → Int → Int)
→ λ(literal0 : Int)
→ λ(literal1 : Int)
→ plus literal0 literal1
If you do that then you're mostly using the language as a code distribution mechanism and not as an execution engine
From comments by @Gabriel439 on ipfs/ipfs#47
I'm creating this issue to help coordinate integrating
morte
with IPFS. Personally I think a language likemorte
would be a great fit for embedding computable data in IPFS.@Gabriel439 Playing devil's advocate, I have a few questions: How does this improve on distributing regular (Safe)Haskell or pure Lisp source code (or BLC) over the internet? What limitations does the lack of recursion have in practice, and is this outweighed by the guarantees it allows you to make about the code? What suggestions would you make in terms of being able to use code that requires unbounded recursion?
CC: @jbenet @ion1