alexec / fn

Apache License 2.0

Distributed functional language #1

Open alexec opened 2 years ago

alexec commented 2 years ago

An idea. A purely functional language that is distributed for both build and execution. There are only two concepts:

  • types which is basically JavaScript types.
  • functions which are functions.

Both types and functions are universally reference-able. I.e. they are a URL to a file. This means each file only contains one item, a function or a type. To run a program, I just need the URL. The runtime downloads the program and runs it.

We’re talking dependencies to the level of files.

Naturally, there need to be some “core” functions. E.g. println

l0st3d commented 2 years ago

> An idea. A purely functional language that is distributed for both build and execution. There are only two concepts:
>
> • types which is basically JavaScript types.

You need more types than JS has. JS doesn't have integers, for example. Also, I think it's preferable to distinguish between names and values by type. So don't default to strings for the keys in a map; provide a different type that's basically a pair of interned strings (namespace and name). Also, to make a sane functional programming environment you need to provide immutable values and collections. Really what you need is data with an algebraic model for combination, to get around the expression problem.
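As a rough illustration of the "names are not strings" point, here is a sketch in Python of a name type built from two interned strings, used as keys in a read-only map. The `Keyword` class, its `:namespace/name` rendering, and the sample record are all hypothetical choices, not part of any proposed design.

```python
import sys
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class Keyword:
    """A name: a (namespace, name) pair of interned strings, distinct from str."""
    namespace: str
    name: str

    def __post_init__(self):
        # Intern both parts so equal names share storage.
        # object.__setattr__ is needed because the dataclass is frozen.
        object.__setattr__(self, "namespace", sys.intern(self.namespace))
        object.__setattr__(self, "name", sys.intern(self.name))

    def __str__(self) -> str:
        return f":{self.namespace}/{self.name}"


USER_NAME = Keyword("user", "name")
USER_AGE = Keyword("user", "age")

# MappingProxyType gives a read-only view, standing in for an immutable map.
record = MappingProxyType({USER_NAME: "alice", USER_AGE: 42})
```

Because `Keyword` is frozen it is hashable and compares by value, so `Keyword("user", "name")` looks up the same entry as `USER_NAME`, while the plain string `"user/name"` does not.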

> • functions which are functions.

Do you want to distinguish between pure functions and impure functions statically?

You also need a transaction system. Haskell has the state monad, Clojure has its STM, Erlang has OTP. Then there's Kafka and its model. There's no reason that you couldn't just use the Kafka data model of topics and partition keys, with serialised event-sourced streams, to design systems from the inside out. So have some in-memory/local implementation. Then when you need to scale it, you can deploy it to Kafka/Kinesis/whatever without a rewrite. Or there's Datomic, which has its own transaction model and a rich query language that rivals SQL for expressiveness without the broken transaction model. That's kind of approaching the problem from the other end of the system.
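Here is a small sketch of what the in-memory/local side of that could look like: an append-only log addressed by topic and partition key, with state rebuilt by folding over events. It only borrows the topic/partition-key shape; the `InMemoryLog` class, its methods, the hashing scheme, and the account example are assumptions, not Kafka's API.

```python
import hashlib
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple


class InMemoryLog:
    """Append-only event streams addressed by (topic, partition)."""

    def __init__(self, partitions: int = 4):
        self.partitions = partitions
        self._streams: Dict[Tuple[str, int], List[Tuple[str, Any]]] = defaultdict(list)

    def _partition_for(self, key: str) -> int:
        # Stable hash, so the same key always lands in the same partition.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % self.partitions

    def append(self, topic: str, key: str, event: Any) -> int:
        """Append an event and return its offset within the partition."""
        stream = self._streams[(topic, self._partition_for(key))]
        stream.append((key, event))
        return len(stream) - 1

    def replay(self, topic: str, key: str) -> List[Any]:
        """Return all events for `key`, in the order they were appended."""
        stream = self._streams[(topic, self._partition_for(key))]
        return [event for k, event in stream if k == key]


def fold(events: List[Any], step: Callable[[Any, Any], Any], initial: Any) -> Any:
    """Rebuild state by applying a pure `step` function to each event in order."""
    state = initial
    for event in events:
        state = step(state, event)
    return state


log = InMemoryLog()
log.append("accounts", "alice", {"type": "deposit", "amount": 100})
log.append("accounts", "bob", {"type": "deposit", "amount": 5})
log.append("accounts", "alice", {"type": "withdraw", "amount": 30})


def apply_event(balance: int, event: dict) -> int:
    sign = 1 if event["type"] == "deposit" else -1
    return balance + sign * event["amount"]


alice_balance = fold(log.replay("accounts", "alice"), apply_event, 0)
```

The application code only sees `append`, `replay`, and a pure `step` function, so in principle the storage behind them could be swapped for a real broker later.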

> Both types and functions are universally reference-able. I.e. they are a URL to a file. This means each file only contains one item, a function or a type. To run a program, I just need the URL. The runtime downloads the program and runs it.
>
> We’re talking dependencies to the level of files.
>
> Naturally, there need to be some “core” functions. E.g. println

I think files are the wrong level of abstraction to work with. Files don't always exist. They're really streams of bytes with a leaky abstraction on top. If you want a distributed computing platform, you need to design for the lowest common denominator. Maybe an AWS Lambda or something, or a restricted process that can't access the filesystem or the network and just comes up with a load of code concatenated into its process. Files are useful for storing things in databases like git, but not for tree shaking or dependency tracking. With content-addressable functions, you could dedupe whole codebases automatically.
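The dedupe point falls out of content addressing almost for free. A minimal sketch in Python, where `FunctionStore` and the choice of SHA-256 are assumptions for illustration:

```python
import hashlib
from typing import Dict


class FunctionStore:
    """Stores function source keyed by the hash of its content."""

    def __init__(self) -> None:
        self._blobs: Dict[str, str] = {}

    def put(self, source: str) -> str:
        """Store `source` and return its address; identical content is stored once."""
        address = hashlib.sha256(source.encode("utf-8")).hexdigest()
        self._blobs.setdefault(address, source)
        return address

    def get(self, address: str) -> str:
        return self._blobs[address]

    def __len__(self) -> int:
        return len(self._blobs)


store = FunctionStore()
# The same function "vendored" into two different codebases:
first = store.put("def double(x):\n    return x * 2\n")
second = store.put("def double(x):\n    return x * 2\n")
other = store.put("def triple(x):\n    return x * 3\n")
```

Both copies of `double` resolve to the same address, so the store holds two entries rather than three.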

You could imagine designing a language and tools to facilitate working with this sort of codebase. So maybe the compiler writes directly to the git db? If you compile a function, it's written directly into the .git dir as a commit, and you could have tools that give you a view of that code. So the compiler could strip out local names before storing the algorithm, and you could have tools to extract a view of the code to work with it. So, maybe you could run a command and paste in a stack trace to "checkout" a slice of the code, which you'd see in a single file. Then as you're fixing your bug, you recompile just the functions that you change and the new functions get stored as commits. Everything's immutable. You can rely on git's gc process to clean up after you've merged branches if you care about that. You'd need a way to strip out local naming information and recover it. You'd also need a way to run queries on your codebase like "what other functions depend on this function" and so on.
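Stripping and recovering local names can be sketched with Python's `ast` module (needs Python 3.9+ for `ast.unparse`). This is only an illustration using Python source as the stand-in language; `strip_names`, `restore_names`, and the `v0, v1, ...` naming scheme are assumptions, and the sketch ignores recursion, nested functions, keyword-only parameters, and clashes with global names.

```python
import ast
import hashlib
from typing import Dict, Tuple


class _Rename(ast.NodeTransformer):
    """Rename parameters and variable references according to `mapping`."""

    def __init__(self, mapping: Dict[str, str]):
        self.mapping = mapping

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = self.mapping.get(node.id, node.id)
        return node


def strip_names(source: str) -> Tuple[str, Dict[str, str], str]:
    """Return (canonical source, canonical->original names, content address)."""
    tree = ast.parse(source)
    func = tree.body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected a single top-level function")
    mapping: Dict[str, str] = {}
    for param in func.args.args:            # parameters first, in order
        mapping.setdefault(param.arg, f"v{len(mapping)}")
    for node in ast.walk(func):             # then every assigned local
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            mapping.setdefault(node.id, f"v{len(mapping)}")
    _Rename(mapping).visit(func)
    names = {canonical: original for original, canonical in mapping.items()}
    names["f"] = func.name                  # remember the function's own name too
    func.name = "f"
    canonical_source = ast.unparse(tree)
    address = hashlib.sha256(canonical_source.encode("utf-8")).hexdigest()
    return canonical_source, names, address


def restore_names(canonical_source: str, names: Dict[str, str]) -> str:
    """Re-apply stored names to get a readable view of the function."""
    tree = ast.parse(canonical_source)
    func = tree.body[0]
    func.name = names.get(func.name, func.name)
    _Rename(names).visit(func)
    return ast.unparse(tree)


a = "def add(x, y):\n    total = x + y\n    return total\n"
b = "def plus(left, right):\n    s = left + right\n    return s\n"
canon_a, names_a, addr_a = strip_names(a)
canon_b, names_b, addr_b = strip_names(b)
```

Here `add` and `plus` differ only in naming, so they share one address; the per-author `names` table is what a tool would use to present each developer's view of the same stored algorithm.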

l0st3d commented 2 years ago

"Core" functions can just be resolved like everything else, with the global dependency resolution mechanism.

alexec commented 2 years ago

For types, I just wanted a simple starting set, because having read a few LLVM articles, LLVM is hard. Wanted to keep it simple.

For functions, I’d prefer them without side-effects. Yet, I want to do disk and net I/O. A logging function would, for example, append a line of text to a file. That’s neither idempotent nor stateless.

I think there’s something in uniquely referenceable functions. It’s one of the best parts of Golang: you always know where to find source code, because all dependencies reference Git repos. Content addressable does not “address” that issue.

l0st3d commented 2 years ago

You don't /have/ to have a type system to say that a function is pure. If you provide a restricted lang that doesn't allow mutation, then you know everything that depends on that must be pure, right? So you could mark functions as not pure based on their dependencies, and use things like namespaces or tree-shaking to separate out the pure from the impure ...
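Marking functions as impure based on their dependencies is a small graph computation. A sketch in Python, where the function names and the dependency table are made up for illustration:

```python
from typing import Dict, Set


def impure_functions(deps: Dict[str, Set[str]], impure_primitives: Set[str]) -> Set[str]:
    """Return every function that is, or transitively depends on, an impure primitive."""
    impure = set(impure_primitives)
    changed = True
    while changed:                      # iterate until no new function gets marked
        changed = False
        for fn, callees in deps.items():
            if fn not in impure and callees & impure:
                impure.add(fn)
                changed = True
    return impure


# Hypothetical codebase: each function mapped to the functions it calls.
deps = {
    "add": set(),
    "sum_list": {"add"},
    "log_line": {"append_file"},
    "handler": {"sum_list", "log_line"},
}

impure = impure_functions(deps, {"append_file", "println"})
pure = set(deps) - impure
```

With this table, `log_line` and `handler` come out impure because they reach `append_file`, while `add` and `sum_list` stay pure, all without any type annotations.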

l0st3d commented 2 years ago

git is content addressable ... I think that a content addressable db of functions should always be able to resolve the src
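For reference, this is roughly how git derives a blob's id from its content (SHA-1 over a `blob <size>\0` header plus the bytes), with a dict standing in for the object database. The `object_db` dict and the `store`/`resolve_source` helpers are assumptions for illustration; only the hashing scheme is git's.

```python
import hashlib
from typing import Dict


def git_blob_id(content: bytes) -> str:
    """Compute the id git assigns to a blob with this content (SHA-1 object format)."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical stand-in for an object database: id -> source bytes.
object_db: Dict[str, bytes] = {}


def store(source: bytes) -> str:
    """Store `source` under its content-derived id and return that id."""
    object_id = git_blob_id(source)
    object_db[object_id] = source
    return object_id


def resolve_source(object_id: str) -> bytes:
    """Given only an id, return the source it addresses."""
    return object_db[object_id]


fn_id = store(b"def double(x):\n    return x * 2\n")
```

The id is the reference and the lookup key at once, which is why a content-addressable store of functions can always hand back the source for anything it can name.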