quchen / amoeba

Amœba is a distributed network.

Amœba

Amœba is a program for setting up a distributed network. The name comes from the hope that eventually, the network will be so robust that you can poke any of its parts without endangering its overall integrity.

This is what a small network looks like. It consists of 359 nodes, each trying to maintain 5-10 incoming and outgoing connections. Initially, three independent clusters were created; the last 30 nodes knew of these clusters and joined them together over the course of a minute. Darker colours indicate a more central role of a node in the network (betweenness centrality). Click for higher resolution.

(Picture missing, uh oh)

The current development stage is unstable alpha.

Branch  | Status
------- | ---------------------
master  | (Travis image broken)
develop | (Travis image broken)

Explanation in simple terms

(This section is for you if you're wondering what the fuss about the above crumbled up piece of strings and dots is.)

I was always superficially fascinated by complex systems and networks, most notably by what is called emergence: the appearance of complex behaviour in systems made up from simple rules. A single ant does not do anything complex, and neither do ten of them. Put a thousand together though and you will discover that, although each is still individually doing the same things as before, together they amount to something much bigger than what you would have expected from the individual: complex structures of air tunnels or fungus farms. Another example is any living organism: even if you understood exactly how every cell worked, you still would have no idea about whether (or why) putting them together in some way can make up the organism that I am right now, typing this paragraph.

Networks also exhibit a lot of emergent properties, and unlike living organisms they are well suited to being simulated and applied by computers. A network in this sense is simply a number of constituents with connections to other constituents. These networks can consist of people (where connections can be "who likes who" or "have met each other at some point"), computers ("connected over the internet", "contains parts made by the same manufacturer"), languages ("what words can be used after others") and many other things.

Amœba is a program that creates a computer network. I came up with the idea around the time of the first Bitcoin boom in 2013; the Torrent network also seemed somewhat interesting to me. So I thought "why not implement a basic version of something like that yourself?" - generously estimating 500 lines of code to get the core done. Months and thousands of lines of code added/removed/edited later, a satisfying first version is still just barely on the horizon, but it's finished enough to be able to play around with it. The "crumbled up piece of strings and dots" above is a snapshot of an Amœba network, a few seconds before I terminated half of it to see whether it would survive that without clustering into many disconnected components. Research has begun! :-)

Planned features

Also see the issues list on GitHub.

Research goals

These goals are subject to certain constraints:

Network description

![(Picture missing, uh oh)](doc/network_schema.png "Network structure of a small system")

The picture shows the network structure of a small Amœba network. Blue arrows are ordinary connections, while red ones stand for local direct connections, used by special network services.

Normal nodes

Special services

A central point in node design is that they reject signals from unregistered origins, so that spamming a single node from outside does not affect the network at all.

However, this is sometimes too restrictive: some services need to be able to issue signals even though they are not part of the network. To solve this problem, a node can be spawned with a special direct communication channel that can be used to send messages to it directly.

Bootstrap server

A bootstrap server is the first contact a node makes after startup, and issues edge requests on behalf of its client.

Drawing server

The drawing server's purpose is to create a map of the network in order to study its structure. It issues a signal that makes every (willing) node of the network send it a list of its downstream neighbours.
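As a rough illustration of how such replies could be combined into a map, here is a minimal sketch. All names in it (`NodeId`, `Report`, `NetworkMap`, `buildMap`, `edges`) are made up for this example and are not taken from the Amœba source; the actual drawing server may work quite differently.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical node identifier; the real project identifies nodes differently.
newtype NodeId = NodeId Int
      deriving (Eq, Ord, Show)

-- | One reply to the drawing server: a node and its downstream neighbours.
type Report = (NodeId, [NodeId])

-- | Network map: each node mapped to the set of its downstream neighbours.
type NetworkMap = Map.Map NodeId (Set.Set NodeId)

-- | Merge all reports into one adjacency map. Neighbours that did not
-- report themselves (unwilling nodes) still appear, with no known edges.
buildMap :: [Report] -> NetworkMap
buildMap reports = Map.unionWith Set.union reported mentioned
  where
    reported  = Map.fromListWith Set.union
                    [ (node, Set.fromList dsns) | (node, dsns) <- reports ]
    mentioned = Map.fromList
                    [ (dsn, Set.empty) | (_, dsns) <- reports, dsn <- dsns ]

-- | Directed edges of the map, e.g. for handing to a graph drawing tool.
edges :: NetworkMap -> [(NodeId, NodeId)]
edges netMap = [ (from, to) | (from, tos) <- Map.toList netMap
                            , to <- Set.toList tos ]
```

Storing neighbours as sets means that duplicate reports from the same node do not produce duplicate edges.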

Known vulnerabilities and immunities

This is a list of known and feasible attacks on the current design:

Documentation

Client structure

The picture below sketches the flow of information in a single Amœba client.

![(Picture missing, uh oh)](doc/information_flow.png "Flow of information in an Amœba client")

The protocol

The protocol type used by Amœba can be found in src/Types/Signal.hs. All signals are sent downstream, with one exception where relevant data actually flows upstream. Unless otherwise noted, the server answers signals with a ServerSignal, which can basically be OK or one of multiple possible errors. A usual request consists of a node sending a signal downstream and waiting for the response, terminating the worker if it is not positive.
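The request cycle described above could be sketched roughly as follows. The type and constructor names here are illustrative assumptions, not the definitions from src/Types/Signal.hs; in particular, the error cases are invented for the example.

```haskell
-- Hypothetical sketch; names are NOT those of src/Types/Signal.hs.

-- | Reply a server sends for a signal: OK, or one of several errors.
data ServerSignal
      = OK
      | Error ServerError
      deriving (Eq, Show)

-- | Illustrative error cases (assumed, not the project's actual list).
data ServerError = Timeout | Denied | Ignored
      deriving (Eq, Show)

-- | What a worker does after receiving the response.
data WorkerAction = Continue | Terminate
      deriving (Eq, Show)

-- | A worker keeps running only on a positive response.
handleResponse :: ServerSignal -> WorkerAction
handleResponse OK        = Continue
handleResponse (Error _) = Terminate

-- | Send a signal downstream and act on the reply. The transport is
-- abstracted as a function so the sketch stays self-contained.
request :: Monad m => (signal -> m ServerSignal) -> signal -> m WorkerAction
request send signal = fmap handleResponse (send signal)
```

Passing the transport in as a function keeps the decision logic pure, so it can be exercised without opening a network connection.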

Signals are divided into two main groups, normal and special. Normal signals are what usual nodes routinely use:

Normal signals are filtered: they are processed only when they come from known upstream nodes. Special signals circumvent this, as some processes inherently require unknown nodes to establish connections.
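The filtering rule can be pictured with a small sketch like the one below. The `Signal` type, its constructors and the `accept` function are assumptions made for illustration; the real signal type carries far more structure.

```haskell
import qualified Data.Set as Set

-- Hypothetical types for illustration; not the project's actual definitions.
newtype NodeId = NodeId Int
      deriving (Eq, Ord, Show)

data Signal payload
      = Normal NodeId payload   -- ^ Carries the upstream origin of the signal
      | Special payload         -- ^ Bypasses the upstream check
      deriving (Eq, Show)

-- | Decide whether a signal should be processed, given the set of
-- registered upstream nodes.
accept :: Set.Set NodeId -> Signal payload -> Bool
accept known (Normal origin _) = origin `Set.member` known
accept _     (Special _)       = True
```

With this shape, a normal signal from an unregistered origin is simply dropped, while special signals are always looked at.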

Terminology, abbreviations

These may help reading the source comments:

Name | Meaning
---: | -----------------------------------------------------------------
_    | Accessor functions that don't do any computation otherwise. When dependencies permit, the lenses generated from these are used.
H    | Handler. Signals or commands are delegated to these for processing.
BSS  | Bootstrap server
DSN  | Downstream node, i.e. a neighbouring node the current one sends commands to. (S, T, U in the picture above.)
LDC  | Local direct connection. Used by the node pool to send signals directly to its nodes instead of taking a detour over the network.
ST1C | Server to one (unspecified/arbitrary) client channel
STC  | Server to client channel
STSC | Server to specific client channel
USN  | Upstream node, i.e. a neighbouring node the current one gets commands sent by. (A, B, C in the picture above.)

Sometimes, I like to use capital letters at the end of identifiers to tag functions with a purpose. This is usually local to a single module. If you see suspicious-looking names like fooX or barL, have a look at the module's head comment.