MatrixAI / Architect

Programming Language for Type Safe Composition of Distributed Infrastructure
Apache License 2.0

Artifact Specification #8

mokuki082 opened 6 years ago

mokuki082 commented 6 years ago

The artifact specification is intended to represent the backing artifact that implements the protocol specification. For now we have 2 options, and this depends on the substrate that we want to support. Given that we have specified our substrate for now to be 64-bit Intel Linux running NixOS, the 2 artifacts are Docker/OCI Container Images and Nix Archives. This model should be extensible to include other artifact formats (like the Java stuff).

Docker Image Specification

Previously Docker has been using the V1 image spec. By running docker save | tar -x on an ubuntu image, we can see how the V1 spec is structured. (Currently the V1 spec is only used in docker load and docker save, but not in the actual deployment or processing of the image itself.)

There are a few problems:

Now, Docker is aware of the issues above and designed a V2 spec. The V2 spec adopts the OCI image specification and is a lot more sophisticated. Its description is as follows:

Relation to Artifact Spec

Relation to the underlying implementation

Next Step

CMCDragonkai commented 6 years ago

Make sure to talk about reproducibility: Nix vs Docker, the CI/CD system, and staging of the artifacts. Also fixed-output derivations. @mokuki082 Can you add in all the links I sent to you?

CMCDragonkai commented 6 years ago

https://github.com/NixOS/nix/issues/296

CMCDragonkai commented 6 years ago

While investigating the Artifact Specification, we are comparing Docker Artifacts vs Nix Artifacts:

CMCDragonkai commented 6 years ago

The Artifact composes several ideas:

  1. The executable
  2. Runtime dependencies of the executable (ideally build-time dependencies have been stripped)
  3. The initialisation routines

With regards to initialisation routines, there can be complexity when the artifact specifies not a single executable, but a group of executables that are executed under an init process, sort of like supervisord. Think Erlang supervisor processes.

If we take Docker containers as an example, often the final image layer has the injection of the file entrypoint.sh. This file contains the initialisation routine to launch the image.
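
Such an injected initialisation routine could even be generated from Nix. A minimal sketch using nixpkgs' writeScript (the myapp package and its flags are hypothetical):

with import <nixpkgs> {};

writeScript "entrypoint.sh" ''
  #!${stdenv.shell}
  # initialisation routine: prepare state, then hand over to the main process
  mkdir -p /var/lib/myapp
  # exec replaces the shell, so the application becomes PID 1
  exec ${myapp}/bin/myapp --port 8080
''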

What the executable might then do is bind to ports (if it is a networked application), or perform operations on a filesystem path (if it is a batch application).

The V1 spec specifies what ports are available to be exposed by the image. So if the container has MySQL, then the V1 spec specifies that 3306 is exposed. Or if it is a Postgres container, there is metadata that says that 5432 is exposed.

This information is important, because these ports are often fixed ("hardcoded") within the image layer. If this is true, we can't change it at the launch stage.

Preferably these port bindings would not be something that is hardcoded into the container, and we could dynamically inject these parameters at the launching phase. Think of the launching phase as the application of a function: this is the difference between a bound parameter and a hardcoded value. For example see https://docs.docker.com/engine/reference/builder/#expose (the Dockerfile is the build expression for a Docker container).

It seems that most containers do hardcode the port expose. If this is true, port expose is a property of the Artifact, and thus must be understood by the Orchestrator when it wants to map it to an IP address.

Note that port bindings by the internal application tend to be hardcoded by the Docker community simply because they expect that the user of the container will do port mappings.

Implementation-wise: suppose every Automaton is given its own IP address. If Automaton A requires Automaton B, and B exposes port 8080, this means A's address for B must include the port mapping. A shouldn't care exactly what port it is, but the relay system must map A's address to that port. If the port expose is a property of the artifact, this means that while the Orchestrator/Relay can choose to use a random port on A's address, it must eventually map to the specified port that has been hardcoded into the image.

Now what do we actually mean by hardcoding? Most network executables allow you to specify the port as a command line parameter. However, the port parameter may be fixed internally by an entrypoint script, or even by the Dockerfile. @mokuki082 Can you check how the OCI spec considers this, and where the entrypoint/cmd information is stored in the new container specification? If this is not part of the "image", our Artifact Specification will need to address it somehow.

There's an important consideration regarding entrypoint/cmd: https://www.ctl.io/developers/blog/post/dockerfile-entrypoint-vs-cmd/ Basically we should always be using the exec syntax, not the shell syntax.
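
To illustrate the difference, here is how the two forms lower into the OCI image config, written as a plain Nix attribute set (paths and flags are illustrative):

{
  # exec form: ENTRYPOINT ["/bin/myapp", "--port", "8080"]
  # the application itself is PID 1 and receives signals directly
  execForm = { Entrypoint = [ "/bin/myapp" "--port" "8080" ]; };

  # shell form: ENTRYPOINT /bin/myapp --port 8080
  # lowers to a /bin/sh -c wrapper, which becomes PID 1 instead
  shellForm = { Entrypoint = [ "/bin/sh" "-c" "/bin/myapp --port 8080" ]; };
}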

Note that the TCP/IP model is not the same as the OSI model. They are different. For more info see: https://tools.ietf.org/html/rfc3439#section-3

If we are utilising containers without the Dockerfile, then I suspect that the container format will need some metadata that indicates what things are configurable and what things are not. If the internal port is configurable, that would be nice, but we cannot expect this. So we have to work around the lowest common denominator.

CMCDragonkai commented 6 years ago

Consider the problem of whether the Artifact Specification specifies a fixed-output derivation (Nix nomenclature) or the actual instructions on how to build the artifact.

There's a difference between:

build {
  echo "abc" > here
  compileSomeCCode
  writeSomeLinesHere
  storeThisThingInto > $out
}

vs

fetchUrl {
  url = "https://blah.com";
  sha256 = "sdf89sdfu893e4..."
}

Both expressions express reproducibility, or at least immutable content addressing. The first gives you instructions on how to build the artifact. This can be utilised by the CI/CD system to build the artifact for the orchestrator to deploy. In fact the orchestrator just needs the artifact in the build cache; if it doesn't have it, it triggers the build instructions. The other expression is slightly different because the build expression is not self-contained: it relies on something external that is assumed to exist. It's sort of like a magic value.
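
In real Nix the two shapes could look as follows; this is a sketch, with a placeholder URL and hash:

with import <nixpkgs> {};

{
  # self-contained build instructions: the CI/CD system can always
  # re-execute this recipe to regenerate the artifact
  fromInstructions = runCommand "example-artifact" {} ''
    echo "abc" > $out
  '';

  # fixed-output derivation: points at external content, pinned by hash
  fromFixedOutput = fetchurl {
    url = "https://blah.com/artifact.tar.gz";
    sha256 = "0000000000000000000000000000000000000000000000000000";
  };
}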

This assumes the upstream sources are reliable (whether by using IPFS or otherwise).

What does this mean for the Artifact Specification? Should the Artifact Specification have an actual build expression encoded in it? Or should it just point to the fixed-output derivation hash of some external build system, like Docker or Nix?

We can be flexible here, as long as it meets the abstract requirements of "reproducible builds" (https://reproducible-builds.org/). The Artifact Specification can either embed an entire build expression, or it can point to an external build file. But we have to be careful: pointing to an external file is a source of mutability. What if that file got changed?

We can operate similarly to Nix here, where the evaluation of this source considers the contents of the file at that point, and saves the output with the hash of all the inputs. If that file changes, then the nix expression will produce a different output hash. But that is problematic, because Automatons may be composed with other Automatons, and Automatons are all addressable via content hashes. This means that as soon as that file changes, the composing expressions will no longer work (if they were using hashes). So... instead, perhaps anything relying on external files must use a fixed-output derivation, just like how nix refers to external sources via a fetchurl expression and an output hash.

A = Automaton {
  artifact = Artifact {
    build = DockerFile './Dockerfile'
  }
}

B = Automaton {
  artifact = Artifact {
    image = DockerImage 'abc123...'
  }
}

-- we can imagine DockerFile as a sort of constructor that constructs a value of type ArtifactBuild
-- anything that satisfies ArtifactBuild would be some sort of instructions that are interpretable by the CI/CD system
-- alternatively an image must be something interpretable by the CI/CD system as well, but we do not expect it to represent instructions to execute
-- either way the result is that Automatons are reproducible

The problem comes with using DockerFile. How can we ensure this maintains compatibility with content addressing? @kneedler How do we deal with the fact that the underlying Dockerfile could be changed, which might induce a change on Automaton A?

We also have to be concerned with Dockerfiles that are not reproducible, because they may have aspects that are impure, like network downloads. This is the same problem the nixpkgs community faces when they are given packages whose build scripts perform impure operations or rely on impure properties. Usually this means the nixpkgs community rewrites the build expressions or edits them to make them pure. Can we detect impurities statically? Not in a guaranteed manner. But we can again rely on Nix's way of doing it: builds run in a pure "container", and if the builds fail in the CI system (Hydra), the package is rejected by the Nix community. See nix-build.

mokuki082 commented 6 years ago

I think that the configuration that an automaton depends on (e.g. a Dockerfile) should not affect the addressing of the automaton, but the output of that configuration should. In the end, all we care about is the output of the configuration, rather than the configuration itself. For example, just because we added a newline in a Dockerfile shouldn't require us to change the automaton's address, and even all the other automatons that interact with/depend on this automaton.

In the case of the Dockerfile, we might want to compare the produced OCI image manifest rather than the Dockerfile itself, and in the case of NixOS, we would care about the hash of the actual package being installed in the system rather than the build script itself.

CMCDragonkai commented 6 years ago

If we imagine that A = Automaton... is actually a hash of the declarations, and that there is a separate hash for the artifact output, it is conceivable that 2 different declaration hashes can have the same output hash. This is because one might change the expression while the resulting artifact stays the same. This could happen intentionally, such as through further protocol specification. But it can also happen unintentionally if someone were to add an extra empty line to the build expressions (if the build expressions were an opaque shell script). Is this design still feasible?

CMCDragonkai commented 6 years ago

One way to deal with this hash and change problem is to have a composition of hashes. Even if the Automaton hash changes, if the Artifact hash doesn't change, then we can reuse the built artifact, rather than rebuilding speculatively.

CMCDragonkai commented 6 years ago

Perhaps these should have hashes as well, since ./Dockerfile is just a local path compared to https://someremotefile, and fundamentally both are equally impure. Hence if remote resources need to be hashed, surely local resources need to have a hash as well.

A = Automaton {
  artifact = Artifact {
    build = DockerFile './Dockerfile' 'sha256:...'
  }
}
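
Newer Nix versions can express exactly this pinning of a local path by content hash via builtins.path (the hash here is a placeholder):

builtins.path {
  path = ./Dockerfile;
  # evaluation fails if the file's content no longer matches this hash
  sha256 = "0000000000000000000000000000000000000000000000000000";
}
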
CMCDragonkai commented 6 years ago

Just had an idea today about this. Certain artifacts may require certain hardware properties in order to run. This is also a constraint on the deployment, and on which node can supply the resources required. For example, an Automaton may require access to the GPU. These properties may not be recorded in the Artifact image, but instead recorded in the Artifact specification, just like the Image Index in the OCI spec. The Image Index lists multiple artifacts, one for each of several CPU architectures. We could do something similar for GPU architectures and other hardware properties. I'd prefer something that would be a composable list of constraints, rather than magic strings like "x86 with CUDA GPU... etc"; this may be mapped to Node tests or however we are supplying the Nodes.

mokuki082 commented 6 years ago

So far we've had some ideas on how the artifact component could look. The two main languages that we are considering supporting are Dockerfiles and Nix expressions.

Dockerfile

A = Automaton {
  artifact = Artifact {
    build = Dockerfile <filepath> <content-hash>
  }
}

Problems:

Nix Expression

Nix has a special API dockerTools which allows creation of Docker images. Of course we will probably be looking into writing our own API because dockerTools is quite limited and unstable, but it is a nice entrypoint for me since I haven't got much experience in writing Nix expressions. I'll investigate this in more detail.
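
For reference, a minimal dockerTools sketch looks something like this (the package choice and port are illustrative):

with import <nixpkgs> {};

dockerTools.buildImage {
  name = "redis-example";
  tag = "latest";
  contents = [ redis ];
  config = {
    # exec-form command, per the entrypoint/cmd discussion above
    Cmd = [ "${redis}/bin/redis-server" ];
    # expose metadata is baked into the image config, as discussed earlier
    ExposedPorts = { "6379/tcp" = {}; };
  };
}

Building this with nix-build and piping the result into docker load should yield a loadable image.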

CMCDragonkai commented 6 years ago

We can do more than just nix expressions that create a container. It may eventually be possible to use any old nix expression. According to the image spec, a container is just a series of filesystem layers unioned together. It is not necessary for that to be the basis of an Automaton; it's just fashionable right now. Using nix expressions opens up other avenues, such as ISOs, VMs, unikernels, plain executables... etc. But yes, for now we shall focus on Docker/OCI containers.

mokuki082 commented 6 years ago

With regards to using nix expressions for the artifact spec: when nix-env builds a nix expression, the nix expression is recursively translated into a store derivation. Store derivations contain only sources from the nix store. This means that every component in the store derivation is content hashed, hence we can address the store derivation by the hash of its content, which is enough to uniquely identify the context of this derivation.

If we want to address the nix expression used to generate an artifact, we could hash the store derivation generated from the nix expression.
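
For example, every derivation value in Nix exposes its store derivation path as the drvPath attribute, which is exactly this input-addressed identifier:

with import <nixpkgs> {};

# evaluates to something like /nix/store/<hash>-example.drv
(writeText "example" "hello world").drvPath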

CMCDragonkai commented 6 years ago

We may need to generate nix expressions, such as converting Architect constructs to Nix constructs, and doing other data structure or on-disk nix expression manipulation.

To do this, we can look at how dhall-nix works (https://hackage.haskell.org/package/dhall-nix). Notice it depends on hnix (https://hackage.haskell.org/package/hnix), which calls itself a Haskell implementation of Nix.

Alternatively consider: https://hackage.haskell.org/package/language-nix

Also can you try building a docker/OCI image directly using just nix expressions?

What should we use as our execution language? I'm guessing a scripting language, as that's usually what is used. But I'm wondering if our Architect specification can have a cross-language quasiquoter, and just allow direct specification of Nix or whatever build language is deterministic.

CMCDragonkai commented 6 years ago

As a clarification, I'm referring to grammar composition (this is often a useful thing when having embedded external DSLs). See: https://github.com/atom-haskell/language-haskell/issues/88 for usage in an IDE context. Also: http://lambda-the-ultimate.org/node/4489

It would be nice to be able to even refer to a nix expression file written outside and deal with that somehow. But I still need confirmation on how to make the entire thing content addressed when the very fact of importing an external file is already non-deterministic (as it involves IO).

I'll need to investigate more about how the GHC system does quasiquotation, and maybe we can lift some features out of it.

mokuki082 commented 6 years ago

Just to clarify what I meant with an example: if we write a nix expression that imports sources such as "./builder.sh" or "http://source.html", this is not deterministic, because the content of "./builder.sh" could change the next time you run the same nix expression, which means the expected output could change even though the nix expression remains the same. Therefore the hash of a nix expression is not enough to be used as a reference to a particular derivation.

What we can do is recursively generate a store derivation for all sources and dependencies used in the nix expression. This produces a content hash of all the sources mentioned in a derivation, in a format like /nix/store/hashhashhash...hash.drv, and replaces each mentioned source with its store path in the top-level nix expression. Finally, after all sources are replaced by store derivation paths, we calculate the store derivation path of the top-level derivation; the produced store derivation is the resulting identifier for this derivation. This way the derivation is content addressed and reproducibility is ensured.
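
This is in fact what plain Nix already does for local paths: interpolating ./builder.sh into a derivation copies it into the store under its content hash, and the derivation only ever references the store path. A sketch using runCommand:

with import <nixpkgs> {};

# ./builder.sh is copied to /nix/store/<content-hash>-builder.sh at
# evaluation time; the build references only that store path
runCommand "example" {} ''
  sh ${./builder.sh} > $out
''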

CMCDragonkai commented 6 years ago

The new nix 2.0 adds some new features and mentions this interesting fact:

Pure evaluation mode. This is a variant of the existing restricted evaluation mode. In pure mode, the Nix evaluator forbids access to anything that could cause different evaluations of the same command line arguments to produce a different result. This includes builtin functions such as builtins.getEnv, but more importantly, all filesystem or network access unless a content hash or commit hash is specified. For example, calls to builtins.fetchGit are only allowed if a rev attribute is specified.

The goal of this feature is to enable true reproducibility and traceability of builds (including NixOS system configurations) at the evaluation level. For example, in the future, nixos-rebuild might build configurations from a Nix expression in a Git repository in pure mode. That expression might fetch other repositories such as Nixpkgs via builtins.fetchGit. The commit hash of the top-level repository then uniquely identifies a running system, and, in conjunction with that repository, allows it to be reproduced or modified.

Seems that there are different evaluation modes that can produce certain useful properties.
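
For example, in pure evaluation mode a fetch like the following is permitted only because rev pins the content (the rev here is a placeholder):

builtins.fetchGit {
  url = "https://github.com/NixOS/nixpkgs";
  # the commit hash makes the fetch deterministic, so pure mode allows it
  rev = "0000000000000000000000000000000000000000";
}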

mokuki082 commented 6 years ago

Nix store paths are composed purely from their inputs; they are not derived from the output of the build. This can be demonstrated by the fact that nix store derivations are made before the build process runs.

However, this introduces the possibility of duplicated outputs. In Nix, this is usually not a problem because all its packages are from a centralised repository, i.e. Nixpkgs. If someone writes two nix derivations that produce the same output, they are likely to be told that this package already exists. But for a distributed system with multiple operators, this may become problematic if the operators are not communicating with each other and write different expressions that produce the same output, which causes duplicate data.

mokuki082 commented 6 years ago

After some investigation on how nix generates store paths, here's what I found (in short):

Input Path

  1. For each input file that is external to the Nix store, generate the hash h using the NAR serialisation of the input file.
  2. s := "source:sha256:$h:$store_path:$filename"
  3. inputPath := BASE32(TRUNC(SHA256(s)))

OutPath

Before the final store derivation hash is computed, Nix computes the output path where the final output is going to be built, i.e. the value in the $out variable.

  1. Set the outPath attribute to an empty string.
  2. Take the textual ATerm format of the store derivation up to this point.
  3. Take the SHA256 hash of the format, let it be h.
  4. s := "output:out:sha256:$h:/nix/store:foo"
  5. outPath := BASE32(TRUNC(SHA256(s)))

Fixed-output Derivation

A derivation can take three special attributes: outputHashMode, outputHashAlgo, and outputHash.

If a derivation contains these special attributes, a special s will be calculated as such:

s := "fixed:out:sha256:<outputHash>:"
s := "output:out:sha256:SHA256(s)"

Then we compute the final path just like step 3 in the first section.

outPath := BASE32(TRUNC(SHA256(s)))
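
nixpkgs' fetchurl is essentially a derivation carrying these three attributes. A stripped-down sketch (placeholder URL and hash, and omitting details like certificate handling that the real fetchurl takes care of):

with import <nixpkgs> {};

stdenv.mkDerivation {
  name = "source.tar.gz";
  nativeBuildInputs = [ curl ];
  # fixed-output derivations are granted network access, because the
  # result is verified against outputHash regardless of how it was produced
  buildCommand = ''
    curl -L -o $out "https://blah.com/source.tar.gz"
  '';
  outputHashMode = "flat";
  outputHashAlgo = "sha256";
  outputHash = "0000000000000000000000000000000000000000000000000000";
}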

Final Derivation

-- to be updated.