MinaProtocol / mina

Mina is a cryptocurrency protocol with a constant size blockchain, improving scaling while maintaining decentralization and security.
https://minaprotocol.com
Apache License 2.0
1.99k stars 528 forks source link

Replace all YAML template expansion (and future config) with Dhall (proposal) #4815

Closed bkase closed 2 years ago

bkase commented 4 years ago

A few engineers have been thinking that Dhall makes a lot of sense for O(1) Labs and Coda. This document makes these thoughts explicit and sharable more broadly.

Intro

Configuration languages typically lack any form of abstraction — to make snippets reusable we have to resort to template expansion. And many are poorly specified and have strange edge-cases on top of that (looking at you YAML). At O(1) Labs, we currently rely on YAML and an assortment of template expansion languages: jinja2 for our CI, and modified Go-template expansion for our deployment system. These template expanders have limited expressibility and are hard to test. Due to YAML's whitespace sensitivity, the template-based abstraction tools are very brittle and hard to use.

"Dhall is a programmable configuration language that you can think of as: JSON + functions + types + imports" quote from the Dhall website . For the functional programming enthusiasts, you can think of it as literally "System F-omega with records" and a little syntactic sugar on top. In other words, it's a well-designed programming language rather than syntactic sugar for string replacers (as typical template expansion languages are). Dhall compiles into YAML or JSON (and Nix and Bash and a few other things) in the same way are existing template expanders compile our templated-YAML into YAML.

Benefits of using Dhall

Costs

Tradeoff Analysis

These tradeoffs are typical of the sorts of tradeoffs we tend to make on our engineering team as we value using tools that can give us more safety over those that don't even if it means picking systems that are harder to learn (examples: OCaml, using GADTs in OCaml, etc)

yourbuddyconner commented 4 years ago

Dhall PoC for Infra as Code: https://github.com/CodaProtocol/coda-automation/tree/dhall/dhall

An Example of how you could include Dhall into a Helm Chart: https://github.com/CodaProtocol/coda-automation/pull/364/commits/2060d16f3f9dfc90d48f46cb3038b80197884d6e

nholland94 commented 4 years ago

Using Dhall to generate YAML and JSON will be easier than using OCaml or another statically typed programming language for the same task. One immediate reason is that there is no serialization logic that needs to be done in order to perform this translation between Dhall values and <insert supported format here> values. OCaml has deriving, but often times, the deriving format is not what we want and needs to be customized (see for instance the way that ppx_deriving_yojson handles ADTs, and compare that to how variants are actually handled in the real world with YAML and JSON configs). That already reduces the complexity heavily. Further more, there is no work needed to be done to deserialize these formats either, and the deserialization happens natively into the Dhall type system, so no dynamic checks need to be applied like they would be in OCaml to convert to the actual type.

Another point of why Dhall is a better choice than OCaml for this task is that Dhall is not turing complete. It's lack of turing completeness gives you a number of necessary guarantees with configuration generation that normally need to be enforced in a manual review process when working with turing complete languages.

Finally, Dhall also has a record typing system that easily allows recursive merging of record types in a way that is more familiar to dynamic language operations for merging records. OCaml has not analogy for this, and it would make for a lot of verbose record redefinitions with different scoping of fields. In the end, records need to look the same as what we want to serialize in YAML or JSON, so having the ability to manipulate records like this up front goes a long way in terms of getting the data into the right shape without having to shift values around constanly.

yourbuddyconner commented 4 years ago

In general, I support this idea -- I think Dhall solves the problem of dealing with the inherent complexity of producing yaml configs for various infrastructure things.

It fits nicely into the Helm ecosystem, enabling us to get rid of the annoying go templates while providing a flexible way to establish patterns that make it easy to reason about complex configurations.