Suggestion for project structure

I apologize in advance for the length of this suggestion! I haven't fully looked into your tool yet, but your project structure looked similar to one I use, so I thought I'd share it here.

Summary

This is a suggestion about project structure. Based on my experience, it seems like a good balance of how to structure virtually any infrastructure-as-code in a VCS.

This model allows me to manage anything from one specific set of infrastructure, to a large mono-repo of many products, regions and cloud providers. I'm sure it's not perfect though, so I'm curious to hear where it can be improved!

Motivation

I spent a lot of time working and re-working on project structure over many different IaC projects, and so far this makes the most sense for everything I have worked on. It's not perfect, but I think the principles behind it make it intuitive and help it resist unnecessary complexity.

Some of the problems I faced while developing this structure:

When I need to "just get something done", I do not want to be forced to traverse 50 directories and read all the code just to figure out how to change a single variable's output somewhere.
My expectations should match reality. If I want to deploy only one thing, it should not break unexpectedly because somebody changed something else.
How do I minimize the sprawl of components across a VCS that can happen in large projects?
How can I keep my code DRY without it becoming annoying?
Can I separate different environments' configuration logically?
Can I dictate how the structure should be used to prevent these problems?
Will anyone other than me understand it?

Guide-level explanation

There are three major directory hierarchies, each with its own philosophy of how they are to be used.

$ tree
.
├── env
├── apps
└── libs

3 directories, 0 files

`env/`

When you run any program, it's running in an environment. It may be your IBM Laptop running Windows 10 on your home network in Northern Virginia, or a Debian virtual server running on an ARM chipset on AWS in China. The same program may run in both places, but how it runs will change based on its environment.

The env/ directory is the "environment" directory, or configuration directory. This is where environment-specific configuration lives - which is to say, any configuration that ever changes between any environment.

In this directory we name a new directory after our environment. This name can be incredibly generic, such as "aws". Or it can include more information, like the account name, and the product family name. But it shouldn't get too specific, because of the next bit of this hierarchy. So, we'll give this first directory a name with where the infrastructure is hosted, the product family name, and an indicator about which general environment this is:

.
├── env
│   └── aws-acmecorp-nonprod
├── apps
└── libs

4 directories, 0 files

Now, this might be enough for a lot of people. But if you think you might end up getting more specific with your configuration, you can deepen the hierarchy. The longer your infrastructure sticks around (and grows), the more likely this will be.

Let's say you want to deploy the same basic infrastructure to two regions: us-east-1 and eu-west-1. Sounds simple enough, let's just make two new directories! But at a certain point you realize that not all AWS resources are region-specific. Some are global (like IAM, or Route53). If I put that infrastructure configuration in one region's directory, I always have to deploy that region just to get the global changes I wanted. So to make it easier to only deploy the global changes, we make a third directory.

$ tree
.
├── env
│   └── aws-acmecorp-nonprod
│       ├── global
│       ├── eu-west-1
│       └── us-east-1
├── apps
└── libs

5 directories, 0 files

Now let's say that over time, your infrastructure is growing. You have a lot of Route53 resources and it takes a while to terraform apply. You'd like to be able to just deploy the Route53 changes and nothing else. But it's annoying to have to use the taint or -target options to Terraform. So we create a few more directories, and these will each end up being its own Terraform root module & remote state. (The modules won't live in these directories, though; more on that later)

$ tree
.
├── env
│   └── aws-acmecorp-nonprod
│       ├── eu-west-1
│       ├── global
│       │   ├── iam
│       │   └── route53
│       └── us-east-1
├── apps
└── libs

9 directories, 0 files

That's our basic configuration hierarchy! Now, what's the philosophy of the env/ tree?

The main rule is: no code, only configuration. If you have some code used to generate, parse, or load configuration, it should not live in the env/ directory. This keeps the scope of what's in this directory tighter. Any code will go in the other two directory hierarchies.

The second rule is, you should not have inter-dependencies on configuration outside of a single path. If you have configuration in env/aws-acmecorp-nonprod/global/iam/, it should not refer to configuration in, say, env/aws-acmecorp-nonprod/eu-west-1/. The reasoning there is pretty obvious: if you change something in one region, you don't want it to accidentally impact something in a different region.

If you really need to refer to configuration somewhere else, remember that you are essentially referring to external state, and that you can't necessarily expect how it will behave. Within the context of Terraform, you would typically use a terraform_remote_state data source to pull outside configuration at run-time. We assume each environment will have its own Terraform state for the basic principle of reliability: if one environment goes down, and the other environment depends on it, we're in trouble!

But does that mean you can't re-use configuration? Not at all; you can inherit configuration from parent directories. For example, maybe you'll always want some specific tags to be applied to all your infrastructure, no matter where it lives. You can put that configuration in a parent directory and refer to it when you run terraform.

$ tree
.
├── apps
├── env
│   └── aws-acmecorp-nonprod
│       ├── eu-west-1
│       ├── global
│       │   ├── iam
│       │   └── route53
│       ├── terraform.tfvars.json
│       └── us-east-1
│           ├── override.tf.json
│           └── terraform.tfvars.json -> ../terraform.tfvars.json
└── libs

9 directories, 2 files

You can actually do this in a couple ways. You can use symbolic links and have Terraform automatically pick up the inherited configuration:

$ pwd
/home/vagrant/foo/env/aws-acmecorp-nonprod/us-east-1
$ terraform plan

Or you can explicitly reference each configuration file:

$ pwd
/home/vagrant/foo/env/aws-acmecorp-nonprod/us-east-1
$ terraform plan -var-file=../terraform.tfvars.json -var-file=override.tf.json

Or you can use a small script or Makefile to generate inherited configuration on the fly. I do not recommend this behavior, though. A bug in your script could cause your configuration to be generated improperly and cause unexpected results. It's much better to use static configuration that can be peer-reviewed & tested and won't accidentally change later.

$ pwd
/home/vagrant/foo/env/aws-acmecorp-nonprod/us-east-1
$ tmp=`mktemp`
$ jq -s '.[0] * .[1]' ../terraform.tfvars.json override.auto.tfvars.json > $tmp
$ terraform plan -var-file=$tmp

Now that we have an understanding of the env/ hierarchy, let's move on.

`apps/`

We all understand the basic principle of an application: it's a collection of code that you execute somewhere. Often, you feed it configuration, and input, and you get output. But the core code of the application doesn't need to change based on where or how you run it. It's a complete, usable tool, that probably still needs to be told what specifically to do with it.

That's what each directory in the apps/ hierarchy basically is. Each subdirectory is an "application": a complete unit of working, executable code.

The philosophy is similar to before:

This directory should be 95% code, and 5% default configuration.
An apps/ directory should never depend on another apps/ directory.

The reasoning here, like before, is to compartmentalize the purpose of this directory to just be "an application". A pre-built, tested, working, individual application with as few external dependencies as possible. This makes it easier to reason about how it works, which makes it easier to maintain, and makes it more reliable.

If you're wondering: "Wait, few external dependencies?", don't worry. You can still load reusable modules from anywhere you want - particularly, from the next directory hierarchy. The main thing is, don't load or call anything directly (e.g. using relative paths) from a different apps/ directory.

In the context of Terraform, an apps/ directory is a root module. It includes your backend, your providers, your variable definitions, and loads modules. You run Terraform in an apps/ root module, passing in configuration at run-time.

$ pwd
/home/vagrant/foo/apps/aws-infra-region
$ terraform plan -var-file=../../env/aws-acmecorp-nonprod/terraform.tfvars.json -var-file=../../env/aws-acmecorp-nonprod/us-east-1/override.tf.json

But isn't that a bit complicated or error-prone to run? Well, it might be, except that we're not going to run it this way in practice. Instead, we create a Makefile in our environment directory.

$ tree
.
├── apps
│   └── aws-infra-region
│       ├── backend.tf
│       ├── modules.tf
│       ├── override.tf.json
│       ├── providers.tf
│       └── variables.tf
├── env
│   └── aws-acmecorp-nonprod
│       ├── eu-west-1
│       ├── global
│       │   ├── iam
│       │   └── route53
│       ├── terraform.tfvars.json
│       └── us-east-1
│           ├── Makefile
│           ├── override.tf.json
│           └── terraform.tfvars.json -> ../terraform.tfvars.json
└── libs
$ cd env/aws-acmecorp-nonprod/us-east-1/
$ pwd
/home/vagrant/foo/env/aws-acmecorp-nonprod/us-east-1
$ make plan
ENV=`pwd` && \
cd ../../../apps/aws-infra-region/ && \
terraform plan \
        -var-file=override.tf.json
        -var-file=$ENV/../terraform.tfvars.json \
        -var-file=$ENV/override.tf.json
terraform plan -var-file=override.tf.json -var-file=/home/vagrant/foo/env/aws-acmecorp-nonprod/us-east-1/../terraform.tfvars.json -var-file=/home/vagrant/foo/env/aws-acmecorp-nonprod/us-east-1/override.tf.json

As you can see, we can now keep our configuration separate from our reusable module, and run a deployment on a specific environment, without needing to remember anything other than the directory and make plan.

You can even add more reliability & simplicity here. Just have your apps/ root module load the AWS account ID and region from variables, and keep the values in your configuration files: aws_account_id in your terraform.tfvar.json file, and aws_region in your override.tf.json file. If for some reason you accidentally call Terraform with AWS credentials for the wrong region, Terraform will die complaining about not being able to access the right region.

And of course, each apps/ sub-directory can have its own testing/ directory with some sample configs and a Makefile.

The next and final hierarchy is simple, but helps us manage the reusable code a bit more.

`libs/`

We're familiar with how applications work: they are compiled with certain instructions specific to them. But if you need to maintain a larger set of functions which aren't specific to this one application, that's where libraries come in. They are reusable sets of code which can be included in applications, but they aren't applications themselves.

The sub-directories in libs/ are the same: reusable, independent sets of code. These should have virtually no default configuration at all and be limited in scope. The application should just include these and then use its own default configuration with them. And note that the libraries can't be executed themselves. Sounds a lot like a Terraform sub-module!

Like in the other hierarchies, you should limit inter-dependencies here as much as possible. An application can load as many libs/ sub-modules as it wants, but if libs/ sub-directories start depending on other libs/, it starts to become more difficult to reason about how things work.

You can also deepen the hierarchy here. If you end up with 20 libs/ Terraform sub-modules, you can put a bunch of them into one sub-directory. It won't make any difference to your apps/ root module. You can also include a testing/ sub-directory for each, to validate your sub-module with some default configuration and another Makefile.

Once all this is implemented, you will have some fairly DRY code that is easy to reason about, easy to maintain, and easy to use.

Everything described above can be extended to whatever kind of Infrastructure-as-Code you need to maintain. For example, you may end up with a Packer configuration and Makefile as another apps/ directory. Or maybe you keep some Ansible roles in libs/ and call them from an ansible apps/ directory.

Are there any other useful directories we can include in our project?

`bin/`

Ah, the old stand-by. Here you can keep the various scripts or wrappers you might need to work within the above structure. Sure, you could make a new directory in apps/ for each one, but let's not get carried away :-) Maybe some scripts to help you run your code in a CI/CD pipeline? Or maybe you'll end up with some sort of generic tool that helps you call commands in your project structure in a certain way...

Drawbacks

There is one big down-side to the libs/ directory. Terraform currently has bugs which prevent it from properly using trees of modules which use relative links to refer to one another. Referring to sub-modules in libs/ using relative paths will work from apps/ root modules, but if a libs/ module tries to refer to yet another module via relative paths, Terraform won't be able to resolve the path. This is due to how Terraform downloads and runs modules in its own .terraform/modules/ directory structure, and does not properly deduce the path based on the original relative path.

Unresolved Questions

As is obvious from Terragrunt and Terraspace, you still need to use the project structure with some kind of "wrapper", or the long paths and command-lines become annoying.

I show how you can use Makefiles with the project structure to deploy changes simply and reliably. But Make eventually becomes complicated and clunky to use this way, so I ended up writing wrappers for apps that I use with this project structure. The wrappers allow you to specify a directory to change to before execution, a series of configuration files to apply one after the other, the ability to load options from yet another json file, etc. I'm pretty sure we've all written one or two of these :-)

boltops-tools / terraspace