pkmcfarland / TSTools

Python tools for GPS position time series.

Create a test harness #5

Closed jaredonline closed 5 years ago

jaredonline commented 5 years ago

This is a fairly significant change in a few dimensions. I'll go over each, and why I made it, but the high level is:

  1. Use Docker for our test environment (and eventually other environments)
  2. Change the package structure
  3. Use Make for repeatable processes
  4. Set up a test harness

Use Docker

Docker is a complex beast, but it provides a lot of benefit. At a high level it lets us create reproducible, portable environments for running code. This means I can work on either of my laptops, or in the cloud, or on a Windows machine pretty easily, without having to worry about whether or not I have the right version of Python or Anaconda installed, or whatever other dependencies.

In Docker you create "images", which are static, pre-defined environments running an OS and with some number of packages installed. These images are booted into "containers", which are running instances of an image. Think of the image as the starting state for the container, which is the actual running environment. In the environment you can do computer-things, like create files, run scripts, etc. When you shut down the container, everything you did is erased. If you boot a container from the same image later, you will start back at the pre-defined state of the image, not the state of the container when you shut it down (there are many exceptions to this, but this is the general rule).
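As a sketch of that lifecycle (the image tag here is hypothetical, not one of our actual tags):

```shell
# Build an image from a Dockerfile; "tstools-test" is a made-up tag
docker build -t tstools-test -f docker/test/Dockerfile .

# Boot a container from the image; --rm deletes the container on exit
docker run --rm -it tstools-test bash

# Inside the container you can create files, run scripts, etc.
# Exit, then boot again from the same image: you are back at the
# image's pre-defined state, and the files you created are gone.
docker run --rm -it tstools-test bash
```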

The downside is we have to get used to running in a virtual environment, and keeping our Docker stuff up to date as the project changes. IMO the price is worth the benefit.

To facilitate using Docker, I created two files:

./docker-compose.yml

This file is for working with Docker Compose, which is a tool in the Docker suite. It allows you to coordinate various bits of the Docker ecosystem (images, containers, networking, volumes, etc.) into bundles called services, and it manages creating and running them for you. This way, instead of running 5 or 6 docker commands to launch a container, mount a volume, and run a command, we can just run docker-compose run and it handles the magic.
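For illustration, a minimal compose file for this kind of setup might look something like the following (the service name and mount point are placeholders, not the actual contents of our docker-compose.yml):

```yaml
version: "3"
services:
  test:                          # hypothetical service name
    build:
      context: .
      dockerfile: docker/test/Dockerfile
    volumes:
      - .:/app                   # mount the repo into the container
    command: ./shell/python-tests.sh
```

With a service defined like this, docker-compose run test builds the image if needed, boots a container, mounts the repo, and runs the test script in one step.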

I commented this file, so if you have questions let me know and I can explain what its doing in more detail.

./docker/test/Dockerfile

This file hints at some of the directory structure changes I made. Dockerfiles are the files that Docker uses to build images, which is what it uses to boot containers. Dockerfiles need to be named "Dockerfile" or "Dockerfile-". There are two common ways of organizing Dockerfiles: in folders, or with "Dockerfile-" suffixed names. I prefer the former, as it's easier for me to read.
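A bare-bones Dockerfile for a test environment like this might look something like the following (the base image and paths are assumptions for illustration, not the actual file):

```dockerfile
# Start from an Anaconda-flavored base image (hypothetical choice)
FROM continuumio/miniconda3

# Copy the project in and set the working directory
WORKDIR /app
COPY . /app
```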

Change the package structure

When trying to get the test library to correctly import timeSeries, I was running into all sorts of pathing issues (Python didn't have the timeSeries.py file in its pre-defined path, and getting it there was a pain). So I did some reading on what the "conventional" way to structure a Python package / library is for external consumption, and landed on this article.

I pretty much adopted that article's suggestions wholesale, with one major exception. Because our library currently has three major languages in it (Docker, Python, and Bash), I organized the top level of the project by language domain. Then for Docker we have environment as the next level of organization, and for Python we have src and tests.

This gives us a structure like this:

├── Makefile
├── README.md
├── docker
│   └── test
│       └── Dockerfile
├── docker-compose.yml
├── docs
│   └── InfoFlow
│       ├── infoFlow.pdf
│       └── infoFlow.pptx
├── python
│   ├── src
│   │   └── module
│   │       ├── __init__.py
│   │       └── file.py
│   ├── templates
│   │   ├── README.md
│   │   ├── module
│   │   │   ├── __init__.py
│   │   │   └── module_temp.py
│   │   └── program_temp.py
│   └── tests
│       └── module
│           └── file_test.py
└── shell
    └── python-tests.sh

Use Make

This one is pretty simple, but when I find myself having to run a series of commands to do something (like run tests), I pretty quickly reach for Make to get them documented and stored in a repeatable place for myself and others. I find Make to be easier than Bash for this sort of thing because of how easy it is to declare dependencies and chain commands together.
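As a sketch, the relevant target might look something like this (the target and service names are illustrative; the actual Makefile is in the repo root):

```make
# Run the test suite inside the Docker environment
test:
	docker-compose run test
```

Now "run the tests" is just make test, regardless of what machinery sits behind it.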

Set up the test harness

This ties everything else together. Our tests run inside the Docker container, using docker-compose to make sure our images are up to date, create a working environment, run the tests, and then tear down the environment.

Our ./shell/python-tests.sh file initializes the Anaconda environment. This must be done after the container boots due to a few quirks in how Anaconda works (this is an open issue with Anaconda and probably will not be fixed: https://github.com/ContinuumIO/docker-images/issues/133).

Then it simply calls pytest to scan for tests and run them.
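Sketched out, the script does something like the following (the conda setup path is the usual location in Anaconda-based images, and the environment name is a placeholder):

```shell
#!/bin/bash
set -e

# Activate conda at container runtime, not image-build time, because
# of the Anaconda quirk linked above
source /opt/conda/etc/profile.d/conda.sh
conda activate base

# Let pytest discover and run the tests
pytest python/tests
```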

pkmcfarland commented 5 years ago

This all looks great Jared. Good work! Go ahead and merge with master.

pkmcfarland commented 5 years ago

You are definitely going to get a raise next quarter.