This repo represents the work of the Transportation Systems project of Hack Oregon. We are a volunteer project for Open Data.
This repo is intended to be run in a Docker environment.
As you probably know, text files on Linux (where we deploy and where our containers run) by convention have lines ending with LF. However, Windows, where some of us test and develop, uses the convention that lines in a text file end with a CR and an LF.
Git knows what we're using, and it tries to accommodate us by checking out working files with the line endings our platform uses. See this explainer from GitHub on how line endings work with Git: https://help.github.com/articles/dealing-with-line-endings/.
Now, throw in Docker and Docker Compose. Some Dockerfiles call for copying files from your host into the image during the build. If those files came from a Git repo, the default is that their line endings are your host native convention. But inside the image, which is a Linux filesystem, some files must use the Linux line ending convention or they won't work.
So far, we know that Bash and sh scripts will crash if they have Windows line endings, often with mysterious error messages. Also, Python scripts like Django's manage.py, when executed as commands, will crash mysteriously if they have Windows line endings.
One final note: vim has a command :setfileformat. Use either :setfileformat unix if you want the Unix / Linux convention or :setfileformat dos if you want the DOS / Windows convention. Then save the file and edit your .gitattributes file to declare that the file must have that convention on checkout.
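For checking many files at once rather than opening each one in an editor, a small script can help. The following is a minimal sketch, not part of this repo: the function names and the choice of `*.sh` and `*.py` patterns are assumptions made for illustration. It only reports files with Windows line endings; the conversion helper is separate so that rewriting a file stays a deliberate step.

```python
from pathlib import Path


def has_crlf(data: bytes) -> bool:
    """Return True if the bytes contain any Windows (CR LF) line ending."""
    return b"\r\n" in data


def to_lf(data: bytes) -> bytes:
    """Convert Windows (CR LF) line endings to Unix (LF) line endings."""
    return data.replace(b"\r\n", b"\n")


def find_crlf_files(root: str, patterns=("*.sh", "*.py")) -> list:
    """List files under root matching patterns that contain CR LF endings.

    The default patterns are an assumption: shell and Python scripts are
    the file types this README reports as breaking inside the image.
    """
    hits = []
    for pattern in patterns:
        for path in sorted(Path(root).rglob(pattern)):
            if path.is_file() and has_crlf(path.read_bytes()):
                hits.append(path)
    return hits
```

Calling `find_crlf_files(".")` from the repository root lists the affected scripts. Declaring the convention in .gitattributes remains the durable fix, since it applies on every checkout.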
Sometimes, Docker for Windows loses contact with some critical resource and throws ugly messages like this:
ERROR: for transportationsystembackend_db_1 Cannot start service db: driver failed programming external connectivity on endpoint transportationsystembackend_db_1 (f944aeb0244747359af77373b4949561c6e6e1d8ee48fb0bfc8aba98aa32877e): Error starting userland proxy: mkdir /port/tcp:0.0.0.0:5439:tcp:172.18.0.2:5432: input/output error
ERROR: for db Cannot start service db: driver failed programming external connectivity on endpoint transportationsystembackend_db_1 (f944aeb0244747359af77373b4949561c6e6e1d8ee48fb0bfc8aba98aa32877e): Error starting userland proxy: mkdir /port/tcp:0.0.0.0:5439:tcp:172.18.0.2:5432: input/output error
Encountered errors while bringing up the project.
If this happens, you will need to restart Docker. Open the Settings dialog and go to Reset. Select the Restart option (the top one). Wait until the green "Docker is running" light shows up and then go back to your terminal. Everything should then work. This is a known Docker for Windows bug, not something you did wrong.
In order to run this you will want to:
Clone this repository
cd into it
The environment variables that Docker uses and inserts into the images it builds are taken from a file in the root of this repository called .env. Because it contains sensitive information like passwords, it is not checked into version control - you have to create it as follows:
Copy env.sample to .env: cp env.sample .env
Edit .env and change at least POSTGRES_PASSWORD and DJANGO_SECRET_KEY. You should not need to change any of the others during test and development.
Download the database .backup files from Google Drive and place them in ./Backups before doing the Docker build. The build will copy them onto the image and the first "run" in a container will restore them. See Automatic database restores for the details on the restore mechanism.
The .env file and the .backup files have been added to the .gitignore file. Provided you do not rename them or change their locations, they will not be committed to the repo and this project will build and run.
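If you need fresh values for POSTGRES_PASSWORD and DJANGO_SECRET_KEY, one option is to generate them with Python's standard `secrets` module. This is only a sketch: the helper names and the default length are assumptions, and the repo does not prescribe how these values are chosen.

```python
import secrets


def generate_secret(num_bytes: int = 50) -> str:
    """Return a random URL-safe string (letters, digits, '-' and '_')."""
    return secrets.token_urlsafe(num_bytes)


def env_lines(names) -> list:
    """Return NAME=value lines with a freshly generated value per name."""
    return [f"{name}={generate_secret()}" for name in names]


# Print lines that can be pasted over the matching entries in .env.
for line in env_lines(["POSTGRES_PASSWORD", "DJANGO_SECRET_KEY"]):
    print(line)
```

The generated values contain no spaces, quotes, or `=` characters, which keeps them simple to paste into a NAME=value file.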
Confirm you have executable permissions on all the scripts in the ./bin folder: $ chmod +x ./bin/*.sh
Feel free to read each one and assign permissions individually, because it is your computer :stuck_out_tongue_winking_eye: and security is a real thing.
Run the build.sh script to build the project. Since you are going to be running it on the local machine you will want to run: ./bin/build.sh -l
- This command runs a docker-compose build in the background and downloads the images needed for the project to your local machine.
Once this completes, you will want to start up the project. We will use the start.sh script for this, again using the -l flag to run locally: ./bin/start.sh -l
The first time you run this you will see the database restores. You can ignore the error messages. You will also see the api container start up.
Once the first startup completes, kill the container using Cmd+C or Ctrl+C, depending on your OS.
Restart the container using the same start command: ./bin/start.sh -l and both the db and the api will start up.
Open your browser and you will be able to access the Django REST Framework browsable front end at http://localhost:8000/api, the Swagger API schema at http://localhost:8000/schema, and the Django admin login at http://localhost:8000/admin.
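To confirm from a script rather than a browser that the three local addresses above respond, something like the following could be used. It is a sketch under assumptions: the helper names and the 5 second timeout are illustrative, and "reachable" here simply means the request completed without an HTTP error status.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Paths taken from the addresses listed in this README.
ENDPOINT_PATHS = {"api": "/api", "schema": "/schema", "admin": "/admin"}


def endpoint_urls(base_url: str = "http://localhost:8000") -> dict:
    """Map each endpoint name to its full URL under base_url."""
    base = base_url.rstrip("/")
    return {name: base + path for name, path in ENDPOINT_PATHS.items()}


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to url succeeds (redirects are followed)."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False
```

With the containers running, `{name: is_reachable(url) for name, url in endpoint_urls().items()}` gives a quick per-endpoint summary.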
To run tests: run ./bin/build.sh -l followed by ./bin/test.sh -l.
Note that the api container will write some files into your Git repository. They're in .gitignore, so they won't be checked into version control.
While developing the API, the built-in dev server is useful because it allows live reloading and debug messages. In a production environment, however, it is a security risk and inefficient. For that reason, a staging/production environment has been created using the following technologies:
Copy the ./bin/env.staging.sample file to create a .env.staging file in the same directory:
$ cp ./bin/env.staging.sample ./bin/.env.staging
Open the ./bin/.env in your text editor and complete the environment variables.
Download and save the sql file if you have not already.
Run the build.sh
script to build the project for the staging environment: $ ./bin/build.sh -s
Start the project using the staging flag: $ ./bin/start.sh -s
Open your browser and you should be able to access the Django REST Framework browsable front end at http://localhost:8000/api and Swagger at http://localhost:8000/schema
Try going to a nonexistent page and you should see a generic 404 Not Found page instead of the Django debug screen.
So what has changed from the default Django setup for the staging environment? These changes have already been made; they are listed here for informational purposes.
requirements.txt
.env
(In future maybe setup a separate env variable for environment?)
gunicorn crash_data_api.wsgi -c gunicorn_config.py
gunicorn_config.py file to hold gunicorn config, including using gevent worker_class. Currently we are patching psycopg2 and django with gevent/psycogreen in the post_fork worker. Also using 4 workers:
    try:
        # fail 'successfully' if either of these modules aren't installed
        from gevent import monkey
        from psycogreen.gevent import patch_psycopg

        # setting this inside the 'try' ensures that we only
        # activate the gevent worker pool if we have gevent installed
        worker_class = 'gevent'
        workers = 4

        # this ensures forked processes are patched with gevent/gevent-psycopg2
        def do_post_fork(server, worker):
            monkey.patch_all()
            patch_psycopg()

            # you should see this text in your gunicorn logs if it was successful
            worker.log.info("Made Psycopg2 Green")

        post_fork = do_post_fork
    except ImportError:
        pass
    if os.environ.get('DEBUG') == "False":
        DATABASES = {
            'default': {
                'ENGINE': 'django_db_geventpool.backends.postgis',
                'PASSWORD': os.environ.get('POSTGRES_PASSWORD'),
                'NAME': os.environ.get('POSTGRES_NAME'),
                'USER': os.environ.get('POSTGRES_USER'),
                'HOST': os.environ.get('POSTGRES_HOST'),
                'PORT': os.environ.get('POSTGRES_PORT'),
                'CONN_MAX_AGE': 0,
                'OPTIONS': {
                    'MAX_CONNS': 20
                }
            }
        }
Change the DEBUG line (environment variables are always read as strings, so compare against the string "True"):
    DEBUG = os.environ.get('DEBUG') == "True"
Add to MIDDLEWARE right after SecurityMiddleware:
    'whitenoise.middleware.WhiteNoiseMiddleware',
Add these just before STATIC_URL so static files are handled correctly and are compressed:
    STATICFILES_STORAGE = 'whitenoise.storage.CompressedManifestStaticFilesStorage'
    STATIC_ROOT = os.path.join(BASE_DIR, 'staticfiles')
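The reason the DEBUG setting is compared against a string is that every environment variable arrives as text, and any non-empty string is truthy in Python. The short illustration below uses a plain dict in place of os.environ; the function name `debug_enabled` is hypothetical and is not part of the project's settings.

```python
def debug_enabled(environ) -> bool:
    """Mirror the settings line: DEBUG is on only for the exact string 'True'."""
    return environ.get("DEBUG") == "True"


# A naive truthiness check would treat the string "False" as enabled.
assert bool("False") is True

print(debug_enabled({"DEBUG": "True"}))   # True
print(debug_enabled({"DEBUG": "False"}))  # False
print(debug_enabled({}))                  # False (variable not set)
```

One consequence of this design is that debug mode stays off unless DEBUG is set to exactly "True", which is the safer default for staging.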
To develop on the repo:
Create an issue for tracking and communication.
Clone the repo and then create a feature branch.
After branching, confirm you can follow the getting started steps above.
Develop your feature.
Update documentation and the env sample file as necessary.
Commit changes.
Merge the current staging branch into your feature branch to resolve any merge conflicts.
Push your local feature branch to the Hack Oregon repo. Any PRs from forks will be rejected.
Create a Pull Request to the staging branch. No PRs will be accepted to master unless they come from staging and are made by approved reviewers.
The PR should be reviewed by an authorized reviewer, and by another team member if possible, and pass any automated testing requirements.
Resolve any outstanding merge conflicts.
An authorized reviewer will commit to staging.
The process for staging to master will be defined.
The primary function of this API is to act as a read-only wrapper around ODOT's crash data and expose the underlying data to the web via HTTP requests. The secondary function is to eventually expose built-in helper functions that could simplify data pre-processing. This API aims to be RESTful.
The models in this project are unmanaged. Given that a) the API sits upon a legacy database and b) the API is intended to be read-only, the decision was made to decouple Django from database management and leave that solely to the underlying PostgreSQL shell environment. This prevents Django from creating or deleting the underlying data tables, primarily during development. Malicious editing (outside of the dev environment) is less of a concern since that can be handled by secure permissions for users making API calls.
All users can browse the API. Read-only access is the default permission for unauthenticated users.
Testing an unmanaged model requires a few modifications to the test runner. Since migrations don't create any tables, they create a blank test database which results in no test data being found. The fix is outlined in the following post - https://dev.to/patrnk/testing-against-unmanaged-models-in-django
Running a test requires that you have 'django-test-without-migrations' as part of your requirements. The only other point to remember is that tests need to be run with the --no-migrations flag (./manage.py test --no-migrations) to prevent Django from trying to run migrations on your test db.
TBD
TBD
TBD
Three types of filters are currently supported -
Simple text search can be performed on the following fields:
'crash_id', 'crash_hr_short_desc', 'urb_area_short_nm', 'fc_short_desc', 'hwy_compnt_short_desc', 'mlge_typ_short_desc', 'specl_jrsdct_short_desc', 'jrsdct_grp_long_desc', 'st_full_nm', 'isect_st_full_nm', 'rd_char_short_desc', 'isect_typ_short_desc', 'crash_typ_short_desc', 'collis_typ_short_desc', 'rd_cntl_med_desc', 'wthr_cond_short_desc', 'rd_surf_short_desc', 'lgt_cond_short_desc', 'traf_cntl_device_short_desc', 'invstg_agy_short_desc', 'crash_cause_1_short_desc', 'crash_cause_2_short_desc', 'crash_cause_3_short_desc', 'pop_rng_med_desc'
TBD
TBD
To search all of the fields listed above for a partial (not exact) match on the string "DIS-RAG" -
http://localhost:8000/api/crashes/?search=DIS--RAG
The API also supports explicit filter fields as part of URL query strings. The following fields are currently supported -
'ser_no', 'cnty_id', 'alchl_invlv_flg', 'crash_day_no', 'crash_mo_no', 'crash_yr_no', 'crash_hr_no', 'schl_zone_ind', 'wrk_zone_ind', 'drug_invlv_flg', 'crash_speed_invlv_flg', 'crash_hit_run_flg'
Note:
To filter on just the values "00173" and "00174" for the field 'ser_no', repeat the field in the query string -
http://localhost:8000/api/crashes/?ser_no=00173&ser_no=00174
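When building these query strings from a script, Python's standard `urllib.parse.urlencode` handles the repeated field name if the parameters are passed as a list of pairs. This is a minimal sketch: `crashes_url` is a hypothetical helper, and BASE_URL is simply the local development address used in the examples above.

```python
from urllib.parse import urlencode

# Local development address used throughout this README.
BASE_URL = "http://localhost:8000/api/crashes/"


def crashes_url(params) -> str:
    """Build a crashes URL from a list of (name, value) pairs.

    A list of pairs, rather than a dict, lets the same field name
    appear more than once in the query string.
    """
    query = urlencode(params)
    return f"{BASE_URL}?{query}" if query else BASE_URL


# Text search across the searchable fields.
print(crashes_url([("search", "DIS--RAG")]))
# Explicit filter on two values of the same field.
print(crashes_url([("ser_no", "00173"), ("ser_no", "00174")]))
```

The two printed URLs match the search and filter examples shown above.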
Results can be sorted against any field or combinations of fields.
To show results in ascending order of the field 'ser_no':
http://localhost:8000/api/crashes/?ordering=ser_no
In descending order:
http://localhost:8000/api/crashes/?ordering=-ser_no
Multiple fields:
http://localhost:8000/api/crashes/?ordering=-ser_no,rd_cntl_med_desc
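The ordering value is a comma-separated list in which a leading "-" marks a descending field. A small helper can assemble it from (field, descending) pairs; the name `ordering_query` and the pair-based input are assumptions made for this sketch, not part of the API.

```python
from urllib.parse import urlencode


def ordering_query(fields) -> str:
    """Build the ordering query string from (field_name, descending) pairs."""
    terms = ["-" + name if descending else name for name, descending in fields]
    # safe="," keeps the comma separator readable instead of encoding it as %2C.
    return urlencode({"ordering": ",".join(terms)}, safe=",")


print(ordering_query([("ser_no", False)]))
print(ordering_query([("ser_no", True), ("rd_cntl_med_desc", False)]))
```

Appending the result to http://localhost:8000/api/crashes/? reproduces the ascending and multiple-field examples above.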
The API supports Accept header versioning. Version numbers in API requests are optional; if no version is specified in the request header, the latest version is returned by default. Specify versions as numbers, as shown in the header example below -
GET /api/crashes HTTP/1.1
Host: example.com:8000
Accept: application/json; version=1.0
Latest version: 1.0 (as of 02/19/2018)
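From Python, the version can be requested by setting the Accept header on the request. The sketch below uses only the standard library and builds the request without sending it; `versioned_request` is a hypothetical helper, and the URL is the local development address used elsewhere in this README.

```python
from urllib.request import Request


def versioned_request(url: str, version=None) -> Request:
    """Build a GET request whose Accept header optionally pins an API version."""
    accept = "application/json"
    if version is not None:
        accept += f"; version={version}"
    return Request(url, headers={"Accept": accept})


request = versioned_request("http://localhost:8000/api/crashes/", version="1.0")
print(request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would send it. Leaving `version` as None omits the version parameter, in which case the API is described above as returning the latest version.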
We follow the MIT License: https://github.com/hackoregon/transportation-system-backend/blob/staging/LICENSE