The ETL code responsible for filling ActiveData.
Probably not. The majority of the code implements a high-volume, idiosyncratic
data pipeline on top of AWS services, and requires other services to work in
tandem with it. But feel free to pillage activedata_etl/imports or
activedata_etl/transforms for the transformation code.
Many branches are meant as stable versions for each of the processes involved in the ETL. Ideally, they would be unified, but library upgrades can cause unique instability, so a branch is not deployed until (manual) testing has been done.
Here are the important branches:
It is 2016, and Python is still hard on Windows. It would be a nice question for Stack Overflow, but apparently not.
- pip install fabric - There will be errors
- pip install fabric again. This should be successful.

The configuration files, located in resources/settings, often point to a private.json config file outside the repository tree. This file holds the credentials and access info required, and looks something like this:
{
    "email": {
        "host": "smtp.gmail.com",
        "port": 465,
        "username": "",
        "password": "",
        "use_ssl": 1
    },
    "aws_credentials": {
        "aws_access_key_id": "",
        "aws_secret_access_key": "",
        "region": "us-west-2"
    },
    "pulse_user": {
        "user": "",
        "password": ""
    }
}
The exact properties will depend on the resources you are accessing.
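
Because a blank or missing credential tends to surface only as a failure deep inside a run, it can help to check private.json before starting anything. The following is a minimal sketch, not part of this repository: the names `REQUIRED_SECTIONS`, `load_private_settings`, and `find_missing` are hypothetical, and the list of required properties is assumed from the example above, so adjust it to the resources you actually use.

```python
import json

# Hypothetical: properties assumed from the example private.json above.
# Your settings files may need more, fewer, or different ones.
REQUIRED_SECTIONS = {
    "email": ["host", "port", "username", "password"],
    "aws_credentials": ["aws_access_key_id", "aws_secret_access_key", "region"],
    "pulse_user": ["user", "password"],
}


def load_private_settings(path):
    """Read the private JSON config file and return it as a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_missing(settings, required=REQUIRED_SECTIONS):
    """Return the names of sections or properties that are absent or empty."""
    missing = []
    for section, keys in required.items():
        block = settings.get(section)
        if not isinstance(block, dict):
            # The whole section is absent (or is not a JSON object).
            missing.append(section)
            continue
        for key in keys:
            # Treat both a missing key and an empty string as "not filled in".
            if block.get(key) in (None, ""):
                missing.append(f"{section}.{key}")
    return missing
```

Calling `find_missing(load_private_settings("private.json"))` on the template above, with its empty strings left in place, would list every credential that still has to be filled in.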