Here is how things look in the tiq-test data directory right now:
aperture-2:data alexcp$ ls
enriched population raw
aperture-2:data alexcp$ ls raw
public_inbound public_outbound
aperture-2:data alexcp$ ls raw/pu
public_inbound/ public_outbound/
aperture-2:data alexcp$ ls raw/public_inbound/
20140615.csv.gz 20140618.csv.gz 20140622.csv.gz 20140625.csv.gz 20140628.csv.gz 20140701.csv.gz 20140704.csv.gz 20140707.csv.gz 20140710.csv.gz 20140713.csv.gz
20140616.csv.gz 20140619.csv.gz 20140623.csv.gz 20140626.csv.gz 20140629.csv.gz 20140702.csv.gz 20140705.csv.gz 20140708.csv.gz 20140711.csv.gz 20140714.csv.gz
20140617.csv.gz 20140620.csv.gz 20140624.csv.gz 20140627.csv.gz 20140630.csv.gz 20140703.csv.gz 20140706.csv.gz 20140709.csv.gz 20140712.csv.gz 20140715.csv.gz
Basically we have the following structure:
data/[DATATYPE]/[DATAGROUP]/[YYYYMMDD].csv.gz
considering that:
DATATYPE should be either raw or enriched. The names tell you what data structure to expect in the CSVs inside (as described in the README). Disregard the population type; it is not a target for this presentation.
DATAGROUP refers to the group name of the combine output (currently the "inbound" and "outbound" separation). The names can be whatever you like; I am using public_inbound and public_outbound for the presentation data.
YYYYMMDD is the way dates should be represented in the whole world.
Please note the CSVs are gzipped. The code expects that as well.
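To make the layout concrete, here is a minimal Python sketch of how a file in this tree could be located and read. It is not the tiq-test code itself: the helper names (tiq_data_path, list_available_days, read_tiq_rows) are made up for illustration, and the only things taken from the description above are the directory layout, the raw/enriched data types, and the gzipped CSVs.

```python
import csv
import gzip
from datetime import date, datetime
from pathlib import Path

# Data types named in the layout above; "population" is left out on purpose.
VALID_DATATYPES = {"raw", "enriched"}


def tiq_data_path(base_dir, datatype, datagroup, day):
    """Build base_dir/[DATATYPE]/[DATAGROUP]/[YYYYMMDD].csv.gz (hypothetical helper)."""
    if datatype not in VALID_DATATYPES:
        raise ValueError(f"datatype must be one of {sorted(VALID_DATATYPES)}")
    # date objects accept strftime codes in format specs, giving YYYYMMDD.
    return Path(base_dir) / datatype / datagroup / f"{day:%Y%m%d}.csv.gz"


def list_available_days(base_dir, datatype, datagroup):
    """Return the sorted dates that have a file in a given group (hypothetical helper)."""
    group_dir = Path(base_dir) / datatype / datagroup
    days = []
    for path in group_dir.glob("*.csv.gz"):
        stamp = path.name[: -len(".csv.gz")]
        days.append(datetime.strptime(stamp, "%Y%m%d").date())
    return sorted(days)


def read_tiq_rows(path):
    """Read every row of a gzipped CSV as a list of strings (hypothetical helper)."""
    # mode="rt" makes gzip decompress and decode to text in one step.
    with gzip.open(path, mode="rt", newline="") as handle:
        return list(csv.reader(handle))
```

For example, tiq_data_path("data", "raw", "public_inbound", date(2014, 6, 15)) points at data/raw/public_inbound/20140615.csv.gz, the first file in the listing above.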
MOAR work!