Open miquelramirez opened 5 years ago
There are a number of improvements to the data preparation process that will help a lot down the road:
- Change `prepare_data.py` so we use `pandas` for generating the datasets rather than building them manually via `numpy`, which is easier to maintain and to read.
- […] (`gz`)
- `development` option to generate a small sample of the working dataset
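
As a rough illustration of how these items could fit together, here is a minimal sketch. It assumes the datasets are tabular arrays written out as CSV, which the issue does not state; `build_dataset`, `write_dataset`, `sample_size` and the column names are hypothetical and do not come from `prepare_data.py`.

```python
import numpy as np
import pandas as pd


def build_dataset(features, labels, columns):
    """Wrap raw arrays in a labelled DataFrame (hypothetical helper)."""
    frame = pd.DataFrame(features, columns=columns)
    frame["label"] = labels
    return frame


def write_dataset(frame, path, development=False, sample_size=100, seed=0):
    """Write the dataset as a gzip-compressed CSV file.

    With development=True only a small random sample of the rows is
    written, so downstream code can be exercised quickly.
    """
    if development:
        # Never ask for more rows than the frame actually has.
        n = min(sample_size, len(frame))
        frame = frame.sample(n=n, random_state=seed)
    frame.to_csv(path, index=False, compression="gzip")
    return frame
```

A call such as `write_dataset(frame, "train.csv.gz", development=True)` would then produce the small sample, and `pd.read_csv("train.csv.gz")` reads it back, since pandas infers gzip compression from the `.gz` suffix.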