I created src/02_preprocess_data.R which takes in the files downloaded by src/01_download_data.R.
I also made minor changes to script 1 (src/01_download_data.R), which downloads the files:
- Renamed the script so that the scripts are in logical order (src/download_data.R → src/01_download_data.R).
- Replaced write.csv with download.file. Reading each file in as a data frame and then writing it back out as a CSV caused problems with the downloaded files, so I think it is better practice to download the raw file directly.
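
For reviewers, here is a minimal sketch of the idea rather than the exact code in src/01_download_data.R. The helper name download_raw, the example URL, and the destination path are all hypothetical. The point is that download.file saves the file as served, whereas a read-then-write.csv round trip can change it (for example, write.csv adds a row-names column unless row.names = FALSE is set).

```r
# Hypothetical helper (not the project's actual function):
# fetch a file and save it to disk without parsing it.
download_raw <- function(url, dest) {
  # Create the destination folder if it does not exist yet.
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
  # mode = "wb" writes in binary mode, so the bytes are not altered on Windows.
  download.file(url, destfile = dest, mode = "wb", quiet = TRUE)
  invisible(dest)
}

# Example call with a placeholder URL and path (assumptions, not the real ones):
# download_raw("https://example.com/data.csv", "data/raw/data.csv")
```

Because the file is never parsed, any cleaning or type conversion is left to src/02_preprocess_data.R.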
I also added a new folder structure inside the data folder.