Closed JackDI1 closed 8 months ago
I have downloaded the data locally. However, because the data is so large, I don't think we should be storing it on GitHub. Could we potentially store the raw data in S3 and the processed data in a DB, as suggested by Adam?
I tried running this locally. Unfortunately, my laptop crashed 3 times.
The LOB dataset is approximately 6.5GB. Our initial processing only ran on 2 of the files. So that we can carry out the initial EDA, the full dataset needs to be processed.
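One way to get through the full dataset without crashing a laptop might be to stream each file row by row instead of loading it all at once. Below is a rough sketch only: it assumes the LOB files are CSVs with a header row (not confirmed anywhere in this thread), and `iter_rows` and `summarise_files` are hypothetical names, not anything from our repo. It only counts rows per file, but the real processing step could go in the same loop.

```python
import csv
from pathlib import Path
from typing import Iterable, Iterator


def iter_rows(path: Path) -> Iterator[dict]:
    """Yield one row at a time so the whole file is never held in memory."""
    # Assumption: each file is a CSV with a header row.
    with path.open(newline="") as handle:
        yield from csv.DictReader(handle)


def summarise_files(paths: Iterable[Path]) -> dict:
    """Return the row count for each file, reading each one as a stream."""
    summary = {}
    for path in paths:
        # Replace the counting with the real per-row processing as needed.
        summary[path.name] = sum(1 for _ in iter_rows(path))
    return summary
```

Usage would be something like `summarise_files(sorted(Path("data").glob("*.csv")))`, where `data` is a placeholder for wherever the files were downloaded. Memory use should stay roughly flat regardless of file size, since only one row is held at a time.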