ryan-neal / Warren-Buffett


Data handling improvements #20

Closed. ck25 closed 6 years ago

ck25 commented 6 years ago
  1. src/global_settings.py: Contains global variables that may be used throughout the project. This makes the various path variables (e.g. the path to the data directory) easier to access and more readable.

  2. src/data/make_dataset.py: Now performs simple validation of the preexisting scraped data.

  3. src/data/load_reports.py, src/data/mongodb.py: I simplified the model object so that new 'models' can be created easily. To add a new model, define FIELDS and COLLECTION_NAME, then define in data_from_file() how the data is extracted from the source file.

  4. Added some Python path setup troubleshooting notes to the README.
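
To illustrate the model pattern from item 3, here is a minimal sketch of what defining a new model might look like. Only the names FIELDS, COLLECTION_NAME and data_from_file() come from this PR; the `Model` base class, the `records()` helper, the `Report` example, its field names, and the JSON file layout are all assumptions made for illustration and may not match the actual code in src/data/mongodb.py. The sketch deliberately leaves out the MongoDB insert step.

```python
import json
import tempfile
from pathlib import Path


class Model:
    """Hypothetical base class; the real one in src/data/mongodb.py may differ."""

    FIELDS = ()            # names of the fields each record should carry
    COLLECTION_NAME = ""   # name of the target collection

    def data_from_file(self, path):
        """Return a list of raw dicts extracted from the file at `path`."""
        raise NotImplementedError

    def records(self, path):
        """Keep only the declared FIELDS, filling missing ones with None."""
        return [
            {field: raw.get(field) for field in self.FIELDS}
            for raw in self.data_from_file(path)
        ]


class Report(Model):
    """Example model; the field names and JSON layout are made up."""

    FIELDS = ("ticker", "year", "revenue")
    COLLECTION_NAME = "reports"

    def data_from_file(self, path):
        # Assumes the source file is a JSON array of objects.
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)


def demo():
    """Write a small placeholder file and load it through the Report model."""
    sample = [
        {"ticker": "AAA", "year": 2017, "revenue": 100, "extra": "dropped"},
        {"ticker": "BBB", "year": 2017},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "reports.json"
        path.write_text(json.dumps(sample), encoding="utf-8")
        return Report().records(path)
```

Under these assumptions, a new model only needs the two class attributes and one method; the shared logic (here, trimming each record to the declared fields) stays in the base class.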