Our scraping configurations found in the data folder are a little messy and redundant. This is a refactor, so we are not changing the functionality (how the scraper works).
TODO
[x] Remove companies.csv
[x] Refactor scraping_config.json so that each company is contained within its own JSON file. For example, if an imaginary scraping_config.json had 2 companies within it (Amazon and Walmart), copy all the configuration information for Amazon into a new Amazon.json. Do the same for Walmart (Walmart.json), and remove scraping_config.json when finished.
[x] Modify any scripts/functions that view, edit, or write to JSON configuration files and ensure that their behavior matches the 2 changes detailed above. You can remove unnecessary scripts/functions (e.g. there is one that acts on the companies.csv data, but that is no longer used).
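A minimal sketch of the split step, for illustration only. It assumes scraping_config.json is a single JSON object whose top-level keys are company names and whose values are that company's configuration; the real file may be shaped differently. The function name split_config and its parameters are hypothetical, not existing project code.

```python
import json
from pathlib import Path


def split_config(config_path: Path, out_dir: Path) -> list[Path]:
    """Write one <company>.json file per top-level key of the combined config.

    Assumes the combined file looks like {"Amazon": {...}, "Walmart": {...}}.
    """
    combined = json.loads(config_path.read_text(encoding="utf-8"))
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for company, settings in combined.items():
        target = out_dir / f"{company}.json"
        # Copy the company's settings unchanged so scraper behavior stays the same.
        target.write_text(json.dumps(settings, indent=2), encoding="utf-8")
        written.append(target)
    return written
```

The sketch deliberately leaves scraping_config.json in place; deleting it is safer as a separate step once the per-company files have been checked.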
Notes
This will require you to refactor a lot of the scripts that deal with configuration information. Ensure that the program works as expected when finished, and test the features incrementally (i.e. make small changes and test extensively before moving on to the next thing).
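One way to test the refactor incrementally is to confirm that the per-company files, read back together, reproduce the original combined configuration. The sketch below assumes the same hypothetical layout (one top-level key per company in the old file, one <company>.json per company in the new layout); load_company_configs and matches_original are illustrative names, not existing project functions.

```python
import json
from pathlib import Path


def load_company_configs(data_dir: Path, skip: str = "scraping_config.json") -> dict:
    """Rebuild a {company: settings} mapping from the per-company JSON files.

    The old combined file is skipped in case it has not been removed yet.
    """
    return {
        path.stem: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(data_dir.glob("*.json"))
        if path.name != skip
    }


def matches_original(original: dict, data_dir: Path) -> bool:
    """Return True when the split files carry exactly the original configuration."""
    return load_company_configs(data_dir) == original
```

Running a check like this before removing the combined file gives a quick signal that no company or setting was lost in the split.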