About the folder structure
The main folder is AllSpiders.
It contains the following files and subfolders:
RunAllSpiders.bat - runs all spiders.
!!! The current date must be passed as a parameter. The date format must be YYYY-mm-dd (e.g. 2017-05-06). The date is included in the report name of each spider that runs (e.g. Blitz-2017-05-01.json).
Blitz - folder for one specific spider. The same set of files is repeated for each spider:
  BlitzSpider.py - the spider program.
  Cleaning.py - data verification program specific to Blitz.
  RunIt.bat - runs only the current spider (BlitzSpider at the moment).
  !!! The current date must be passed as a parameter. The date format must be YYYY-mm-dd (e.g. 2017-05-06). The date is included in the report name of the spider being run (e.g. Blitz-2017-05-01.json).
  Logs - subdirectory.
    output.txt - contains the printout of the latest BlitzSpider run.
  Reports - subdirectory.
    Blitz-2017-05-01.json - spider result
    Blitz-2017-05-02.json - spider result
    Blitz-2017-05-03.json - spider result
    Blitz-2017-05-04.json - spider result
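
The report naming rule described above (spider name plus the date in YYYY-mm-dd format) can be sketched in Python as follows. This is only an illustration, not the code used by the spiders or the .bat files: the function name report_path is hypothetical, and the sketch assumes that Reports sits inside each spider's folder under AllSpiders.

```python
from datetime import datetime
from pathlib import PurePosixPath


def report_path(spider_name: str, run_date: str) -> PurePosixPath:
    """Build the report path for one spider run (hypothetical helper).

    Assumes the layout AllSpiders/<spider>/Reports/<spider>-<date>.json.
    """
    # strptime raises ValueError if run_date does not match YYYY-mm-dd.
    parsed = datetime.strptime(run_date, "%Y-%m-%d")
    # Re-format the parsed date so month and day are always zero-padded.
    stamp = parsed.strftime("%Y-%m-%d")
    return (
        PurePosixPath("AllSpiders")
        / spider_name
        / "Reports"
        / f"{spider_name}-{stamp}.json"
    )


if __name__ == "__main__":
    print(report_path("Blitz", "2017-05-01"))
    # prints: AllSpiders/Blitz/Reports/Blitz-2017-05-01.json
```

A date in any other format, such as 05/01/2017, raises ValueError, so a wrong parameter is caught before a badly named report is written.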
How to get sources: