Ultimately I'd like a command-line utility where the arguments are (a rough sketch of the interface follows the list):
start url
depth of the crawl
whether PDFs, Word documents, or images are scraped
time limit
maximum number of files to download
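
A minimal sketch of what that argument surface could look like with Python's argparse; the flag names, defaults, and the file-type vocabulary here are my own assumptions for illustration, not settled decisions:

```python
import argparse

def parse_args():
    # Flag names and defaults below are assumptions, not part of the
    # original request.
    parser = argparse.ArgumentParser(description="recursive site scraper")
    parser.add_argument("start_url", help="URL to begin crawling from")
    parser.add_argument("--depth", type=int, default=2,
                        help="how many link levels deep to crawl")
    parser.add_argument("--types", nargs="*", choices=["pdf", "doc", "img"],
                        default=[], help="extra file types to download")
    parser.add_argument("--time-limit", type=int, default=300,
                        help="overall time budget in seconds")
    parser.add_argument("--max-files", type=int, default=500,
                        help="stop after downloading this many files")
    parser.add_argument("--out-dir", default="scraped",
                        help="directory to write downloaded files into")
    return parser.parse_args()

if __name__ == "__main__":
    print(parse_args())
```

So an invocation might look like `python scraper.py https://example.com --depth 3 --types pdf img --time-limit 120 --max-files 100`.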
I'd like the scraper to output the HTML and other files into a directory under their original names, ideally retaining the directory structure of the website
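
One plausible way to mirror the site layout on disk, assuming URL paths map cleanly onto filesystem paths; the `out_dir` default and the `index.html` fallback for directory-style URLs are assumptions:

```python
from pathlib import Path
from urllib.parse import urlsplit

def local_path_for(url: str, out_dir: str = "scraped") -> Path:
    # Mirror the site layout, e.g.
    #   https://example.com/docs/guide.pdf -> scraped/example.com/docs/guide.pdf
    # A directory-style URL (empty path or trailing slash) falls back to
    # index.html -- an assumption on my part.
    parts = urlsplit(url)
    rel = parts.path.lstrip("/")
    if not rel or rel.endswith("/"):
        rel += "index.html"
    return Path(out_dir) / parts.netloc / rel

# Parent directories need to exist before writing each file:
#   path = local_path_for(url)
#   path.parent.mkdir(parents=True, exist_ok=True)
```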
Ideally it would handle redirects from the starting URL gracefully
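
If fetching were done with the requests library, redirects are followed by default; here is a sketch of how the final URL could be captured so the crawl continues from wherever the start URL actually lands (the function name and timeout value are placeholders):

```python
import requests

def fetch(url: str, timeout: float = 10.0) -> requests.Response:
    # requests follows redirects by default; resp.history is non-empty
    # when at least one redirect occurred, and resp.url is the final
    # location after all redirects.
    resp = requests.get(url, timeout=timeout)
    resp.raise_for_status()
    if resp.history:
        print(f"redirected: {url} -> {resp.url}")
    return resp
```

The crawl would then proceed from `resp.url` rather than the original start URL, so that relative links resolve against the final location.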