Context
Our file downloaders could use a rework. They seem overly complex and support only a few file types, with various modules calling each other in a required order that is unclear, and defunct scripts are littered about. I believe a much more straightforward approach is possible and would go a long way toward helping people understand how and when to use our util modules. During work on #227, I found that the following will download any file type when given a download url:

    import requests

    # Stream the body in chunks; iter_content() defaults to 1 byte at a time.
    r = requests.get(url, stream=True)
    r.raise_for_status()  # avoid saving an HTTP error page as the file
    with open(file_path, 'wb') as fd:
        for chunk in r.iter_content(chunk_size=8192):
            fd.write(chunk)

SEE: downloaders.py, get_files.py, muckrock_scraper.py
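As a rough sketch of where this could go, the snippet above might be wrapped in a single helper that the util modules share. The name `download_file`, its parameters, and the default chunk size and timeout are assumptions for illustration and are not existing code in the repo. The optional `session` argument is there so a `requests.Session` (or a stub in tests) can be passed in.

```python
from pathlib import Path


def download_file(url, file_path, session=None, chunk_size=8192, timeout=30):
    """Stream `url` to `file_path` and return the path written.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if session is None:
        # Imported lazily so a stub session can be supplied without requests.
        import requests
        session = requests

    response = session.get(url, stream=True, timeout=timeout)
    # Raise on 4xx/5xx instead of writing an error page to disk.
    response.raise_for_status()

    file_path = Path(file_path)
    # Create the target directory if it does not exist yet.
    file_path.parent.mkdir(parents=True, exist_ok=True)

    with open(file_path, "wb") as fd:
        # Write the body chunk by chunk so large files stay out of memory.
        for chunk in response.iter_content(chunk_size=chunk_size):
            fd.write(chunk)
    return file_path
```

A call would then look something like `download_file(url, "data/report.pdf")`, regardless of file type. Whether this should replace the existing downloaders outright or sit alongside them is still open.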
Requirements
util scripts
Docs
util scripts should be updated where necessary
Open questions