Python defaulting to ASCII encoding instead of Unicode.
Set global encoding to 'utf-8'.
Script downloading 0 KB empty FILE objects.
Some generated filenames contained ":", which is a restricted
character in filenames on Windows and macOS and was truncating filenames
before the extension. Added ":" to the replacements list in clean_text.
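
A minimal sketch of how a replacements list in clean_text might handle ":".
The actual clean_text is not shown in these notes, so the REPLACEMENTS name,
its entries other than ":", and the replacement characters are all assumptions
made for illustration.

```python
# Hypothetical sketch: the real clean_text and its replacements list may differ.
# Each pair is (character to replace, replacement). Only ":" comes from the
# notes above; "-" as the substitute and the other entries are assumptions.
REPLACEMENTS = [
    (":", "-"),   # restricted in Windows filenames
    ("/", "-"),   # path separator (assumed entry)
    ("\\", "-"),  # Windows path separator (assumed entry)
]


def clean_text(text):
    """Return text with filename-unsafe characters replaced."""
    for old, new in REPLACEMENTS:
        text = text.replace(old, new)
    return text.strip()


# The extension is appended after cleaning, so it can no longer be cut off.
filename = clean_text("Lecture 1: Intro") + ".mp4"
```

With these assumed replacements, the title "Lecture 1: Intro" yields the
filename "Lecture 1- Intro.mp4".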
Download failures.
Used the requests library for downloads, which seemed to improve the
success rate a bit.
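
A rough sketch of a download helper built on requests, assuming a streamed
GET written to disk in chunks. The function name, its parameters, the timeout
value, and the empty-file check are illustrative assumptions, not the script's
actual code.

```python
import os

import requests


def download(url, dest_path, session=None, timeout=30):
    """Stream url to dest_path and return the number of bytes written.

    Hypothetical helper: name, parameters, and defaults are assumptions.
    """
    http = session or requests  # a requests.Session can be passed for reuse
    response = http.get(url, stream=True, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page

    written = 0
    with open(dest_path, "wb") as fh:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:  # skip empty keep-alive chunks
                fh.write(chunk)
                written += len(chunk)

    if written == 0:
        # Assumed safeguard: do not leave a 0-byte file behind.
        os.remove(dest_path)
        raise IOError("empty download: %s" % url)
    return written
```

Raising on HTTP errors and on empty bodies makes a failed download visible to
the caller, which can then retry or log it, rather than leaving a silent empty
file on disk.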
Fixed the following errors -