greenelab / nature_news_disparities

Analysis pipeline for Nature news articles
BSD 3-Clause "New" or "Revised" License

scraper creating empty index.html files #6

Open nrosed opened 3 years ago

nrosed commented 3 years ago

Scrapy is currently creating empty index.html files when a link is redirected. This has only been observed in 2020 and should be taken care of within the scraping code, not the downstream processes.
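
A minimal sketch of one way the scraping code could guard against this, assuming the spider writes each response body to an `index.html` file itself. The repository's actual spider is not shown in this thread, so `save_page` and its parameters are hypothetical names, not part of the project or of Scrapy. The idea is to skip the write when the response is a redirect or has no body:

```python
from pathlib import Path


def save_page(body: bytes, out_dir: Path, status: int = 200) -> bool:
    """Write body to out_dir/index.html unless it is a redirect or empty.

    Hypothetical helper: returns True if the file was written,
    False if the write was skipped.
    """
    # 3xx responses carry no article content, and a whitespace-only
    # body would produce an effectively empty index.html.
    if 300 <= status < 400 or not body.strip():
        return False
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "index.html").write_bytes(body)
    return True
```

In a Scrapy callback this could be called as `save_page(response.body, out_dir, response.status)`, where `out_dir` is whatever per-article directory the spider already uses.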

nrosed commented 3 years ago

This doesn't have a downstream effect on the analyses and seems non-trivial to fix. Therefore, it won't be fixed.