blekhmanlab / rxivist

API providing access to papers and authors scraped from biorxiv.org
https://rxivist.org
GNU Affero General Public License v3.0
60 stars 11 forks

Spider error when refreshing download numbers #234

Open rabdill opened 5 years ago

rabdill commented 5 years ago
Refreshing article 19143
Error requesting article metrics. Retrying: HTTPSConnectionPool(host='www.biorxiv.org', port=443): Max retries exceeded with url: /content/early/2016/11/30/061689.article-metrics (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f942ac57e48>: Failed to establish a new connection: [Errno -2] Name or service not known',))
Error requesting article metrics. Retrying: HTTPSConnectionPool(host='www.biorxiv.org', port=443): Max retries exceeded with url: /content/early/2016/11/30/061689.article-metrics (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f942ac05c18>: Failed to establish a new connection: [Errno -2] Name or service not known',))
Error requesting article metrics. Retrying: HTTPSConnectionPool(host='www.biorxiv.org', port=443): Max retries exceeded with url: /content/early/2016/11/30/061689.article-metrics (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f942ac05860>: Failed to establish a new connection: [Errno -2] Name or service not known',))
Error AGAIN requesting article metrics. Bailing: HTTPSConnectionPool(host='www.biorxiv.org', port=443): Max retries exceeded with url: /content/early/2016/11/30/061689.article-metrics (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f942ac05978>: Failed to establish a new connection: [Errno -2] Name or service not known',))
Traceback (most recent call last):
  File "spider.py", line 1136, in <module>
    full_run(spider)
  File "spider.py", line 983, in full_run
    spider.refresh_article_stats(collection, config.refresh_category_cap)
  File "spider.py", line 330, in refresh_article_stats
    self.save_article_stats(article_id, stat_table)
  File "spider.py", line 493, in save_article_stats
    for i, record in enumerate(stats):
TypeError: 'NoneType' object is not iterable
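
Reading the log above: the DNS lookup for www.biorxiv.org failed on every retry, the metrics request bailed, and `save_article_stats` then tried to iterate over `None`. One possible guard is sketched below. It is not the actual `spider.py` code: `fetch_with_retries`, `refresh_one`, and the `fetch`/`save` callables are hypothetical names used only for illustration, and the row shape of the stats is an assumption. The idea is to treat "no stats retrieved" as a skip rather than passing `None` on to the save step.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical row shape for illustration only: (month, year, pdf_downloads).
Stat = Tuple[int, int, int]


def fetch_with_retries(fetch: Callable[[], Iterable[Stat]],
                       attempts: int = 3) -> Optional[List[Stat]]:
    """Call fetch() up to `attempts` times; return None if every attempt fails."""
    for attempt in range(1, attempts + 1):
        try:
            return list(fetch())
        except OSError as exc:
            # Socket-level failures are OSError subclasses, and the
            # exceptions raised by `requests` derive from OSError as well.
            print(f"Error requesting article metrics "
                  f"(attempt {attempt}/{attempts}): {exc}")
    return None


def refresh_one(article_id: int,
                fetch: Callable[[], Iterable[Stat]],
                save: Callable[[int, List[Stat]], None]) -> bool:
    """Refresh one article; skip it (return False) if no stats came back."""
    stats = fetch_with_retries(fetch)
    if stats is None:
        # Skip instead of handing None to the save step, which would
        # raise "TypeError: 'NoneType' object is not iterable".
        print(f"Skipping article {article_id}: metrics unavailable")
        return False
    save(article_id, stats)
    return True
```

With a guard like this, a transient network outage would cost one article's refresh rather than the whole run. Whether to skip, requeue, or abort after several consecutive failures is a separate design choice that this sketch does not settle.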