alecxe / scrapy-fake-useragent

Random User-Agent middleware based on fake-useragent
MIT License
687 stars, 98 forks

the library doesn't work anymore #4

Closed: mayouf closed this issue 8 years ago

mayouf commented 8 years ago

Here is the error:

`2016-07-31 23:12:31 [twisted] CRITICAL:
Traceback (most recent call last):
  File "/home/mic/anaconda2/lib/python2.7/site-packages/twisted/internet/defer.py", line 1128, in _inlineCallbacks
    result = g.send(result)
  File "/home/mic/anaconda2/lib/python2.7/site-packages/scrapy/crawler.py", line 90, in crawl
    six.reraise(*exc_info)
  File "/home/mic/anaconda2/lib/python2.7/site-packages/scrapy/crawler.py", line 72, in crawl
    self.engine = self._create_engine()
  File "/home/mic/anaconda2/lib/python2.7/site-packages/scrapy/crawler.py", line 97, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "/home/mic/anaconda2/lib/python2.7/site-packages/scrapy/core/engine.py", line 68, in __init__
    self.downloader = downloader_cls(crawler)
  File "/home/mic/anaconda2/lib/python2.7/site-packages/scrapy/core/downloader/__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "/home/mic/anaconda2/lib/python2.7/site-packages/scrapy/middleware.py", line 58, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/home/mic/anaconda2/lib/python2.7/site-packages/scrapy/middleware.py", line 40, in from_settings
    mw = mwcls()
  File "/home/mic/anaconda2/lib/python2.7/site-packages/scrapy/downloadermiddlewares/useragent.py", line 32, in __init__
    self.ua = UserAgent()
  File "/home/mic/anaconda2/lib/python2.7/site-packages/fake_useragent/fake.py", line 10, in __init__
    self.data = load_cached()
  File "/home/mic/anaconda2/lib/python2.7/site-packages/fake_useragent/utils.py", line 141, in load_cached
    update()
  File "/home/mic/anaconda2/lib/python2.7/site-packages/fake_useragent/utils.py", line 136, in update
    write(load())
  File "/home/mic/anaconda2/lib/python2.7/site-packages/fake_useragent/utils.py", line 93, in load
    browsers_dict[browser_key] = get_browser_versions(browser)
  File "/home/mic/anaconda2/lib/python2.7/site-packages/fake_useragent/utils.py", line 56, in get_browser_versions
    html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range`

alecxe commented 8 years ago

Thanks for the report. You mentioned "anymore": did it work before? What changed in your setup between when it worked and when it stopped working?

alecxe commented 8 years ago

@mayouf by the way, what happens if you try to update the database, or use the cache=False flag?

joskfg commented 8 years ago

It fails because the website that fake-useragent scrapes for user agents has changed its markup. The parser no longer finds the HTML it expects, so it cannot build the list of user agents to choose from.
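To illustrate the failure mode, here is a minimal sketch of why the last frame of the traceback raises `IndexError`. The marker string is taken from the traceback; the two sample pages and the `extract_listing` helper are hypothetical and are not part of fake-useragent.

```python
# Marker string copied from the failing line in the traceback.
MARKER = "<div id='liste'>"

# Hypothetical page contents, for illustration only.
old_page = "<html><div id='liste'><a>ExampleAgent/1.0</a></div></html>"
new_page = "<html><div class='listing'><a>ExampleAgent/1.0</a></div></html>"


def extract_listing(html, marker=MARKER):
    """Return the text after `marker`, or None if the marker is missing.

    Hypothetical helper: a guarded alternative to `html.split(marker)[1]`.
    """
    _, found, rest = html.partition(marker)
    return rest if found else None


# Unguarded form, as in the traceback: when the marker is absent,
# split() returns a one-element list, so index 1 does not exist.
try:
    new_page.split(MARKER)[1]
except IndexError as exc:
    print("unguarded split failed:", exc)

# Guarded form: the caller can detect the changed page and fall back.
print(extract_listing(old_page) is not None)
print(extract_listing(new_page) is None)
```

With a guard like this, a changed page could be reported as a clear "source format changed" error, or handled with a fallback, instead of crashing the crawl at startup.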

alecxe commented 8 years ago

This library is a very thin layer over the fake-useragent library. Since the failure happens inside fake-useragent itself, it would be better to discuss it on the fake-useragent issue tracker. Thanks for the report, though!
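For readers unfamiliar with what "thin layer" means here, the following is a rough sketch of the general shape of such a middleware. It is not this library's source code. The class name `RandomUserAgentSketch`, the `FakeRequest` stand-in, and the sample agent strings are all hypothetical; only the `process_request(request, spider)` hook name follows Scrapy's downloader middleware interface.

```python
import random


class RandomUserAgentSketch:
    """Hypothetical middleware: sets a User-Agent on each outgoing request."""

    def __init__(self, pick_user_agent):
        # `pick_user_agent` is any zero-argument callable returning a string.
        # A real middleware would delegate this choice to fake-useragent.
        self.pick_user_agent = pick_user_agent

    def process_request(self, request, spider):
        # Overwrite the header; returning None lets processing continue.
        request.headers["User-Agent"] = self.pick_user_agent()
        return None


class FakeRequest:
    """Stand-in for a request object, with a plain dict for headers."""

    def __init__(self):
        self.headers = {}


# Sample agent strings, for illustration only.
agents = ["ExampleAgent/1.0", "ExampleAgent/2.0"]
mw = RandomUserAgentSketch(lambda: random.choice(agents))
req = FakeRequest()
mw.process_request(req, spider=None)
print(req.headers["User-Agent"])
```

Because the wrapper only forwards the choice of user agent, an error raised while the underlying library loads its data surfaces as soon as the middleware is constructed, which matches the traceback in this issue.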