scrapinghub / spidermon

Scrapy Extension for monitoring spiders execution.
https://spidermon.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Fix "ModuleNotFoundError" when running spider outside Scrapy Cloud #340

Closed · rennerocha closed this issue 2 years ago

rennerocha commented 2 years ago

When installing the master branch version and trying to run a simple spider, we get a ModuleNotFoundError complaining about the scrapinghub module. This module is necessary if we are running our spiders in Scrapy Cloud; however, it is not a requirement when running in other environments, so we shouldn't require that module to be installed.

2022-03-25 17:52:03 [scrapy.utils.log] INFO: Scrapy 2.6.1 started (bot: tutorial)
2022-03-25 17:52:03 [scrapy.utils.log] INFO: Versions: lxml 4.8.0.0, libxml2 2.9.12, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 22.2.0, Python 3.8.10 (default, Nov 26 2021, 20:14:08) - [GCC 9.3.0], pyOpenSSL 22.0.0 (OpenSSL 1.1.1n  15 Mar 2022), cryptography 36.0.2, Platform Linux-5.4.0-105-generic-x86_64-with-glibc2.29
2022-03-25 17:52:03 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'tutorial',
 'NEWSPIDER_MODULE': 'tutorial.spiders',
 'ROBOTSTXT_OBEY': True,
 'SPIDER_MODULES': ['tutorial.spiders']}
2022-03-25 17:52:03 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2022-03-25 17:52:03 [scrapy.extensions.telnet] INFO: Telnet Password: 97060e9bc44a7717
Traceback (most recent call last):
  File "/home/renne/.pyenv/versions/spidermon-test-clean-install/bin/scrapy", line 8, in <module>
    sys.exit(execute())
  File "/home/renne/.pyenv/versions/spidermon-test-clean-install/lib/python3.8/site-packages/scrapy/cmdline.py", line 145, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/home/renne/.pyenv/versions/spidermon-test-clean-install/lib/python3.8/site-packages/scrapy/cmdline.py", line 100, in _run_print_help
    func(*a, **kw)
  File "/home/renne/.pyenv/versions/spidermon-test-clean-install/lib/python3.8/site-packages/scrapy/cmdline.py", line 153, in _run_command
    cmd.run(args, opts)
  File "/home/renne/.pyenv/versions/spidermon-test-clean-install/lib/python3.8/site-packages/scrapy/commands/crawl.py", line 22, in run
    crawl_defer = self.crawler_process.crawl(spname, **opts.spargs)
  File "/home/renne/.pyenv/versions/spidermon-test-clean-install/lib/python3.8/site-packages/scrapy/crawler.py", line 205, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "/home/renne/.pyenv/versions/spidermon-test-clean-install/lib/python3.8/site-packages/scrapy/crawler.py", line 238, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "/home/renne/.pyenv/versions/spidermon-test-clean-install/lib/python3.8/site-packages/scrapy/crawler.py", line 313, in _create_crawler
    return Crawler(spidercls, self.settings, init_reactor=True)
  File "/home/renne/.pyenv/versions/spidermon-test-clean-install/lib/python3.8/site-packages/scrapy/crawler.py", line 87, in __init__
    self.extensions = ExtensionManager.from_crawler(self)
  File "/home/renne/.pyenv/versions/spidermon-test-clean-install/lib/python3.8/site-packages/scrapy/middleware.py", line 59, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/home/renne/.pyenv/versions/spidermon-test-clean-install/lib/python3.8/site-packages/scrapy/middleware.py", line 40, in from_settings
    mwcls = load_object(clspath)
  File "/home/renne/.pyenv/versions/spidermon-test-clean-install/lib/python3.8/site-packages/scrapy/utils/misc.py", line 61, in load_object
    mod = import_module(module)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/renne/projects/rennerocha/spidermon/spidermon/contrib/scrapy/extensions.py", line 11, in <module>
    from spidermon.utils.hubstorage import hs
  File "/home/renne/projects/rennerocha/spidermon/spidermon/utils/hubstorage.py", line 8, in <module>
    from scrapinghub import HubstorageClient
ModuleNotFoundError: No module named 'scrapinghub'
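
As a rough sketch of the idea (not the actual patch; the helper name has_hubstorage and the None fallback are illustrative assumptions), the import in spidermon/utils/hubstorage.py could be guarded so that scrapinghub stays an optional dependency:

    # Illustrative sketch only: guard the optional scrapinghub import so that
    # spiders running outside Scrapy Cloud do not fail at import time.
    try:
        from scrapinghub import HubstorageClient
    except ImportError:
        # scrapinghub is only needed on Scrapy Cloud; its absence should not
        # break local runs.
        HubstorageClient = None


    def has_hubstorage():
        """Return True when the optional scrapinghub dependency is available."""
        return HubstorageClient is not None

Code that currently assumes the client exists would then check has_hubstorage() (or an equivalent flag) before using it.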
tcurvelo commented 2 years ago

This might be fixed by #323 once it gets merged. I did some rework on the scrapinghub module there.