crawlab-team / crawlab-sdk

SDK for Crawlab, including SDKs for different programming languages such as Python, Node.js, and Java, and a CLI tool written in Python.
https://crawlab.cn
BSD 3-Clause "New" or "Revised" License

ModuleNotFoundError: No module named 'legacy' #37

Open zq99299 opened 4 months ago

zq99299 commented 4 months ago
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/scrapy/crawler.py", line 265, in crawl
    return self._crawl(crawler, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/scrapy/crawler.py", line 269, in _crawl
    d = crawler.crawl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/twisted/internet/defer.py", line 2260, in unwindGenerator
    return _cancellableInlineCallbacks(gen)
  File "/usr/local/lib/python3.10/dist-packages/twisted/internet/defer.py", line 2172, in _cancellableInlineCallbacks
    _inlineCallbacks(None, gen, status, _copy_context())
--- <exception caught here> ---
  File "/usr/local/lib/python3.10/dist-packages/twisted/internet/defer.py", line 2003, in _inlineCallbacks
    result = context.run(gen.send, result)
  File "/usr/local/lib/python3.10/dist-packages/scrapy/crawler.py", line 158, in crawl
    self.engine = self._create_engine()
  File "/usr/local/lib/python3.10/dist-packages/scrapy/crawler.py", line 172, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "/usr/local/lib/python3.10/dist-packages/scrapy/core/engine.py", line 100, in __init__
    self.scraper = Scraper(crawler)
  File "/usr/local/lib/python3.10/dist-packages/scrapy/core/scraper.py", line 109, in __init__
    self.itemproc: ItemPipelineManager = itemproc_cls.from_crawler(crawler)
  File "/usr/local/lib/python3.10/dist-packages/scrapy/middleware.py", line 90, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/usr/local/lib/python3.10/dist-packages/scrapy/middleware.py", line 66, in from_settings
    mwcls = load_object(clspath)
  File "/usr/local/lib/python3.10/dist-packages/scrapy/utils/misc.py", line 79, in load_object
    mod = import_module(module)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/local/lib/python3.10/dist-packages/crawlab/__init__.py", line 1, in <module>
    from crawlab.crawler import Crawler
  File "/usr/local/lib/python3.10/dist-packages/crawlab/crawler.py", line 3, in <module>
    from crawlab.utils import create_handler
  File "/usr/local/lib/python3.10/dist-packages/crawlab/utils/__init__.py", line 3, in <module>
    from crawlab.utils.data import save_item_mongo, save_item_sql, save_item_kafka, save_item_es
  File "/usr/local/lib/python3.10/dist-packages/crawlab/utils/data.py", line 2, in <module>
    from legacy.db import index_item
builtins.ModuleNotFoundError: No module named 'legacy'

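The failure is independent of Scrapy itself: the bottom frame shows `crawlab/utils/data.py` executing `from legacy.db import index_item`, so importing the `crawlab` package at all raises the error. A minimal stdlib-only probe (a diagnostic sketch, not part of crawlab-sdk) can confirm which module is actually missing in a given environment:

```python
import importlib

def probe(module_name):
    """Try to import a module; return the name of the first missing
    dependency, or None if the import succeeds."""
    try:
        importlib.import_module(module_name)
        return None
    except ModuleNotFoundError as e:
        return e.name

# In the broken environment, probe("crawlab") returns "legacy",
# matching the traceback above.
print(probe("json"))  # a stdlib module imports fine, so this prints None
```

`ModuleNotFoundError.name` carries the unresolved module name, which is more reliable for diagnosis than parsing the exception message.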

I am using the crawlabteam/crawlab Docker image with crawlab-sdk==0.6.2, configured according to the basic tutorial at https://docs.crawlab.cn/zh/guide/basic-tutorial/:

ITEM_PIPELINES = {
    'crawlab.CrawlabPipeline': 300,
}

After that, running the task raises the error above.
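For context on why the pipeline setting alone triggers the failure: Scrapy resolves each ITEM_PIPELINES key through `scrapy.utils.misc.load_object` (visible in the traceback), which imports the module portion of the dotted path. A simplified stdlib-only sketch of that resolution, not Scrapy's exact implementation:

```python
from importlib import import_module

def load_object(path):
    """Simplified sketch of scrapy.utils.misc.load_object: import the
    module part of a dotted path, then fetch the named attribute."""
    module_path, _, name = path.rpartition(".")
    mod = import_module(module_path)  # <- this is where 'crawlab' gets imported
    return getattr(mod, name)

# Registering 'crawlab.CrawlabPipeline' in ITEM_PIPELINES makes Scrapy
# effectively call load_object("crawlab.CrawlabPipeline"), which runs
# crawlab/__init__.py and hits the broken 'from legacy.db import ...' line.
print(load_object("json.dumps")({"a": 1}))  # works for an importable path
```

So no spider code is needed to reproduce the error; any Scrapy startup that loads the pipeline will import `crawlab` and fail.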