wondersell / wildsearch-crawler

A tool for collecting data on categories, products, and product positions within categories on Wildberries and other Russian marketplaces
88 stars 33 forks

Error in example #3

Open berlinhemi opened 2 years ago

berlinhemi commented 2 years ago

I ran the example from the README but got no data... Maybe the site's markup has changed?

scrapy crawl wb -o artifacts/wb.json -a category_url="https://www.wildberries.ru/catalog/zhenshchinam/odezhda/vodolazki"

2021-10-28 14:31:12 [scrapy.utils.log] INFO: Scrapy 2.4.0 started (bot: wildsearch_crawler)
2021-10-28 14:31:12 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 20.3.0, Python 3.7.3 (default, Jan 22 2021, 20:04:44) - [GCC 8.3.0], pyOpenSSL 21.0.0 (OpenSSL 1.1.1d  10 Sep 2019), cryptography 2.6.1, Platform Linux-4.19.0-6-amd64-x86_64-with-debian-10.11
2021-10-28 14:31:12 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2021-10-28 14:31:12 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'wildsearch_crawler',
 'NEWSPIDER_MODULE': 'wildsearch_crawler.spiders',
 'SPIDER_MODULES': ['wildsearch_crawler.spiders']}
2021-10-28 14:31:12 [scrapy.extensions.telnet] INFO: Telnet Password: 42277048cc099d2a
2021-10-28 14:31:12 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.feedexport.FeedExporter',
 'scrapy.extensions.logstats.LogStats']
2021-10-28 14:31:12 [twisted] CRITICAL: Unhandled error in Deferred:

Traceback (most recent call last):
  File "/home/user/.local/lib/python3.7/site-packages/scrapy/crawler.py", line 192, in crawl
    return self._crawl(crawler, *args, **kwargs)
  File "/home/user/.local/lib/python3.7/site-packages/scrapy/crawler.py", line 196, in _crawl
    d = crawler.crawl(*args, **kwargs)
  File "/home/user/.local/lib/python3.7/site-packages/twisted/internet/defer.py", line 1613, in unwindGenerator
    return _cancellableInlineCallbacks(gen)
  File "/home/user/.local/lib/python3.7/site-packages/twisted/internet/defer.py", line 1529, in _cancellableInlineCallbacks
    _inlineCallbacks(None, g, status)
--- <exception caught here> ---
  File "/home/user/.local/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/home/user/.local/lib/python3.7/site-packages/scrapy/crawler.py", line 87, in crawl
    self.engine = self._create_engine()
  File "/home/user/.local/lib/python3.7/site-packages/scrapy/crawler.py", line 101, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "/home/user/.local/lib/python3.7/site-packages/scrapy/core/engine.py", line 69, in __init__
    self.downloader = downloader_cls(crawler)
  File "/home/user/.local/lib/python3.7/site-packages/scrapy/core/downloader/__init__.py", line 83, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "/home/user/.local/lib/python3.7/site-packages/scrapy/middleware.py", line 53, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/home/user/.local/lib/python3.7/site-packages/scrapy/middleware.py", line 35, in from_settings
    mw = create_instance(mwcls, settings, crawler)
  File "/home/user/.local/lib/python3.7/site-packages/scrapy/utils/misc.py", line 167, in create_instance
    instance = objcls.from_crawler(crawler, *args, **kwargs)
  File "/home/user/Downloads/wildsearch-crawler-master/crawler/wildsearch_crawler/middlewares.py", line 184, in from_crawler
    proxy_list = str.split(proxy_list_string, ',')
builtins.TypeError: descriptor 'split' requires a 'str' object but received a 'NoneType'
hemantic commented 2 years ago

This error means that the rotating-proxy configuration is not set up on your machine. It isn't needed for a local run, so you can do the following:

  1. Open the file wildsearch_crawler/settings.py
  2. Comment out line 56 in it: 'wildsearch_crawler.middlewares.RotatingProxyMiddleware': 610, (see the sketch below)
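
For reference, the relevant part of DOWNLOADER_MIDDLEWARES then looks roughly like this (a sketch only: the neighboring entry's priority and the surrounding contents are assumptions, not the file's actual text):

# wildsearch_crawler/settings.py (sketch)
DOWNLOADER_MIDDLEWARES = {
    # 'wildsearch_crawler.middlewares.RotatingProxyMiddleware': 610,  # commented out: no proxy config needed locally
    'wildsearch_crawler.middlewares.BanDetectionMiddleware': 620,  # assumed priority; this middleware does appear in the logs below
}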

That error will stop coming up, but I can't guarantee the parser will work end to end, since it hasn't been updated in about a year.
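
Alternatively, the line that crashes can be guarded so the middleware simply runs without proxies when none are configured. A minimal sketch against the traceback above (proxy_list_string is the name from the actual code; everything around it is abbreviated):

# wildsearch_crawler/middlewares.py, in from_crawler (sketch)
# str.split(proxy_list_string, ',') raises TypeError when the setting is unset (None),
# so fall back to an empty proxy list instead of crashing
proxy_list = proxy_list_string.split(',') if proxy_list_string else []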

berlinhemi commented 2 years ago

Ah yes, agreed, thanks! The error is fixed now, but no data was exported (to the artifacts/wb.json file). I'll attach the log in case it gives you any ideas; if not, I'll try digging into the markup and the code :)

scrapy crawl wb -o artifacts/wb.json -a category_url="https://www.wildberries.ru/catalog/zhenshchinam/odezhda/vodolazki"

2021-10-29 13:52:46 [scrapy.utils.log] INFO: Scrapy 2.4.0 started (bot: wildsearch_crawler)
2021-10-29 13:52:46 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 20.3.0, Python 3.7.3 (default, Jan 22 2021, 20:04:44) - [GCC 8.3.0], pyOpenSSL 21.0.0 (OpenSSL 1.1.1d  10 Sep 2019), cryptography 2.6.1, Platform Linux-4.19.0-6-amd64-x86_64-with-debian-10.11
2021-10-29 13:52:46 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2021-10-29 13:52:46 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'wildsearch_crawler',
 'NEWSPIDER_MODULE': 'wildsearch_crawler.spiders',
 'SPIDER_MODULES': ['wildsearch_crawler.spiders']}
2021-10-29 13:52:46 [scrapy.extensions.telnet] INFO: Telnet Password: 5b207ade21241c37
2021-10-29 13:52:46 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.feedexport.FeedExporter',
 'scrapy.extensions.logstats.LogStats']
2021-10-29 13:52:46 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'wildsearch_crawler.middlewares.BanDetectionMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2021-10-29 13:52:46 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2021-10-29 13:52:46 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2021-10-29 13:52:46 [scrapy.core.engine] INFO: Spider opened
2021-10-29 13:52:46 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-10-29 13:52:46 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2021-10-29 13:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.wildberries.ru/catalog/zhenshchinam/odezhda/vodolazki> (referer: None)
2021-10-29 13:52:47 [scrapy.core.engine] INFO: Closing spider (finished)
2021-10-29 13:52:47 [/home/user/.local/lib/python3.7/site-packages/envparse.py] DEBUG: Get 'SCRAPY_JOB' casted as 'None'/'None' with default '0'
2021-10-29 13:52:47 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 256,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 69415,
 'downloader/response_count': 1,
 'downloader/response_status_count/200': 1,
 'elapsed_time_seconds': 0.438207,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2021, 10, 29, 17, 52, 47, 153763),
 'log_count/DEBUG': 2,
 'log_count/INFO': 10,
 'memusage/max': 62980096,
 'memusage/startup': 62980096,
 'response_received_count': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2021, 10, 29, 17, 52, 46, 715556)}
2021-10-29 13:52:47 [scrapy.core.engine] INFO: Spider closed (finished)
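
Judging by the stats, the page itself was fetched (one 200 response) but zero items were scraped, so the selectors probably no longer match the markup. A quick way to probe that from Scrapy's shell (the CSS selector below is purely illustrative, not the project's actual one):

scrapy shell "https://www.wildberries.ru/catalog/zhenshchinam/odezhda/vodolazki"
>>> response.css('title::text').get()   # sanity check: did we get a real catalog page back?
>>> len(response.css('.product-card'))  # illustrative selector; 0 would confirm the markup changed

If the catalog now renders client-side, no static selector will match, and the spider would have to target whatever JSON endpoint the page loads instead.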
afedotowaa commented 2 years ago

Same here +1

afedotowaa commented 2 years ago

Ah yes, agreed, thanks! The error is fixed now, but no data was exported (to the artifacts/wb.json file). [...]

Did you resolve this?

berlinhemi commented 2 years ago

Did you resolve this?

No, I haven't sat down with the project again. But if I get around to it, I'll post an update.

Maxsar-S commented 2 years ago

Any luck?

berlinhemi commented 2 years ago

Any luck?

No, I haven't sat down with this project again.