rmax / scrapy-redis

Redis-based components for Scrapy.
http://scrapy-redis.readthedocs.io
MIT License
5.54k stars, 1.59k forks

Warning: Passing a 'spider' argument to ExecutionEngine.crawl is deprecated #266

Closed. maintain99 closed this issue 1 year ago

maintain99 commented 1 year ago

Could someone more experienced explain what this warning means? It appears when I use scrapy-redis:

2022-12-10 21:09:02 [py.warnings] WARNING: C:\Users\wsy\AppData\Roaming\Python\Python310\site-packages\scrapy_redis\spiders.py:197: ScrapyDeprecationWarning: Passing a 'spider' argument to ExecutionEngine.crawl is deprecated
  self.crawler.engine.crawl(req, spider=self)

LuckyPigeon commented 1 year ago

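@maintain99 Hello, this is an improvement introduced in newer versions of Scrapy. In short, you just set the spider you want to call directly in self.spider, and you no longer need to pass it in as an argument. That is both more convenient and safer.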
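
To illustrate what a warning of this kind means, here is a minimal sketch built on stand-in classes. `FakeEngine`, `FakeDeprecationWarning`, and `count_warnings` are hypothetical names invented for this example, not Scrapy's actual implementation; they only mimic the general pattern of an optional argument that still works but triggers a deprecation warning when it is passed.

```python
import warnings


class FakeDeprecationWarning(Warning):
    """Hypothetical stand-in for a library's deprecation warning class."""


class FakeEngine:
    """Hypothetical stand-in for an engine that already knows its spider."""

    def __init__(self, spider):
        self.spider = spider
        self.scheduled = []

    def crawl(self, request, spider=None):
        # Passing `spider` still works, but it is redundant, so it warns.
        if spider is not None:
            warnings.warn(
                "Passing a 'spider' argument to FakeEngine.crawl is deprecated",
                category=FakeDeprecationWarning,
                stacklevel=2,
            )
        self.scheduled.append(request)


def count_warnings(call):
    """Run `call` and count the FakeDeprecationWarning instances it emits."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        call()
    return sum(issubclass(w.category, FakeDeprecationWarning) for w in caught)


engine = FakeEngine(spider="my_spider")
old_style = count_warnings(lambda: engine.crawl("req-1", spider="my_spider"))
new_style = count_warnings(lambda: engine.crawl("req-2"))
print(old_style, new_style)
```

In this sketch both calls schedule the request; only the call that passes `spider=` emits a warning, which is why a deprecation warning by itself does not stop a crawl.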

maintain99 commented 1 year ago

@LuckyPigeon Could you explain in detail how I should change it? Thanks.

LuckyPigeon commented 1 year ago

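@maintain99 I would need to see your code to explain in detail, but in principle you only need to set self.spider = ExampleSpider inside parse, or inside whichever function calls scrapy. If you are not yet familiar with spiders, though, I suggest ignoring the warning; the crawl will still run.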

maintain99 commented 1 year ago

@LuckyPigeon

Seeing the warning is annoying.

```python
import scrapy
from scrapy_redis.spiders import RedisSpider


class ShangZhiSpider(RedisSpider):
    name = 'shangzhi'
    allowed_domains = ['che168.com']
    redis_key = 'chaosui_urls'

    def parse(self, response, **kwargs):
        # Follow every listing link to its detail page.
        lis = response.xpath("//ul[@class='viewlist_ul']/li/a/@href").extract()
        for lj in lis:
            herf = response.urljoin(lj)
            yield scrapy.Request(
                url=herf,
                callback=self.jiexi
            )
            print(herf)

        # Follow the pagination links, skipping the first one.
        lst = response.xpath('//div[@class="page fn-clear"]/a/@href').extract()[1:]
        for url in lst:
            # Resolve the relative pagination link against the page URL.
            urll = response.urljoin(url)
            yield scrapy.Request(
                url=urll,
                callback=self.parse
            )

    def jiexi(self, resp):
        # Print the car brand name, if the page has one, and the page URL.
        titer = resp.xpath('//h3[@class="car-brand-name"]/text()').extract()
        if titer:
            print(titer[0].strip())
        print(resp.url)
```
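
The log at the top of this thread points at `scrapy_redis\spiders.py:197`, so the warning is raised from inside the library rather than from the spider code posted here. One option for hiding it is a warnings filter. The following is only a sketch, assuming the warning class is `scrapy.exceptions.ScrapyDeprecationWarning` as named in the log; the fallback class exists only so the snippet runs where Scrapy is not installed, and `silence_spider_arg_warning` is a hypothetical helper name.

```python
import warnings

try:
    from scrapy.exceptions import ScrapyDeprecationWarning
except ImportError:
    class ScrapyDeprecationWarning(Warning):
        """Fallback stand-in, used only when Scrapy is not installed."""


def silence_spider_arg_warning():
    """Ignore only the 'spider' argument deprecation warning (hypothetical helper)."""
    warnings.filterwarnings(
        "ignore",
        # Regex matched against the start of the warning message.
        message=r"Passing a 'spider' argument to ExecutionEngine\.crawl",
        category=ScrapyDeprecationWarning,
    )


# Demonstration: the targeted warning is dropped, other warnings still surface.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    silence_spider_arg_warning()
    warnings.warn(
        "Passing a 'spider' argument to ExecutionEngine.crawl is deprecated",
        ScrapyDeprecationWarning,
    )
    warnings.warn("some other deprecation", ScrapyDeprecationWarning)

remaining = [str(w.message) for w in caught]
print(remaining)
```

In a real project the helper would presumably be called once at startup, for example at the top of the spider module, without the `catch_warnings` wrapper used here for the demonstration. Whether the filter takes effect may depend on when it is installed relative to Scrapy's own startup, so treat this as a starting point. Matching on the message keeps other deprecation warnings visible.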