rmax / scrapy-inline-requests

A decorator to write coroutine-like spider callbacks.
MIT License

Add a note about using `twisted.internet.defer.inlineCallbacks` directly instead of `inline_requests` #10

Open rmax opened 8 years ago

poupryc commented 4 years ago

Could you provide an example, please? I tried to use `@inlineCallbacks` without success.

rmax commented 4 years ago

@HelloEdit If I remember correctly, when you use `@inlineCallbacks` you have to yield deferreds, and to return the final item you would use `defer.returnValue(item)`.

poupryc commented 4 years ago

I think I've managed it another way, without `inline_requests`.

# -*- coding: utf-8 -*-
import scrapy
from twisted.internet.defer import inlineCallbacks

class PagesSpider(scrapy.spiders.SitemapSpider):
    name = 'pages'
    allowed_domains = ['thing.com']
    sitemap_follow = [r'sitemap_page']

    def __init__(self, site=None, *args, **kwargs):
        super(PagesSpider, self).__init__(*args, **kwargs)

    @inlineCallbacks
    def parse(self, response):
        # things
        request = scrapy.Request("https://google.com")
        # Note: newer Scrapy releases deprecate passing the spider as the second argument.
        response = yield self.crawler.engine.download(request, self)
        # Twisted executes the request and resumes the generator here with the response
        print(response.text)