Closed elliotgao2 closed 7 years ago
def generate_header(): header = {'User-agent': 'Google spider'} return header class MySpider(Spider): start_url = 'https://blog.scrapinghub.com/' header = generate_header concurrency = 5 parsers = [Parser('https://blog.scrapinghub.com/page/\d+/'), Parser('https://blog.scrapinghub.com/\d{4}/\d{2}/\d{2}/[a-z0-9\-]+/', Post)]
and
class MySpider(Spider): start_url = 'https://blog.scrapinghub.com/' header = {'User-agent': 'Google spider'} concurrency = 5 parsers = [Parser('https://blog.scrapinghub.com/page/\d+/'), Parser('https://blog.scrapinghub.com/\d{4}/\d{2}/\d{2}/[a-z0-9\-]+/', Post)]
Both should be supported.
and
Both should be supported.