Closed NanZhang715 closed 9 months ago
It should support any spider. In your case you rely on executing start_requests, which is disabled by default, but you can enable it: see here in the docs:
https://scrapyrt.readthedocs.io/en/latest/api.html#scrapyrt-http-api
Whether spider should execute Scrapy.Spider.start_requests method. start_requests are executed by default when you run Scrapy Spider normally without ScrapyRT, but this method is NOT executed in API by default. By default we assume that spider is expected to crawl ONLY url provided in parameters without making any requests to start_urls defined in Spider class. start_requests argument overrides this behavior. If this argument is present API will execute start_requests Spider method.
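To make that concrete, here is a minimal sketch of building a ScrapyRT request URL that enables start_requests. The host/port, spider name, and target URL are taken from this thread and are assumptions about your setup:

```python
from urllib.parse import urlencode

# Hypothetical spider name and target URL; adjust to your project.
params = {
    "spider_name": "web_spider",
    "url": "https://www.example.com",
    "start_requests": "true",  # tells the ScrapyRT API to run Spider.start_requests
}
request_url = "http://localhost:9080/crawl.json?" + urlencode(params)
print(request_url)
```

Note that urlencode percent-encodes the url value, which is what ScrapyRT expects for query parameters.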
Thank you, the issue is solved by passing the URL with crawl_args.
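For anyone landing here later, a sketch of what "passing the URL with crawl_args" can look like: ScrapyRT accepts a crawl_args parameter containing URL-encoded JSON of spider arguments (equivalent to -a url=... on the command line). The spider name and argument name below are assumptions based on this thread:

```python
import json
from urllib.parse import urlencode

# Spider arguments go through crawl_args as JSON, then URL-encoded.
spider_args = {"url": "https://www.example.com"}  # mirrors: scrapy runspider web_spider.py -a url=...
params = {
    "spider_name": "web_spider",
    "crawl_args": json.dumps(spider_args),
}
request_url = "http://localhost:9080/crawl.json?" + urlencode(params)
print(request_url)
```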
Hi,
I built a spider with the CrawlSpider class.
When using curl "localhost:9080/crawl.json?spider_name=web_spider&url=https://www.example.com", the URL is not passed to the spider; the error log shows that start_urls is [[]]. But the command shown below works well:
scrapy runspider web_spider.py -a url=https://www.example.com
Does ScrapyRT support CrawlSpider?
Thanks