When stopping a job, Scrapyd sends a TERM signal. If you check the log file for the spider/crawl, you should see that it responds with:
Received SIGTERM, shutting down gracefully. Send again to force
So, Scrapy (not Scrapyd) will attempt to shut down the spider gracefully, which can sometimes take a long time if there was a lot of processing already being done in the engine.
If you cancel the job a second time via Scrapyd, Scrapy will force the spider to stop. Can you try that?
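For reference, here is a minimal sketch of that double cancel against Scrapyd's cancel.json endpoint. It assumes the default host/port, and the project name, job ID, and delay are placeholders:

import time
import requests

SCRAPYD_URL = "http://localhost:6800"  # assumed default Scrapyd host/port
PROJECT = "myproject"                  # placeholder project name
JOB_ID = "a91cf65ae8fa11ea9764d1edb3dcaa77"

def cancel(project, job):
    # POST to Scrapyd's cancel.json endpoint and return the parsed JSON response.
    resp = requests.post(
        f"{SCRAPYD_URL}/cancel.json",
        data={"project": project, "job": job},
    )
    resp.raise_for_status()
    return resp.json()

# First cancel: Scrapy receives SIGTERM and starts a graceful shutdown.
print(cancel(PROJECT, JOB_ID))

# Wait briefly; if the job is still listed as running, cancel again to force the shutdown.
time.sleep(10)
print(cancel(PROJECT, JOB_ID))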
Related to https://github.com/scrapy/scrapy/issues/4749
Closing as no response to question in several months.
I have several machines spidering with scrapyd and I am monitoring and managing them via the scrapyd api. I love the software, but... I cannot seem to cancel jobs. I make the call to the cancel API and get:
'{"node_name": "spider8", "status": "ok", "prevstate": "running"}
It says "ok" so I know that:
i hit the right node
the spider is running
When I get the history/running/pending lists for the nodes, I notice there are several instances of many of the spiders still listed as running. It happens across all of the spiders, but which spider and which server are affected appears random.
Example of "running" output from the server. The jobID and project sent in ARE correct and I do get the correct response to the cancel API, but here is one that has been running for days and it is a spider that finishes in scrapy in minutes.
{'id': 'a91cf65ae8fa11ea9764d1edb3dcaa77', 'spider': 'PopularScience', 'pid': 1971, 'start_time': '2020-08-28 06:49:49.105971'}
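For anyone hitting the same thing, a rough sketch of how such long-running jobs can be spotted by polling listjobs.json on each node. The node URLs, project name, and age threshold below are placeholders:

from datetime import datetime, timedelta
import requests

# Placeholder node URLs; substitute your own Scrapyd instances.
NODES = {
    "spider8": "http://spider8:6800",
}
PROJECT = "myproject"         # placeholder project name
MAX_AGE = timedelta(hours=6)  # flag anything running longer than this

for name, base_url in NODES.items():
    resp = requests.get(f"{base_url}/listjobs.json", params={"project": PROJECT})
    resp.raise_for_status()
    for job in resp.json().get("running", []):
        started = datetime.strptime(job["start_time"], "%Y-%m-%d %H:%M:%S.%f")
        age = datetime.now() - started
        if age > MAX_AGE:
            print(f"{name}: job {job['id']} ({job['spider']}) has been running for {age}")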