-
Extension of https://github.com/scrapy/scrapy/issues/1015 - exceptions raised in a spider's `errback` method don't trigger `process_spider_exception`.
```
import logging
from scra…
-
A third-party library called `ScrapyRT` exposes an HTTP API for `Scrapy` spiders. By sending a request to `ScrapyRT` with the spider name and URL, you receive the ite…
-
```
2024-06-22 22:27:27 [scrapy.core.scraper] ERROR: Spider error processing (referer: None)
Traceback (most recent call last):
File "/home/ubuntu/.pyenv/versions/3.11.9/lib/python3.11/site-pack…
-
The only parts of Scrapy that we take advantage of are the scheduler and the downloader. Its management of crawlers and spiders doesn't add anything to our use case, and the abstractions provided add c…
-
### Description
According to the [documentation](https://docs.scrapy.org/en/latest/topics/feed-exports.html#feeds), the `FEEDS` dict accepts `Path` objects as keys:
> [...] dictionary in whi…
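
For reference, a minimal sketch of a `FEEDS` setting that mixes a plain string key with a `pathlib.Path` key, which is the form the documentation describes. The file names, the `exports` directory, and the chosen formats are illustrative assumptions, not details from this report:

```python
from pathlib import Path

# Hypothetical settings.py fragment: file names and formats are assumptions.
FEEDS = {
    # A plain string key (a path or URI).
    "items.json": {"format": "json"},
    # A pathlib.Path key, which the documentation says is accepted.
    Path("exports") / "items.csv": {"format": "csv"},
}

# Show which key types are present in the setting.
for target, options in FEEDS.items():
    print(type(target).__name__, options["format"])
```

Comparing how each key type is handled is one way to narrow down whether a problem is specific to `Path` keys.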
-
Right now the Scrapy Spider Workflow #7 gets triggered on every push (PR and merge); it should only run for PRs.
-
# Description
If I insert the start URL into Redis before running Scrapy, the crawl succeeds.
But if I run Scrapy first and then insert the URL, the URL listener reports a failure:
```
2023-08-13 17:11:59 [scrapy.utils.…
-
### Description
When setting cookies on a request, you can specify a domain. If you set the domain to "localhost" or any IPv4 address, it won't get set on requests for "localhost"/the IPv4 address.…
-
### Description
The `OffsiteMiddleware` logs a single message for each domain filtered. Great!
But then the `core.engine` logs a message for every single URL filtered by the `OffsiteMiddleware`.
(L…
-
C:\Users\33721\Desktop\weibo-search-master>scrapy crawl search -s JOBDIR=crawls/search
2024-05-18 12:51:09 [scrapy.core.scraper] ERROR: Spider error processing (referer: https://s.weibo.com/weibo?…