-
See https://github.com/scrapy/scrapy/blob/master/scrapy/downloadermiddlewares/httpcompression.py#L36
IMHO the "Content-Encoding" header should get preserved, since the spider probably wants to see al…
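
A minimal sketch of the behaviour requested here, using only the standard library rather than Scrapy's middleware classes. The helper `decode_gzip_response` and the plain-dict headers are hypothetical stand-ins chosen for illustration; they are not Scrapy's API.

```python
import gzip


def decode_gzip_response(headers, body):
    """Decompress a gzip body while keeping the Content-Encoding header.

    Hypothetical helper for illustration: `headers` is a plain dict,
    not a Scrapy Headers object.
    """
    encoding = headers.get("Content-Encoding", "").lower()
    if encoding != "gzip":
        # Nothing to decode; return a copy of the headers and the body as-is.
        return dict(headers), body
    # Copy the headers so the caller's dict is untouched. The
    # Content-Encoding entry is deliberately left in place.
    return dict(headers), gzip.decompress(body)


raw_headers = {"Content-Type": "text/html", "Content-Encoding": "gzip"}
raw_body = gzip.compress(b"<html>hello</html>")
new_headers, new_body = decode_gzip_response(raw_headers, raw_body)
```

With this approach a consumer sees the decoded body but can still tell from the headers that the server sent it compressed.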
-
Here is a silly example (put it into `bad_traceback.py` and run with `python3.4 bad_traceback.py`):
``` python
import scrapy
from twisted.internet.defer import inlineCallbacks
from scrapy.crawler imp…
-
I think we should make the Spider.name attribute optional. The name is used by SpiderManager to find spiders, but a Spider can be used without a Scrapy project. It is unnecessary boilerplate for users of ru…
kmike updated
7 months ago
-
Before I report this as a docs bug, I want to check:
https://docs.scrapy.org/en/latest/topics/request-response.html#topics-request-response-ref-errbacks says: "It receives a Failure as first parame…
-
### Description
I don't know whether this should be considered a bug, but the behaviour should at least be the same whether a spider callback only raises an exception or raises an exception while also yielding.
When r…
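
The two callback shapes differ at the Python level, independent of Scrapy, which may help explain why they can be handled differently. The sketch below is an assumption-laden illustration: `only_raises`, `raises_after_yield`, and `run_callback` are hypothetical names, and `run_callback` is a simplified driver, not Scrapy's actual callback machinery.

```python
def only_raises(response):
    # Plain function: the exception surfaces as soon as it is called.
    raise ValueError("boom")


def raises_after_yield(response):
    # Generator function: calling it runs no code yet; the exception
    # only surfaces while the caller iterates over the result.
    yield {"item": 1}
    raise ValueError("boom")


def run_callback(callback, response=None):
    """Drive either kind of callback, collecting items and any error."""
    items, error = [], None
    try:
        result = callback(response)  # only_raises fails here
        for item in result or ():    # raises_after_yield fails here
            items.append(item)
    except ValueError as exc:
        error = exc
    return items, error


plain_items, plain_error = run_callback(only_raises)
gen_items, gen_error = run_callback(raises_after_yield)
```

In the first case the error occurs before any output exists; in the second, one item has already been produced when the error occurs, so a caller has to decide what to do with the partial output.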
-
#### Setting up Scrapy
- Quick pip install will do the job
```
pip install scrapy
```
- Let's generate a fresh project for our Nobel-prize scraping, using the `startproject` option. This is going…
-
If an exception is raised in the parse method of a WebdriverResponse/WebdriverRequest, the whole spider closes/exits and doesn't continue.
Steps to reproduce:
In any of your parse methods which parse WebDriverResp…
-
I'm following the instructions on your README.md file.
When I run `scrapy crawl tripadvisor-restaurant -o output/result.json -t json`, I get the following error:
```
2016-07-11 17:26:57 [scrapy] DE…
-
I am trying to scrape a website that has some dropdowns, so I planned to use the Scrapy framework with Scrapy-Selenium (more here) to click through the dropdowns (nested for loop) and then capture the URL u…
-
```
Python 3.9.13
Daphne 4.0.0
Django 4.1.2
Channels 4.0.0
Scrapy 2.7.0
scrapy-playwright 0.0.22
```
My settings:
```python
DOWNLOAD_HANDLERS = {
"http": "scrapy_playwright.handler.Sc…