-
### Description
Recently a spider I made to crawl Craigslist for rental listings broke. When I checked the logs, it turned out that all of my requests were getting HTTP 403 error codes. Of course, I…
pmdbt updated
3 months ago
-
In a single-node Scrapy project, settings like the ones below, as your documentation indicates, work well.
```
# ====== Splash settings ======
SPLASH_URL = 'http://localhost:8050'
DOWNLOADER_MIDDLEWARES = {
…
```
-
### Description
Performance degrades severely when CONCURRENT_ITEMS is set to a large number, such as 9999.
Some days ago, I wrote a spider with CONCURRENT_ITEMS=9999 and ran it. I fo…
-
Currently Scrapy can't extract links from the http://scrapy.org/ page correctly because URLs in the page header are relative to a non-existing parent: `../download/`, `../doc/`, etc. Browsers resolve these li…
kmike updated
2 weeks ago
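As a rough illustration of the browser-style resolution mentioned in the scrapy.org item above, here is a minimal sketch using only the standard library. It assumes Python 3.5 or later, where `urllib.parse.urljoin` follows RFC 3986 and drops `..` segments that would climb above the root. The helper name `resolve_link` is hypothetical and is not part of Scrapy.

```python
from urllib.parse import urljoin


def resolve_link(base_url, href):
    """Resolve href against base_url the way RFC 3986 describes.

    Hypothetical helper for illustration; on Python 3.5+ urljoin discards
    '..' segments that point above the root of the base URL.
    """
    return urljoin(base_url, href)


# The two relative links quoted in the report, resolved against the site root.
for href in ("../download/", "../doc/"):
    print(href, "->", resolve_link("http://scrapy.org/", href))
```

This only shows how the standard library treats such links; it does not describe how Scrapy's link extractors handle them.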
-
Hello, I created a Portia spider and downloaded it as a Scrapy project. I run Windows 10 with Docker, and scrapyd is running its service on port 6800 properly. However, when I schedule a spider to run,…
ghost updated
4 years ago
-
I am new to Scrapy and have run into some problems that I could not find answers to on Google, so I am posting them here:
1. The cookie does not work even when set in DEFAULT_REQUEST_HEADERS:
```
DEFAULT_REQUEST_HEADERS = {
'Ac…
```
-
I want Scrapy to support list-type arguments, given as plain text separated by commas, like this:
```
scrapy crawl a_spider -a arg1=text1,text2,text3
```
In this case, I would write a splittin…
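The kind of splitting described above could be sketched as follows. This is a minimal illustration, not the author's code: `split_list_arg` and `ASpiderArgs` are hypothetical names, and the stand-in class only mimics where a real spider (a `scrapy.Spider` subclass) would receive the `-a arg1=...` value in `__init__`.

```python
def split_list_arg(value, sep=","):
    """Split a comma-separated argument into a list of stripped, non-empty strings."""
    if value is None:
        return []
    return [part.strip() for part in value.split(sep) if part.strip()]


class ASpiderArgs:
    """Stand-in for a spider's __init__ (hypothetical, no Scrapy import needed)."""

    def __init__(self, arg1=None):
        # Spider arguments passed with -a arrive as plain strings,
        # so the list has to be rebuilt by hand.
        self.arg1 = split_list_arg(arg1)


print(ASpiderArgs(arg1="text1,text2,text3").arg1)
```

Keeping the split in a small helper means every list-type argument is parsed the same way, including the case where the argument is omitted.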
-
```
Traceback (most recent call last):
File "c:\scraper\venv\lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
for r in iterable:
File "c:\scraper\src\funcs.py", line 1…
```
-
I was wondering why the `make_request_from_data` method doesn't just use the more full-featured `request_from_dict` function from Scrapy's utils that the queue class in the library already uses? It seems t…
-
### Description
The spider_error signal is not sent when an exception is raised by a DownloaderMiddleware. This differs from the behavior of other Scrapy components in similar cases. I have not found an…