-
Recently I found that my crawler is generating some errors with the HTTPCache extension.
Here is the config of HTTPCache:
```
HTTPCACHE_ENABLED = True
HTTPCACHE_EXPIRATION_SECS = 0
HTTPCACHE…
```
-
2022-02-16 14:06:22 [scrapy.core.scraper] ERROR: Spider error processing (referer: None)
Traceback (most recent call last):
File "/home/vsi/.local/lib/python3.7/site-packages/scrapy/utils/defer.p…
-
I have a proxy running on localhost:8090 that works with Selenium. I am trying to get Splash to work and the proxy is not being used at all. When the proxy is running, I can see all traffic through it…
-
I sometimes get this error when I use scrapy-playwright:
```
2023-03-31 09:33:35 [asyncio] ERROR: Task was destroyed but it is pending!
source_traceback: Object created at (most recent call las…
```
-
First off, thanks for this. Saved me a lot of digging.
Second, I think this should return an HtmlResponse object, rather than a TextResponse object.
When using the scrapy crawl spider, rather than…
-
After running
```
$ scrapy crawl leboncoin_property -a start_urls="http://www.leboncoin.fr/ventes_immobilieres/offres/languedoc_roussillon/herault/" -o properties.json
```
`properties.json` is empty.
…
-
When creating a test for Python's Scrapy library, the `rrtest create` command runs forever. It appears to get stuck in Python's subprocess library.
Here is the command `rrtest create --name sc…
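One way to check whether a hang like this really sits in a child process is to bound the call with a timeout. The sketch below is only an illustration under assumptions: `run_with_timeout` is a hypothetical helper, not part of rrtest or Scrapy, and it assumes the tool launches its child through a `subprocess.run`-style call.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (finished, returncode).

    Hypothetical helper for illustration. If the child does not exit
    within timeout_s seconds, subprocess.run kills it and raises
    TimeoutExpired, which is reported here as (False, None).
    """
    try:
        completed = subprocess.run(
            cmd,
            capture_output=True,  # collect stdout/stderr instead of inheriting them
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, None
    return True, completed.returncode
```

For example, `run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=1)` returns `(False, None)` after about one second instead of blocking for the full 30 seconds. Note that the timeout only terminates the direct child; grandchildren that keep the pipes open can still delay the return.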
-
Similar to #30, but I use the latest version of scrapy-fake-useragent, 1.4.4.
Here is my `setting.py`:
```
DOWNLOADER_MIDDLEWARES = {
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware'…
```
-
I'm trying to pass requests to the spider externally, via message queues, and keep it running forever.
I found some projects made by others, but none of them work with the current version of Scrapy, …
-
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
D:\weibo-search-master\weibo-search-master\wei…