-
Hi
TL;DR: `ItemLoader` does not reuse `response.selector` when you pass a response to it as an argument. It looks like it did reuse it up to Scrapy==1.0.
Recently we were trying to upgrade our s…
-
```
E:\github\scrapy-HousePricing>scrapy crawl anjuke
2017-05-26 09:32:45 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: houseData)
2017-05-26 09:32:45 [scrapy.utils.log] INFO: Overridden sett…
```
-
C:\Users\14352\.conda\envs\reptile\python.exe F:\spider\wiki_real\wiki爬取教程\counselor\main.py
2023-10-17 20:35:17 [scrapy.utils.log] INFO: Scrapy 2.11.0 started (bot: counselor)
2023-10-17 20:35:17 …
-
All dependencies are installed, but for some reason it still raises an error:
2024-07-08 15:46:54 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
Traceback (most recent call last):
File "D:\WeiboSpider\weibospider\run_spide…
-
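I recently ran into a problem with distributed crawling in Scrapy and don't know how to handle it. How are the URLs in the master generated, and how do the URLs crawled by the slaves get stored in the master? In an actual spider, how do RedisSpider and RedisMixin coordinate with Scrapy's own spiders (that is, how do I use the spiders from scrapy-redis)? I would appreciate your help.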
-
I set up a single-MongoDB environment on Ubuntu 12.04 following the instructions, but running it fails with the following error:
/woaidu_crawler/woaidu_crawler/spiders/woaidu_detail_spider.py:12: ScrapyDeprecationWarning: woaidu_crawler.spiders.woaidu_detail_spider.WoaiduSpider inh…
-
Currently the offsite middleware reads the allowed domains from the spider attribute when the spider is opened, and uses them to decide whether a request should be followed.
https://github.com/scrapy/scrapy/blob/1…
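To make the behaviour described here concrete, the following is a minimal standard-library sketch, not Scrapy's actual middleware. The class `OffsiteFilter` and its method names are hypothetical; the constructor stands in for the "spider opened" moment, and the regex form is an assumption about how such a check could be written.

```python
import re
from urllib.parse import urlparse


class OffsiteFilter:
    """Hypothetical stand-in for an offsite check; not Scrapy's own class."""

    def __init__(self, allowed_domains):
        # The domain list is read once here (standing in for "spider opened")
        # and baked into a compiled pattern.
        self.host_regex = self._build_host_regex(allowed_domains)

    @staticmethod
    def _build_host_regex(allowed_domains):
        if not allowed_domains:
            # No domains configured: the empty pattern matches every host.
            return re.compile("")
        alternatives = "|".join(re.escape(d) for d in allowed_domains)
        # Match each domain itself or any of its subdomains.
        return re.compile(rf"^(.*\.)?({alternatives})$")

    def should_follow(self, url):
        host = urlparse(url).hostname or ""
        return bool(self.host_regex.search(host))


domains = ["example.com"]
offsite = OffsiteFilter(domains)
print(offsite.should_follow("https://shop.example.com/item/1"))  # True
print(offsite.should_follow("https://other.org/"))               # False

# Changing the list afterwards has no effect on the compiled pattern.
domains.append("other.org")
print(offsite.should_follow("https://other.org/"))               # False
```

Because the pattern is compiled once, later changes to the domain list are not picked up in this sketch unless the filter is rebuilt.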
-
I want to set the `JOBDIR` setting from the spider's `__init__`, or dynamically when I call that spider.
I want a different `JOBDIR` for each spider, like `FEED_URI` in the example below.
…
lxmn updated
8 months ago
-
### Description
`scrapy.shell.inspect_response` does not work with the `asyncio` reactor when using the `ipython` shell
### Steps to Reproduce
1. Create a spider with the following contents:
…
-
Scrapy cookiejar API is limited:
- The meta key is called `cookiejar`, but you can't put a CookieJar object there; it actually means `cookiejar_id` or `session_id`, not `cookiejar`, which is confusing. It sho…
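
The point about the meta key acting as an identifier rather than holding a jar can be sketched with the standard library alone. This is an illustrative assumption about the lookup, not Scrapy's code: `JarRegistry` and `jar_for` are hypothetical names, and `http.cookiejar.CookieJar` stands in for whatever jar type the framework uses.

```python
from collections import defaultdict
from http.cookiejar import CookieJar


class JarRegistry:
    """Hypothetical registry: maps a jar identifier to a cookie jar."""

    def __init__(self):
        # A new, empty jar is created the first time an identifier is seen.
        self.jars = defaultdict(CookieJar)

    def jar_for(self, meta):
        # The value under "cookiejar" is only used as a dictionary key,
        # so it behaves like a session id, not like a jar object.
        return self.jars[meta.get("cookiejar")]


registry = JarRegistry()
jar_a = registry.jar_for({"cookiejar": "user-a"})
jar_b = registry.jar_for({"cookiejar": "user-b"})

print(jar_a is registry.jar_for({"cookiejar": "user-a"}))  # True: same id, same jar
print(jar_a is jar_b)                                      # False: separate sessions
```

Requests that carry no `cookiejar` key all map to the key `None` in this sketch, so they share a single default jar.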
kmike updated
4 months ago