-
I'm setting the headers the following way:
```python
headers = {
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'cache-control': 'no-cache',
...
}…
```
-
Hey, is changing just these few places enough to make it work?
'Accept':
'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-US;q=0.7',
'cookie': 'your cookie'
…
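Roughly, yes — those keys just need to be merged over the base dict before the request is made. A minimal sketch (the cookie string stays a placeholder you must fill in yourself; header names are kept lowercase so the keys actually collide and override):

```python
# Base headers from the original script (truncated to the keys shown).
base_headers = {
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'cache-control': 'no-cache',
}

# The suggested edits; 'your cookie' is a placeholder, not a real value.
overrides = {
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'accept-language': 'zh-CN,zh;q=0.9,en;q=0.8,en-US;q=0.7',
    'cookie': 'your cookie',
}

# Later keys win, so the override replaces the base 'accept'.
headers = {**base_headers, **overrides}
```

Header names are case-insensitive on the wire, but dict keys are not — mixing `'Accept'` and `'accept'` would keep both entries instead of overriding.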
-
✗ scrapy run
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 8, in
    sys.exit(execute())
  File "/Users/noname/Library/Python/3.8/lib/python…
-
Hi!
Is it possible to make deltafetch stop the Scrapy crawl when it encounters an already-visited link?
I really need this!
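As far as I know, scrapy-deltafetch only *skips* requests it has already seen; stopping the whole crawl at the first revisit would take a custom check. A minimal sketch of that idea in plain Python (no Scrapy APIs; `AlreadyVisited` stands in for something like Scrapy's `CloseSpider` exception):

```python
class AlreadyVisited(Exception):
    """Raised the first time a previously crawled link shows up again."""

def crawl(links, seen):
    """Yield unseen links, but abort the whole crawl on the first revisit."""
    for link in links:
        if link in seen:
            # deltafetch would merely skip this link; we stop everything instead
            raise AlreadyVisited(link)
        seen.add(link)
        yield link

seen = {'https://example.com/old'}   # fingerprints from a previous run
crawled, stopped = [], False
try:
    for link in crawl(['https://example.com/new', 'https://example.com/old'], seen):
        crawled.append(link)
except AlreadyVisited:
    stopped = True
```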
-
Hi there, I tried to run your Scrapy script, but there were no results.
I have also created a SQL database with a table named goods_info, but I'm still having issues. Can you help me out?
Connect to db successful…
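Hard to say without the script itself, but "connected yet no results" is often the table having been created in a different database than the one the script connects to. A quick sanity check, sketched here with sqlite3 as a stand-in (the real project may well use a different database; only the table name `goods_info` comes from the comment above):

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # stand-in for the real goods database
conn.execute('CREATE TABLE goods_info (name TEXT, price REAL)')

def table_exists(conn, name):
    """Confirm the table is visible to *this* connection before inserting."""
    row = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
        (name,),
    ).fetchone()
    return row is not None
```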
-
### Description
A crawl with a feed file format `'feed-%(batch_id)s.jl'` writes feed files `feed-1.jl`, `feed-2.jl`, and so on, but overwrites those same files when restarting using the JOBDIR parame…
-
Something like Scrapy's signal mechanism?
In actual operation my project has two problems: both the proxy and the token are limited. In the middleware I need to check first whether both are valid; only when both are valid do I modify the url/headers/proxies parameters and send the request normally.
If either one is invalid, the spider has to be stopped to wait for the scheduled task's next start.
This project also uses batch_spider, and the current workaround is somewhat convoluted:
when an invalid proxy or token is detected, take all the non-1 … in the task table…
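The gating described above can be expressed independently of the middleware machinery. In this sketch `StopCrawl` stands in for whatever mechanism shuts the spider down (e.g. Scrapy's `CloseSpider`), the request is just a dict, and all names are illustrative:

```python
class StopCrawl(Exception):
    """Shut down and wait for the scheduled task to start the next run."""

def gate_request(request, proxy_ok, token_ok, proxy, token):
    """Let a request through only when both the proxy and the token are valid."""
    if not proxy_ok or not token_ok:
        # either one invalid: stop the crawl instead of sending doomed requests
        raise StopCrawl('proxy' if not proxy_ok else 'token')
    # both valid: patch the request before sending
    patched = dict(request)
    patched['proxy'] = proxy
    patched['headers'] = {**request.get('headers', {}), 'Authorization': token}
    return patched
```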
-
### Description
The change to behavior of `Spider.allowed_domains` in 2.11.2 broke several of our crawls because it does not play well with downloader middlewares that replace the original request …
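For context, the `allowed_domains` filter is essentially a suffix match on the request's host, so a middleware that swaps the original request for one on another host can suddenly fail it. A minimal re-implementation of that rule (not Scrapy's actual code) makes the behavior easy to test:

```python
from urllib.parse import urlsplit

def url_allowed(url, allowed_domains):
    """True if the URL's host equals one of the domains or is a subdomain of one."""
    host = (urlsplit(url).hostname or '').lower()
    return any(host == d or host.endswith('.' + d) for d in allowed_domains)
```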
-
```
SQLite version 3.27.2 2019-02-25 16:06:06
Enter ".help" for usage hints.
sqlite> select parse_state, count(*) from RECIPES_LIST group by parse_state;
0|11448
1|4712
2|1885
3|655
sqlite>
```
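The same tally is easy to reproduce from Python if you want to act on it (say, retry rows stuck in a given state). Table and column names come from the session above; the rows here are illustrative, not the real 18,700-row table:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE RECIPES_LIST (url TEXT, parse_state INTEGER)')
# A few illustrative rows; the real table holds 18,700 of them.
conn.executemany(
    'INSERT INTO RECIPES_LIST (url, parse_state) VALUES (?, ?)',
    [('u', s) for s in (0, 0, 0, 1, 1, 2)],
)
# Same GROUP BY as in the sqlite shell, collected into a dict.
counts = dict(conn.execute(
    'SELECT parse_state, COUNT(*) FROM RECIPES_LIST GROUP BY parse_state'
))
```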
-
### Description
When you hit `SCRAPER_SLOT_MAX_ACTIVE_SIZE`, requests silently stop being processed, with no warning.
If you are deferring items in a pipeline that depend on other requests fin…
djay updated
3 months ago
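A toy model of the slot accounting makes the stall easy to see (this is a simplification, not Scrapy's actual scraper code; the default for `SCRAPER_SLOT_MAX_ACTIVE_SIZE` is 5,000,000 bytes):

```python
class SlotSketch:
    """Simplified model of the scraper slot's backpressure accounting."""

    def __init__(self, max_active_size=5_000_000):
        self.max_active_size = max_active_size
        self.active_size = 0  # total body bytes of responses still being processed

    def add_response(self, body_size):
        self.active_size += body_size

    def finish_response(self, body_size):
        self.active_size -= body_size

    def needs_backout(self):
        # While this is True, the engine quietly stops feeding new requests.
        return self.active_size >= self.max_active_size
```

If a pipeline defers its items on requests that can only run once the slot drains, `finish_response` never happens for them, `needs_backout()` stays true, and the crawl deadlocks without a single log message.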