-
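I use GitHub to crawl blog links and MongoDB to store the data; the problem occurs during the crawl stage.
`https://blog.akimio.top/links/` uses [solitude](https://github.com/everfu/hexo-theme-solitude), a modified `butterfly` theme, and it used to be crawled without problems. **At first I suspected the theme was the cause, so I tried a friend-links page that uses the original butterfly theme, but I still got…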
-
I get the following error while the consumer spider is running:
```
RedisMixin.spider_idle of >
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/scrapy/utils/signal…
-
### Description
I'm not sure whether this should be considered a bug, but the behaviour should at least be the same whether a spider callback only raises an exception or raises an exception after yielding.
When r…
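To make the two cases concrete, here is a minimal sketch in plain Python (no Scrapy involved). The names `callback_raise_only`, `callback_yield_then_raise`, and `consume` are hypothetical and only illustrate where the exception surfaces in each case; this is not how Scrapy itself handles callbacks.

```python
def callback_raise_only(response):
    # Plain function: the exception is raised as soon as it is called.
    raise ValueError("failed before yielding")


def callback_yield_then_raise(response):
    # Generator: nothing runs until iteration starts, and the exception
    # is raised only after the first item has been produced.
    yield {"item": 1}
    raise ValueError("failed after yielding")


def consume(callback, response=None):
    """Hypothetical caller: collect yielded items and any ValueError."""
    items, error = [], None
    try:
        result = callback(response)
        for item in result or ():
            items.append(item)
    except ValueError as exc:
        error = exc
    return items, error


items_a, error_a = consume(callback_raise_only)
items_b, error_b = consume(callback_yield_then_raise)
```

In the first case the error appears at call time with no items collected; in the second it appears during iteration, after one item has already been handed over. A caller that treats these two points differently would plausibly show different behaviour.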
-
```
Python 3.9.13
Daphne 4.0.0
Django 4.1.2
Channels 4.0.0
Scrapy 2.7.0
scrapy-playwright 0.0.22
```
My settings:
```python
DOWNLOAD_HANDLERS = {
"http": "scrapy_playwright.handler.Sc…
-
I'm getting this error:
ValueError: All proxies are unusable, cannot proceed
2017-05-13 14:09:02 [scrapy.utils.log] INFO: Scrapy 1.3.3 started (bot: scrapy_bets)
2017-05-13 14:09:02 [scrapy.uti…
-
Hi,
I am trying to run Frontera with Kafka as the message bus.
Ubuntu 16.04 LTS
Python 2.7.11
Frontera 0.5.1.1
I placed `BACKEND = 'frontera.contrib.backends.remote.kafka.KafkaOverusedBackend'` in my sp…
-
--- <exception caught here> ---
File "/usr/local/lib/python3.6/dist-packages/twisted/internet/base.py", line 878, in runUntilCurrent
call.func(*call.args, **call.kw)
File "/usr/local/lib/python3.6/dist-packages/s…
-
Installed Scrapy with Anaconda.
Created a project for Goodreads.
Ran `cd project`.
Ran the `scrapy crawl giveaway` command.
```
File "//anaconda2/lib/python2.7/site-packages/scrapy/spiderloader.py", line 71…
-
D:\weibo\text\weibo-search-master>scrapy crawl search -s JOBDIR=crawls/search
2022-12-16 22:42:13 [scrapy.core.scraper] ERROR: Spider error processing (referer: None)
Traceback (most recent call la…
-
Add crawl spiders for the following popular websites.
- Youtube
- Quora
- Facebook
- Reddit
- GitHub
The currently implemented spiders can be found at https://github.com/leopardslab/CrawlerX/…