-
**High Memory Usage with ScrapydWeb**
I've observed an issue where ScrapydWeb causes excessive memory usage when running alongside Scrapyd. On an EC2 instance with 8GB of RAM and 2 vCPUs, Scrapyd a…
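To quantify which process is actually responsible, a quick per-process RSS check can help. A minimal sketch using the third-party psutil package; the substring matching is an assumption about how Scrapyd and ScrapydWeb appear in the process table:

```python
import psutil

# Sum resident memory (RSS) for processes whose command line mentions
# scrapydweb or scrapyd; adjust the patterns to match your deployment.
totals = {'scrapyd': 0, 'scrapydweb': 0}
for proc in psutil.process_iter(['cmdline', 'memory_info']):
    cmd = ' '.join(proc.info['cmdline'] or [])
    if 'scrapydweb' in cmd:  # check the longer name first
        totals['scrapydweb'] += proc.info['memory_info'].rss
    elif 'scrapyd' in cmd:
        totals['scrapyd'] += proc.info['memory_info'].rss

for name, rss in totals.items():
    print(f'{name}: {rss / 1024 ** 2:.1f} MiB')
```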
-
## 500 Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the…
-
OS: CentOS 7
Steps to reproduce:
1. Running auto_manage_spiders.py reports a successful upload, but nothing shows up in the system.
(base) [root@localhost scrapyd_web_manager]# python auto_manage_spiders.py -dp
deploy True
POST Fetch: http://192.168.1.94:5000/1/d…
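Since the script reports a successful POST, one way to narrow this down is to deploy directly against Scrapyd's own JSON API and then list projects, bypassing the management script entirely. A minimal sketch using the standard addversion.json and listprojects.json endpoints; the host, project name, version, and egg path are placeholders:

```python
import requests

SCRAPYD = 'http://192.168.1.94:6800'  # placeholder: the Scrapyd endpoint, not the web UI port

# Upload a packaged egg via Scrapyd's addversion.json endpoint.
with open('myproject.egg', 'rb') as egg:  # placeholder egg built with scrapyd-client
    r = requests.post(
        f'{SCRAPYD}/addversion.json',
        data={'project': 'myproject', 'version': 'r1'},
        files={'egg': egg},
    )
print(r.json())  # expect {"status": "ok", "spiders": N, ...}

# Verify the project is actually registered afterwards.
print(requests.get(f'{SCRAPYD}/listprojects.json').json())
```

If the upload succeeds here but the project still does not appear, the problem is on the Scrapyd side rather than in the upload script.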
-
**Describe the bug**
Due to this [change](https://github.com/scrapy/scrapyd/commit/3c7a0fc00a3bc62fb32836e76b446454947123fe) in v1.5.0, the regex here (https://github.com/my8100/scrapydweb/blob/8de7ede…
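Until the regex is updated, one workaround sketch is to read status from Scrapyd's JSON API instead of scraping its HTML, since the JSON endpoints tend to be stable across Scrapyd releases. The host below is a placeholder:

```python
import requests

# daemonstatus.json is part of Scrapyd's documented JSON API and does not
# depend on the HTML layout that changed in v1.5.0.
status = requests.get('http://localhost:6800/daemonstatus.json').json()
print(status)  # e.g. {"status": "ok", "pending": 0, "running": 0, "finished": 0, ...}
```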
-
Description:
When running Scrapyd with Python 3.11.9, the format of the HTML returned on the Jobs page appears to be standardized in a way that prevents successful parsing of job data. This issue doe…
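For job data specifically, Scrapyd's listjobs.json endpoint returns the same information as the Jobs page in machine-readable form, which sidesteps HTML parsing altogether. A minimal sketch; the host and project name are placeholders:

```python
import requests

# listjobs.json returns pending/running/finished job lists as JSON,
# independent of how the HTML Jobs page is rendered.
jobs = requests.get(
    'http://localhost:6800/listjobs.json',
    params={'project': 'myproject'},  # placeholder project name
).json()
for state in ('pending', 'running', 'finished'):
    print(state, len(jobs.get(state, [])))
```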
-
As I first reported at https://github.com/mlcommons/croissant/issues/530#issuecomment-2096806017, I tried to follow the README under /health, but scrapydweb failed to launch. [err.txt](https://github.com…
-
I added the Heroku Postgres add-on, which creates the DATABASE_URL environment variable.
Once I restarted the scrapydweb server, I got this error:
```
2020-05-23T07:13:33.128467+00:00 heroku[web.1]: Starting process …
```
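The log above is truncated, so the root cause is not visible, but a common failure with the Heroku Postgres add-on is that it sets DATABASE_URL with a postgres:// scheme, which newer SQLAlchemy releases reject. If that turns out to be the error here, a sketch of normalizing the URL before scrapydweb reads it:

```python
import os

# Heroku provides postgres://..., but SQLAlchemy 1.4+ only accepts
# postgresql://...; rewrite the scheme once at startup. This is an
# assumption about the truncated error above, not a confirmed fix.
url = os.environ.get('DATABASE_URL', '')
if url.startswith('postgres://'):
    os.environ['DATABASE_URL'] = url.replace('postgres://', 'postgresql://', 1)
```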
-
linux: HTTPConnectionPool(host='192.168.0.24', port=6801): Max retries exceeded with url: /listprojects.json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object>: Failed to establish a new connection: [Errno 111] Connectio…
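Errno 111 means nothing is accepting connections on that host and port, so it is worth probing the endpoint directly before suspecting scrapydweb itself. A minimal sketch; if the probe also fails, check that scrapyd is running and that bind_address in scrapyd.conf allows remote connections:

```python
import requests

try:
    r = requests.get('http://192.168.0.24:6801/listprojects.json', timeout=5)
    print(r.status_code, r.json())
except requests.exceptions.ConnectionError as exc:
    # Typical causes: scrapyd not running, wrong port, a firewall, or
    # scrapyd.conf binding to 127.0.0.1 instead of 0.0.0.0.
    print('scrapyd unreachable:', exc)
```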
-
Crawler setup:
1. I have 28 scheduled spiders, some configured to fire every 10 seconds. By now, 272,071 historical crawl tasks have accumulated.
2. Some of the tasks are redis_scrapy distributed tasks; they have been running for two months and have scraped roughly 10 million items.
Problem encountered:
1. Clicking "timer tasks" in the left-hand menu opens very slowly and crashes with high probability; the only remedy is to restart the scrapydweb service.
Suspected cause: probably too many accumulated historical tasks…
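If the slowdown really is the sheer volume of history, pruning old task records from scrapydweb's database may help until the page handles large histories better. A rough sketch, assuming the default SQLite backend; the database path and table/column names below are hypothetical and must be checked against the actual schema first (and back up the file before deleting anything):

```python
import sqlite3

# Hypothetical path and schema; inspect scrapydweb's data directory and
# the real table layout before running anything like this.
conn = sqlite3.connect('timer_tasks.db')
cur = conn.execute(
    "DELETE FROM task_result WHERE create_time < datetime('now', '-30 days')"
)
conn.commit()
print(f'pruned {cur.rowcount} rows')
conn.close()
```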
-
**Describe the bug**
Running scrapyd instances across multiple servers at the moment requires either a complicated tunneling setup (like Wireguard) or directly exposing the scrapyd HTTP interface to …
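Short of a full VPN, a plain SSH port forward per server already avoids exposing the scrapyd HTTP interface publicly. A minimal sketch using the third-party sshtunnel package; host names, credentials, and ports are placeholders:

```python
from sshtunnel import SSHTunnelForwarder

# Forward a local port to scrapyd's port 6800 on the remote box over SSH,
# so scrapydweb can be pointed at 127.0.0.1:16800 instead of a public IP.
tunnel = SSHTunnelForwarder(
    ('scrapyd-host.example.com', 22),   # placeholder SSH endpoint
    ssh_username='deploy',              # placeholder user
    ssh_pkey='/home/me/.ssh/id_rsa',    # placeholder key
    remote_bind_address=('127.0.0.1', 6800),
    local_bind_address=('127.0.0.1', 16800),
)
tunnel.start()
# ... add '127.0.0.1:16800' to SCRAPYD_SERVERS in scrapydweb's config ...
tunnel.stop()
```

This still requires one tunnel per server, which is the complication the report is about; built-in authentication or TLS on the scrapyd side would remove that need.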