-
Hi,
This is a question, not a bug report.
[url-frontier](https://github.com/crawler-commons/url-frontier) is an API to define a [crawl frontier](https://en.wikipedia.org/wiki/Crawl_frontier). …
-
Are you maybe planning to add support for https://github.com/apify/crawlee? It has Puppeteer, got-scraping, and much more, and is written in JS.
-
**Describe the bug**
When many tasks are started at the same time, all spiders fail to start with the error 62fddbdd4f6290e5182ab109/oreo/spiders/henan/__init__.py": dial tcp 127.0.0.1:8000: socket: too many open files.
**To Reproduce**
Steps to reproduce the behavior:
1. Start many spiders at the same time, e.g. start 30 sc…
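The "too many open files" error suggests the process spawning the tasks is hitting the OS file-descriptor limit, one socket per spider. A minimal sketch for checking and raising that limit from Python, assuming the soft limit is the bottleneck rather than a Crawlab bug:

```python
import resource

# Inspect the current per-process file-descriptor limits.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limit: soft={soft}, hard={hard}")

# Raise the soft limit up to the hard limit so that many concurrent
# sockets (one per spider task) can stay open at once. This must run
# in the process that spawns the tasks.
if soft < hard:
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```

When Crawlab runs inside Docker, the limit can instead be raised at the container level with `docker run --ulimit nofile=65535:65535`.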
-
**Describe the bug**
I have several worker nodes and a schedule that runs tasks on these worker nodes.
For some reason, some worker nodes were removed and new worker nodes were created.
But when the schedule runs, th…
-
I'd like to ask the author: what tech stack and background knowledge are needed to develop a crawler management platform like crawlab? From a student.
-
**Describe the bug**
When a task log exceeds 100 pages (or more), the view automatically jumps back to the first page, so the latest log output cannot be followed in real time.
**To Reproduce**
Steps to reproduce the behavior:
1. Run a spider that prints enough log output to exceed 100 pages.
2. The view briefly jumps to the latest page (but shows the content of the first page), then is forced back to the first page.
3. This repeats over and over.
**Expected behavior**
The log should display normally.
-
**Describe the bug**
The Crawlab master and worker keep restarting after the upgrade.
**To Reproduce**
Steps to reproduce the behavior:
1. Starting from a 0.5.1 Crawlab setup, take the Docker containers down:
`…
-
Crawlab can't download media files and large files (e.g. jpg, mp3, gif, and zip files).
A viable approach is to add a new type of node, say, MediaWorker. The links of th…
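To illustrate the kind of work such a MediaWorker might do, here is a minimal sketch that streams a large file to disk in chunks instead of buffering it in memory; `MEDIA_DIR`, the function name, and all parameters are hypothetical, not part of Crawlab:

```python
import os
import requests

MEDIA_DIR = "/data/media"  # hypothetical destination directory

def download_media(url: str, filename: str, chunk_size: int = 1 << 20) -> str:
    """Stream a large file to disk in 1 MiB chunks to keep memory flat."""
    path = os.path.join(MEDIA_DIR, filename)
    with requests.get(url, stream=True, timeout=60) as resp:
        resp.raise_for_status()
        with open(path, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=chunk_size):
                fh.write(chunk)
    return path
```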
-
**Describe the bug**
For example, when a spider task finishes, its database connections are not recycled and stay in the Sleep state. Writing results through save_item from the Python crawlab SDK, a single spider task can hold over a thousand connections, and even batch writes through save_items hold several hundred. With multiple spider tasks running, Crawlab quickly exhausts the database connections, which disrupts normal business ❌❌❌.
This seriously impacts production; please fix it as soon as possible. Both mongo and mysql are…
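Until this is fixed upstream, one common workaround pattern is to share a single pooled client per task and close it explicitly when the task exits, rather than letting each save open its own connection. A minimal sketch using pymongo directly; the database and collection names and the pool size are assumptions, and this bypasses the crawlab SDK entirely:

```python
import atexit
from pymongo import MongoClient

# One client per task process; pymongo pools connections internally,
# so capping the pool bounds how many server-side connections the task holds.
client = MongoClient("mongodb://localhost:27017", maxPoolSize=10)
collection = client["crawlab_results"]["items"]  # hypothetical names

def save_items_batched(items, batch_size=500):
    """Insert results in batches over the shared pooled client."""
    for i in range(0, len(items), batch_size):
        collection.insert_many(items[i:i + batch_size])

# Release the pooled connections when the task process exits, so they
# do not linger server-side in the Sleep state.
atexit.register(client.close)
```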