custom-crawler Search Results

1000+ results for custom-crawler

Best match

scrapy/scrapy #5510

Support per-request download handler override

It would be great if a plugin like https://github.com/scrapy-plugins/scrapy-playwright did not have to force you to drive all requests through its download handlers, and instead you could drive certain…

Gallaecio updated 3 months ago
9
Letractively/harvestman-crawler #10

RSS Integration

Provide RSS integration feature to the crawler. RSS Integration will allow for, 1. As a trigger to start/restart website crawling/indexing based on RSS feed updates. 2. To implement an RSS…

GoogleCodeExporter updated 8 years ago
4
pythonhacker/harvestman-crawler #10

RSS Integration

Provide RSS integration feature to the crawler. RSS Integration will allow for, 1. As a trigger to start/restart website crawling/indexing based on RSS feed updates. 2. To implement an RSS…

GoogleCodeExporter updated 9 years ago
4
fffunction/backstop-crawl #29

Add option to pass configuration to simplecrawler directly

There are a lot of options in simplecrawler. Might be useful to allow passing options directly to the crawler for really custom setups. Like limiting crawl depth or number of pages to crawl, etc.

minorOffense updated 4 years ago
1
datalad-datasets/ratholeradio-archive #2

Crawler pipeline broken

The pipeline doesn't work anymore: ```sh /tmp/ratholeradio-archive (git)-[master] % datalad crawl [INFO ] Loading pipeline definition from ./.datalad/crawl/pipelines/pipeline.py [ERROR ] Fai…

mih updated 6 years ago
1
skeletonlabs/skeleton #2362

Cookbook: Table of Contents

> [!WARNING] > This issue is a work in progress. This will act as a hub to centralize this information. ## Maintainer Requests The following requests are coming straight from the Skeleton t…

endigo9740 updated 1 month ago
3
openzim/zim-requests #769

New request: BBC Persian

This is a special request for the Zimit 2.0 project. Devs will handle this first to test the new scraper, and only once it's working will it be transferred to the content team. - Website URL: https://www.bb…

benoit74 updated 1 week ago
20
jungjonghun/crawler4j #261

Crawler4j missing more control over retry count

What steps will reproduce the problem? 1. Run the Basic Crawler with RobotServer enabled 2. Have "addeasy.netfirms.com" as the seed What is the expected output? What do you see instead? Expectati…

GoogleCodeExporter updated 9 years ago
1
AllenTyf/memcached #414

memcached 1.4.24 segfaults

What steps will reproduce the problem? 1. SLES 11.3 with slightly patched 3.16 kernel Linux memcached9 3.16.3-4.1.100-default #1 SMP Thu Sep 18 06:32:16 UTC 2014 (d2bbe7f) x86_64 x86_64 x86_64 GN…

GoogleCodeExporter updated 8 years ago
2
espandy/memcached #414

memcached 1.4.24 segfaults

What steps will reproduce the problem? 1. SLES 11.3 with slightly patched 3.16 kernel Linux memcached9 3.16.3-4.1.100-default #1 SMP Thu Sep 18 06:32:16 UTC 2014 (d2bbe7f) x86_64 x86_64 x86_64 GN…

GoogleCodeExporter updated 9 years ago
2
