-
vaulstein: Can you add a mechanism to FLUSH the queue when the crawl is completed so that we can pause/resume even in distributed crawling?
Request from https://github.com/rolando/scrapy-redis/issues…
-
We need a set of REST services to be able to pass crawl requests into the Kafka API to be processed by the Kafka Monitor. Ideally this uses something small like Flask and will run on a server that has…
-
Here's what we have in the original:
```
dupefilter_cls = load_object(settings['DUPEFILTER_CLASS'])
dupefilter = dupefilter_cls.from_settings(settings)
```
In the Redis version the class name is hard…
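For readers without Scrapy at hand, the settings-driven lookup in the snippet above can be sketched with the standard library alone. This is a simplified stand-in, not Scrapy's actual `load_object`; `build_dupefilter` and `InMemoryDupeFilter` are hypothetical names made up for illustration.

```python
from importlib import import_module


def load_object(path):
    """Import and return the object named by a dotted path like 'pkg.module.Name'.

    Simplified re-implementation for illustration; not Scrapy's own code.
    """
    module_path, _, name = path.rpartition(".")
    if not module_path:
        raise ValueError(f"not a full dotted path: {path!r}")
    module = import_module(module_path)
    return getattr(module, name)


class InMemoryDupeFilter:
    """Hypothetical dupefilter that keeps seen fingerprints in a local set."""

    def __init__(self, debug=False):
        self.debug = debug
        self.seen = set()

    @classmethod
    def from_settings(cls, settings):
        # Build the filter from a plain dict of settings.
        return cls(debug=settings.get("DUPEFILTER_DEBUG", False))

    def request_seen(self, fingerprint):
        # Return True if the fingerprint was already recorded, else record it.
        if fingerprint in self.seen:
            return True
        self.seen.add(fingerprint)
        return False


def build_dupefilter(settings):
    """Hypothetical helper: resolve the class from settings instead of hard-coding it."""
    dupefilter_cls = load_object(settings["DUPEFILTER_CLASS"])
    return dupefilter_cls.from_settings(settings)
```

With this shape, swapping the dupefilter only means changing the dotted path stored under `DUPEFILTER_CLASS`; the code that builds it stays the same.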
-
I installed scrapy-redis using pip and cloned this project to run the examples, but got an error.
Here are the messages:
```
2016-05-18 16:29:03 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODU…
-
Well, I started nginx-1.11.5.
When I connect to the server with Chrome and Firefox,
I can see the following in access.log:
"GET / HTTP/2.0" 200 728 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebK…
-
We're rolling out a rewrite of the `Microsoft.OSTCExtensions.CustomScriptForLinux` extension. This is pertinent to an internal issue; we are opening an issue to track progress of the rollout in this repository.
# Summa…
-
Scrapy Cluster should have three or four distinct pip packages that allow a user to run `pip install scrapy-cluster` to get all available packages set up, or to allow individual component management l…
-
Hello,
I was trying to implement scrapy-redis. The problem was discussed over there, but after some debugging I see that the problem could be in Scrapy.
The redis_scheduler is simple: https://git…
-
When I try to run `sudo pip install -r requirements.txt`, the error "Running setup.py install for cryptography ... error" occurs every time.
I run it on Ubuntu on AWS EC2, and I have no idea h…
-
Hi - I've followed Learning Scrapy's instructions in the appendix for Ubuntu 14.04.4 LTS, without success.
- `docker` is installed properly (confirmed with `docker run hello-world` and `docker ps`)
-…