-
I would like to ask a question: is multi-host scraping something you would like to see implemented? I heard about scrapy-redis, but it's not pure Python, so it can't be included in standard Scrapy.
-
I have been following this topic for a while now but it seems we don't have a clear understanding of the envisioned development timeline related to managing and monitoring the execution of one or more…
-
There seems to be an issue with building the OpenType sanitizer as part of the make setup script:
```
vern@vern-ThinkCentre-M90:~/src/fontbakery$ VENVRUN=virtualenv make setup;
. venv/bin/activate; pip ins…
-
Fresh, clean Ubuntu 14.04 install.
`VENVRUN=virtualenv make setup;` is failing with:
```
Command python setup.py egg_info failed with error code 1 in /home/vern/src/fontbakery/venv/build/cryptograph…
-
The scheduler doesn't seem to respect `allowed_domains` or settings like `DOWNLOAD_DELAY`. Everything works fine when you disable the scrapy-redis scheduler in the settings. Tried looking at the code but…
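
For reference, a minimal sketch of the kind of settings involved. The delay value is an illustrative assumption, not taken from the report; `SCHEDULER` is the setting that switches Scrapy to the scrapy-redis scheduler.

```python
# settings.py (sketch; values are illustrative assumptions)

# Comment this line out to fall back to Scrapy's default scheduler,
# which is the configuration where things reportedly work.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"

# Seconds to wait between requests to the same site (assumed value).
DOWNLOAD_DELAY = 2.0
```

`allowed_domains` is not a setting; it is an attribute on the spider class, so it would be declared there rather than in this file.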
-
I have a requirement: users submit URLs to a Redis queue, and Scrapy gets the URLs from Redis and fetches them. But when the Redis queue becomes empty, Scrapy stops working. I would like Scrapy to block, not shut down. I tr…
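
The behaviour being asked for can be sketched without Scrapy or Redis. In this minimal sketch, a standard-library `queue.Queue` stands in for the Redis list, and `consume`, `fetched` and the example URLs are hypothetical names used only for illustration. The worker waits when the queue is empty instead of exiting.

```python
import queue
import threading


def consume(url_queue, handle, stop_event, poll_seconds=0.1):
    """Process URLs until stop_event is set, waiting while the queue is empty."""
    while not stop_event.is_set():
        try:
            # Block for up to poll_seconds instead of exiting on an empty queue.
            url = url_queue.get(timeout=poll_seconds)
        except queue.Empty:
            continue  # nothing submitted yet; keep waiting
        handle(url)
        url_queue.task_done()


url_queue = queue.Queue()  # stand-in for the Redis list (assumption)
fetched = []
stop_event = threading.Event()
worker = threading.Thread(
    target=consume, args=(url_queue, fetched.append, stop_event)
)
worker.start()

url_queue.put("http://example.com/a")
url_queue.join()                       # queue is empty again here
still_waiting = worker.is_alive()      # the worker has not shut down

url_queue.put("http://example.com/b")  # a later submission is still picked up
url_queue.join()

stop_event.set()                       # explicit shutdown
worker.join()
```

With redis-py, a blocking pop such as `blpop` with a timeout would play the role of `get(timeout=...)`. In Scrapy itself, one common approach is to raise `DontCloseSpider` from a `spider_idle` signal handler so the spider stays open while idle.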
-
I get an error
'''
exceptions.AttributeError: 'Pipeline' object has no attribute 'multi'
'''
If I comment out the code like this:
"
# pipe.multi()
pipe.exec()
"
This error then occurs:
'''
pipe.exec()
…
-
I have a very large crawl project, and breadth-first crawling meant I had to wait a very long time to get my first item (the items are 2 or 3 layers down from the start URL).
A quick change of Queue.py line 33…
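
To illustrate why the queue discipline matters here, a minimal sketch follows. It does not show the actual change to Queue.py; the link graph and function names are hypothetical. Popping the oldest pending request (FIFO) gives breadth-first order, while popping the newest (LIFO) gives depth-first order and reaches a deep item sooner.

```python
from collections import deque

# Hypothetical link graph: items sit two layers below the start page.
LINKS = {
    "start": ["cat1", "cat2", "cat3"],
    "cat1": ["item1", "item2"],
    "cat2": ["item3", "item4"],
    "cat3": ["item5", "item6"],
}


def crawl_order(start, lifo):
    """Return pages in visit order; lifo=True pops the newest request first."""
    pending = deque([start])
    order = []
    while pending:
        page = pending.pop() if lifo else pending.popleft()
        order.append(page)
        pending.extend(LINKS.get(page, []))
    return order


def pages_before_first_item(order):
    """Count how many pages are visited before the first item page."""
    return next(i for i, page in enumerate(order) if page.startswith("item"))


bfs = crawl_order("start", lifo=False)  # FIFO: breadth-first
dfs = crawl_order("start", lifo=True)   # LIFO: depth-first

print(pages_before_first_item(bfs), pages_before_first_item(dfs))
```

In this toy graph the breadth-first crawl visits every category page before any item, while the depth-first crawl reaches an item after the start page and a single category page. Both visit the same set of pages overall.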
-
I tested scrapely with your example, but I don't know how to store templates to a file (or database).
I tried:
>>> from scrapely import Scraper
>>> s = Scraper()
>>> url1 = 'http://pypi.python.or…