jschnurr / scrapyscript

Run a Scrapy spider programmatically from a script or a Celery task - no project required.
MIT License

README example blows up #2

Closed (exarkun closed 7 years ago)

exarkun commented 7 years ago
$ pip freeze
appdirs==1.4.1
attrs==16.3.0
Automat==0.5.0
billiard==3.3.0.23
cffi==1.9.1
constantly==15.1.0
cryptography==1.7.2
cssselect==1.0.1
enum34==1.1.6
idna==2.2
incremental==16.10.1
ipaddress==1.0.18
lxml==3.7.2
packaging==16.8
parsel==1.1.0
pkg-resources==0.0.0
pyasn1==0.1.9
pyasn1-modules==0.0.8
pycparser==2.17
PyDispatcher==2.0.5
pyOpenSSL==16.2.0
pyparsing==2.1.10
queuelib==1.4.2
Scrapy==1.3.0
scrapyscript==0.0.6
service-identity==16.0.0
six==1.10.0
Twisted==17.1.0
w3lib==1.16.0
zope.interface==4.3.3
$ cat testit.py
from scrapyscript import Job, Processor
from scrapy.spiders import Spider

class PythonSpider(Spider):
    name = 'myspider'
    start_urls = ['http://www.python.org']

    def parse(self, response):
        title = response.xpath('//title/text()').extract()
        return {'title': title}

job = Job(PythonSpider())
Processor().run(job)
$ python testit.py
...
2017-02-23 08:02:44 [scrapy.core.scraper] ERROR: Error downloading <GET http://www.python.org>
Traceback (most recent call last):
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/twisted/python/failure.py", line 393, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/scrapy/core/downloader/middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/scrapy/utils/defer.py", line 45, in mustbe_deferred
    result = f(*args, **kw)
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/scrapy/core/downloader/handlers/__init__.py", line 65, in download_request
    return handler.download_request(request, spider)
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/scrapy/core/downloader/handlers/http11.py", line 61, in download_request
    return agent.download_request(request)
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/scrapy/core/downloader/handlers/http11.py", line 286, in download_request
    method, to_bytes(url, encoding='ascii'), headers, bodyproducer)
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/twisted/web/client.py", line 1631, in request
    parsedURI.originForm)
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/twisted/web/client.py", line 1408, in _requestWithEndpoint
    d = self._pool.getConnection(key, endpoint)
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/twisted/web/client.py", line 1294, in getConnection
    return self._newConnection(key, endpoint)
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/twisted/web/client.py", line 1306, in _newConnection
    return endpoint.connect(factory)
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/twisted/internet/endpoints.py", line 788, in connect
    EndpointReceiver, self._hostText, portNumber=self._port
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/twisted/internet/_resolver.py", line 174, in resolveHostName
    onAddress = self._simpleResolver.getHostByName(hostName)
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/scrapy/resolver.py", line 21, in getHostByName
    d = super(CachingThreadedResolver, self).getHostByName(name, timeout)
  File "/home/exarkun/Environments/scrapy/local/lib/python2.7/site-packages/twisted/internet/base.py", line 276, in getHostByName
    timeoutDelay = sum(timeout)
TypeError: 'float' object is not iterable
...
$
exarkun commented 7 years ago

This seems to be related to Twisted 17.1.0: the problem goes away if I downgrade to Twisted 16.6.0.
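For anyone reading the traceback: the last frame calls `sum(timeout)`, and the error message says `timeout` was a float at that point. Here is a minimal sketch of that failure in isolation, based only on those two traceback lines. `total_timeout` and `as_timeout_sequence` are hypothetical names for illustration, not part of Twisted, Scrapy, or scrapyscript, and the wrapping shown is just one possible workaround, not necessarily what any of those projects does.

```python
def total_timeout(timeout):
    # Stand-in for the failing line in the traceback: it expects an
    # iterable of per-attempt timeouts and adds them up.
    return sum(timeout)


def as_timeout_sequence(timeout):
    # Hypothetical helper: wrap a bare number in a tuple so it can be summed.
    if isinstance(timeout, (int, float)):
        return (timeout,)
    return tuple(timeout)


# A sequence of timeouts (example values) sums without trouble.
sequence_total = total_timeout((1, 3, 11, 45))

# A bare float reproduces the TypeError from the report.
try:
    total_timeout(60.0)
except TypeError as exc:
    error_message = str(exc)

# Wrapping the float first avoids the error.
wrapped_total = total_timeout(as_timeout_sequence(60.0))

print(sequence_total, error_message, wrapped_total)
```

This only shows why a float ends up failing inside `sum()`; which caller should be responsible for passing a sequence is a separate question.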

jschnurr commented 7 years ago

Fixed in 0.1.0 (commit c869d9e).