movingheart opened this issue 8 years ago
Has this bug been fixed yet?
@denity, if you're referring to:
2017-05-18 11:25:57 [twisted] CRITICAL: Unhandled Error
Traceback (most recent call last):
File "/home/paul/.virtualenvs/scrapy-jsonrpc.py2/local/lib/python2.7/site-packages/twisted/protocols/basic.py", line 571, in dataReceived
why = self.lineReceived(line)
File "/home/paul/.virtualenvs/scrapy-jsonrpc.py2/local/lib/python2.7/site-packages/twisted/web/http.py", line 1811, in lineReceived
self.allContentReceived()
File "/home/paul/.virtualenvs/scrapy-jsonrpc.py2/local/lib/python2.7/site-packages/twisted/web/http.py", line 1906, in allContentReceived
req.requestReceived(command, path, version)
File "/home/paul/.virtualenvs/scrapy-jsonrpc.py2/local/lib/python2.7/site-packages/twisted/web/http.py", line 771, in requestReceived
self.process()
--- <exception caught here> ---
File "/home/paul/.virtualenvs/scrapy-jsonrpc.py2/local/lib/python2.7/site-packages/twisted/web/server.py", line 190, in process
self.render(resrc)
File "/home/paul/.virtualenvs/scrapy-jsonrpc.py2/local/lib/python2.7/site-packages/twisted/web/server.py", line 241, in render
body = resrc.render(self)
File "/home/paul/.virtualenvs/scrapy-jsonrpc.py2/local/lib/python2.7/site-packages/scrapy_jsonrpc/txweb.py", line 11, in render
return self.render_object(r, txrequest)
File "/home/paul/.virtualenvs/scrapy-jsonrpc.py2/local/lib/python2.7/site-packages/scrapy_jsonrpc/txweb.py", line 14, in render_object
r = self.json_encoder.encode(obj) + "\n"
File "/home/paul/.virtualenvs/scrapy-jsonrpc.py2/local/lib/python2.7/site-packages/scrapy_jsonrpc/serialize.py", line 89, in encode
return super(ScrapyJSONEncoder, self).encode(o)
File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
return _iterencode(o, 0)
File "/home/paul/.virtualenvs/scrapy-jsonrpc.py2/local/lib/python2.7/site-packages/scrapy_jsonrpc/serialize.py", line 109, in default
return super(ScrapyJSONEncoder, self).default(o)
File "/usr/lib/python2.7/json/encoder.py", line 184, in default
raise TypeError(repr(o) + " is not JSON serializable")
exceptions.TypeError: <scrapy.crawler.Crawler object at 0x7f14cac75dd0> is not JSON serializable
when accessing http://localhost:<webserviceport>/crawler,
then I believe it's not a valid bug.
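The TypeError in the traceback is the stock behavior of Python's json encoder: when default() falls through to the base class, it raises for any type it doesn't know how to serialize. A minimal standalone reproduction, with no Scrapy involved (the Crawler class below is just a stand-in for scrapy.crawler.Crawler):

```python
import json


class Crawler(object):
    """Stand-in for scrapy.crawler.Crawler -- any object json can't handle."""
    pass


try:
    json.dumps(Crawler())
except TypeError as e:
    # json.JSONEncoder.default() raises TypeError for unknown types,
    # which is exactly what the traceback above shows.
    print("raised: %s" % e)
```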
With Python 2.7, Scrapy 1.3.3, scrapy-jsonrpc, and a simple spider like this:
# -*- coding: utf-8 -*-
import scrapy


class ExampleSpider(scrapy.Spider):
    name = "example"
    allowed_domains = ["example.com"]

    def start_requests(self):
        for i in range(0, 1000):
            yield scrapy.Request('http://httpbin.org/get?q=%d' % i)

    def parse(self, response):
        pass
I also get that error when accessing the webservice endpoint in my browser.
But this is not the intended way to interact with this RPC extension.
Users should interact with it the way example-client.py does.
Example usage (note: the warnings below should be addressed by https://github.com/scrapy-plugins/scrapy-jsonrpc/pull/11):
$ python example-client.py -H localhost -P 6025 list-running
/home/paul/src/scrapy-jsonrpc/scrapy_jsonrpc/serialize.py:8: ScrapyDeprecationWarning: Module `scrapy.spider` is deprecated, use `scrapy.spiders` instead
from scrapy.spider import Spider
spider:7f9fe4276890:example
Internally, this does an HTTP GET on /crawler/engine/open_spiders:
GET /crawler/engine/open_spiders HTTP/1.1
Accept-Encoding: identity
Host: localhost:6025
Connection: close
User-Agent: Python-urllib/2.7
HTTP/1.1 200 OK
Content-Length: 32
Access-Control-Allow-Headers: X-Requested-With
Server: TwistedWeb/17.1.0
Connection: close
Date: Thu, 18 May 2017 09:34:49 GMT
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, PATCH, PUT, DELETE
Content-Type: application/json
["spider:7f9fe4276890:example"]
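The GET response above is a bare JSON array, so client code can consume it with the standard library alone; a sketch using the body captured above:

```python
import json

# Body returned by GET /crawler/engine/open_spiders, captured above.
body = '["spider:7f9fe4276890:example"]'
open_spiders = json.loads(body)
# One entry per running spider, formatted as "spider:<id>:<name>".
print(open_spiders[0])
```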
In other words, the /crawler resource is not usable directly (at least with a GET from a browser).
That said, the example client has bugs too: stats, for example, are available at /crawler/stats, not /stats.
list-available does a POST on /crawler/spiders:
$ python example-client.py -H localhost -P 6025 list-available
/home/paul/src/scrapy-jsonrpc/scrapy_jsonrpc/serialize.py:8: ScrapyDeprecationWarning: Module `scrapy.spider` is deprecated, use `scrapy.spiders` instead
from scrapy.spider import Spider
/home/paul/src/scrapy-jsonrpc/scrapy_jsonrpc/jsonrpc.py:40: ScrapyDeprecationWarning: Call to deprecated function unicode_to_str. Use scrapy.utils.python.to_bytes instead.
data = unicode_to_str(json.dumps(req))
example
POST /crawler/spiders HTTP/1.1
Accept-Encoding: identity
Content-Length: 59
Host: localhost:6025
Content-Type: application/x-www-form-urlencoded
Connection: close
User-Agent: Python-urllib/2.7
{"params": {}, "jsonrpc": "2.0", "method": "list", "id": 1}
HTTP/1.1 200 OK
Content-Length: 51
Access-Control-Allow-Headers: X-Requested-With
Server: TwistedWeb/17.1.0
Connection: close
Date: Thu, 18 May 2017 09:37:16 GMT
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, PATCH, PUT, DELETE
Content-Type: application/json
{"jsonrpc": "2.0", "result": ["example"], "id": 1}
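For comparison, the round trip above is plain JSON-RPC 2.0 over HTTP POST and can be reproduced with the standard library alone. In this sketch, build_jsonrpc_request is a hypothetical helper (not part of the extension), and response_body is copied from the capture above:

```python
import json


def build_jsonrpc_request(method, params=None, req_id=1):
    # Hypothetical helper: serialize a JSON-RPC 2.0 call shaped like
    # the one example-client.py POSTs to /crawler/spiders.
    return json.dumps({"jsonrpc": "2.0", "method": method,
                       "params": params or {}, "id": req_id})


request_body = build_jsonrpc_request("list")

# Response captured above:
response_body = '{"jsonrpc": "2.0", "result": ["example"], "id": 1}'
spiders = json.loads(response_body)["result"]
print(spiders)
```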
get-global-stats does another POST:
$ python example-client.py -H localhost -P 6025 get-global-stats
/home/paul/src/scrapy-jsonrpc/scrapy_jsonrpc/serialize.py:8: ScrapyDeprecationWarning: Module `scrapy.spider` is deprecated, use `scrapy.spiders` instead
from scrapy.spider import Spider
/home/paul/src/scrapy-jsonrpc/scrapy_jsonrpc/jsonrpc.py:40: ScrapyDeprecationWarning: Call to deprecated function unicode_to_str. Use scrapy.utils.python.to_bytes instead.
data = unicode_to_str(json.dumps(req))
log_count/DEBUG 115
scheduler/dequeued 113
log_count/INFO 12
downloader/response_count 113
downloader/response_status_count/200 113
log_count/WARNING 4
scheduler/enqueued/memory 113
downloader/response_bytes 72569
start_time 2017-05-18 09:32:18
scheduler/dequeued/memory 113
scheduler/enqueued 113
downloader/request_bytes 24743
response_received_count 113
downloader/request_method_count/GET 114
downloader/request_count 114
POST /crawler/stats HTTP/1.1
Accept-Encoding: identity
Content-Length: 64
Host: localhost:6025
Content-Type: application/x-www-form-urlencoded
Connection: close
User-Agent: Python-urllib/2.7
{"params": {}, "jsonrpc": "2.0", "method": "get_stats", "id": 1}
HTTP/1.1 200 OK
Content-Length: 528
Access-Control-Allow-Headers: X-Requested-With
Server: TwistedWeb/17.1.0
Connection: close
Date: Thu, 18 May 2017 09:38:54 GMT
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, PATCH, PUT, DELETE
Content-Type: application/json
{"jsonrpc": "2.0", "result": {"log_count/DEBUG": 115, "scheduler/dequeued": 113, "log_count/INFO": 12, "downloader/response_count": 113, "downloader/response_status_count/200": 113, "log_count/WARNING": 4, "scheduler/enqueued/memory": 113, "downloader/response_bytes": 72569, "start_time": "2017-05-18 09:32:18", "scheduler/dequeued/memory": 113, "scheduler/enqueued": 113, "downloader/request_bytes": 24743, "response_received_count": 113, "downloader/request_method_count/GET": 114, "downloader/request_count": 114}, "id": 1}
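The stats arrive in the same JSON-RPC 2.0 envelope, with the stats dictionary under "result". A sketch of extracting them, using an abridged copy of the response above (only three of the captured keys are kept here for brevity):

```python
import json

# Abridged copy of the /crawler/stats response captured above.
stats_body = (
    '{"jsonrpc": "2.0", "result": {"downloader/request_count": 114, '
    '"downloader/response_count": 113, "log_count/DEBUG": 115}, "id": 1}'
)
stats = json.loads(stats_body)["result"]
for key, value in sorted(stats.items()):
    # Mirrors the "key value" lines example-client.py prints.
    print("%s %s" % (key, value))
```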
Some suggestions: