Closed mayfield closed 8 years ago
We can update Socket.send()
to check for bytes
as well, if so we probably need to update the dependencies so we always pull in cffi-1.8+.
Exceptions for control flow aren't the cleanest code, but if you're able to prove your case with a benchmark we can think about it.
I'll throw a little something together. I played with this more last night and there is nearly 20% perf boost even for 1k messages on python3.5.
Okay. Note that it has to work on Python 2.7 for this to be merged.
Here are some simple benchmarks (not terribly scientific)..
Using this test code:
import time
import nnpy
import sys
PACKETS = 50000
MSG = b'\xff' * 1024
def echo(req, reply):
req.send(MSG)
got = reply.recv()
assert got == MSG, got
reply.send(got)
got = req.recv()
assert got == MSG, got
def setup(addr):
req = nnpy.Socket(nnpy.AF_SP, nnpy.REQ)
req.bind(addr)
reply = nnpy.Socket(nnpy.AF_SP, nnpy.REP)
reply.connect(addr)
return req, reply
def test(req, reply):
for i in range(PACKETS):
echo(req, reply)
for addr in 'tcp://127.0.0.1:6423', 'inproc://foo', 'ipc://bar':
req, reply = setup(addr)
time.sleep(0.500)
start = time.time()
try:
test(req, reply)
except Exception as e:
print("%20s, %25s,,,FAILED,%s" % (sys.argv[1], addr, e))
else:
took = time.time() - start
pps = PACKETS / took
print("%20s, %25s, %f, %d" % (sys.argv[1], addr, took, round(pps)))
And these two patches
instance_check
patch
diff --git a/nnpy/socket.py b/nnpy/socket.py
index 2ce24d3..37bec69 100644
--- a/nnpy/socket.py
+++ b/nnpy/socket.py
@@ -67,7 +67,7 @@ class Socket(object):
return errors.convert(rc, rc)
def send(self, data, flags=0):
- if isinstance(data, memoryview):
+ if isinstance(data, (memoryview, bytes)):
buf = ffi.from_buffer(data)
else:
data = data.encode() if isinstance(data, ustr) else data
try_from_buffer_first
patch
diff --git a/nnpy/socket.py b/nnpy/socket.py
index 2ce24d3..10d4d72 100644
--- a/nnpy/socket.py
+++ b/nnpy/socket.py
@@ -67,9 +67,9 @@ class Socket(object):
return errors.convert(rc, rc)
def send(self, data, flags=0):
- if isinstance(data, memoryview):
+ try:
buf = ffi.from_buffer(data)
- else:
+ except TypeError:
data = data.encode() if isinstance(data, ustr) else data
buf = ffi.new('char[%i]' % len(data), data)
rc = nanomsg.nn_send(self.sock, buf, len(data), flags)
Okay, I'm happy to take your try_from_buffer_first
patch. Can you add a little comment? It should probably mention (a) that it's there for performance reasons, (b) that the path taken depends on the cffi version.
You bet. PR incoming.
This FAQ entry is worth considering https://docs.python.org/3/faq/design.html#how-fast-are-exceptions.
Wouldn't it better to allow Socket.send
to accept bytes/buffers only? IMHO, the method gets more and more complex and is not symmetric with Socket.recv
. For example in pyzmq
, there methods like Socket.send_string/Socket.recv_string
.
Keep it simple and avoid explaining various nuances of Socket.send
in documentation :)
To some extent I feel that 5269d26 made it less complex, actually.
Do you have a concrete problem with the new behavior?
I would just change the method from
def send(self, data, flags=0):
"""
... docs plus explanation when zero-copy is performed and for which types of data the method is efficient...
"""
# Some data types can use a zero-copy buffer creation strategy when
# paired with new versions of CFFI. Namely, CFFI 1.8 supports `bytes`
# types with `from_buffer`, which is about 18% faster. We try the fast
# way first and degrade as needed for the platform.
try:
buf = ffi.from_buffer(data)
except TypeError:
data = data.encode() if isinstance(data, ustr) else data
buf = ffi.new('char[%i]' % len(data), data)
rc = nanomsg.nn_send(self.sock, buf, len(data), flags)
return errors.convert(rc, rc)
to
def send(self, data, flags=0):
"""
Send data to the socket.
The `data` is buffer (bytes, bytearray) object.
:param data: Data to be sent.
.. seealso:: `ffi.from_buffer <http://cffi.readthedocs.io/en/latest/ref.html#ffi-buffer-ffi-from-buffer>`_
.. seealso:: `nn_send <http://nanomsg.org/v1.0.0/nn_send.3.html>`_
"""
buf = ffi.from_buffer(data)
rc = nanomsg.nn_send(self.sock, buf, len(data), flags)
return errors.convert(rc, rc)
That just makes it harder to use. IMO supporting strings is pretty useful, and doing so the way it is now makes sense in a language where "duck typing" is a best practice. Now, if you want to improve the docstring, I'd be happy to accept a pull request for that.
(Also, you have not stated a concrete problem with the new behavior.)
According to linked FAQ entry above, catching an exception is expensive. It would be nice to make users aware about this and put into documentation of Socket.send
- strings are supported, but your calls are slowed down because the conversion is handled via try/except
clause. Knowing that, I will convert strings by myself, which makes Socket.send
supporting strings not so useful anymore.
I would simply keep it simple and allow users to decide how to perform conversion (and there are plenty of ways including single dispatch :)).
Expensive is a relative term. I prefer to make the library work for people who are just getting started with the simplest thing possible, which I guess does not fit your understanding of "simple", but sure is important for mine. On the other hand, people who need to squeeze out all the performance can do a bit more work to find out where the bottlenecks are, and if this particular method pops up they can check out the documentation and/or comments to figure out that there are faster ways to use it.
With respect to
Socket.send()
...In cffi 1.8 you can use ffi.from_buffer on bytes types which provides a nice perf boost over ffi.new() when the data is large. (30% in my tests for a ~1MB buffer).
It also might also be more efficient to avoid the type checking and just assume a from_buffer type will be used (fast path) and do any conversion to a bytes type in an exception handler (slow path).
I've not tested on python2 nor do I understand the ramifications of such a change to that platform.