The operating system is Ubuntu 10.04.3 LTS, Linux 2.6.32-33-generic #70-Ubuntu
SMP Thu Jul 7 21:09:46 UTC 2011 i686 GNU/Linux
Original comment by daniel.g...@wavilon.com
on 4 Aug 2011 at 10:08
Correction: I realized that couchdb-python is not using httplib2, but httplib.
httplib is part of the Python standard library; on my system I have Python 2.6.5.
Original comment by daniel.g...@wavilon.com
on 4 Aug 2011 at 11:18
I have tried couchdb-python-curl (1.0.14p2,
http://code.google.com/p/couchdb-python-curl) and it solves my problem: with
it I get 150 documents/s.
couchdb-python-curl is a fork of couchdb-python (somewhat buggy, I had to
correct a couple of small errors), but using pycurl instead of httplib seems to
increase performance by at least an order of magnitude.
Original comment by daniel.g...@wavilon.com
on 4 Aug 2011 at 11:23
Confirmed on Gentoo Linux 3.0.0 against CouchDB 1.1.0 release using
couchdb-python from tip.
Python interpreters:
python-2.4.5 (simplejson with C ext)
python-2.7.2
pypy-1.5 (jit)
All of them showed the same result: 12 docs per second when saving 1K simple
documents.
Test script, cProfile stats, and report are attached.
Looks like we're held up somewhere within the _socket module. PyCURL is based
on another C library that is optimized for the HTTP protocol, which is why its
results could be different.
But instead of doing such benchmarks, it is better to use the specific API
that is suited to modifying a lot of documents at once:
http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API#Modify_Multiple_Documents_With_a_Single_Request
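For reference, a minimal sketch of such a bulk save using couchdb-python's
Database.update (the server URL, database name, and document contents are
illustrative):

import couchdb

server = couchdb.Server('http://localhost:5984/')
db = server['example']

docs = [{'type': 'event', 'seq': i} for i in range(100)]
# Database.update() posts to _bulk_docs, so all documents travel in a
# single HTTP request instead of one request per document.
for success, docid, rev_or_exc in db.update(docs):
    if not success:
        print 'failed to save %s: %s' % (docid, rev_or_exc)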
Original comment by kxepal
on 5 Aug 2011 at 5:39
Attachments:
Regarding bulk inserts: my application creates documents on the fly, based
on external events. I have no control over how fast those events happen.
They could arrive at a rate of 100 events/s, or at a rate of 1
event/minute.
Using bulk inserts, I would gather these documents in a list, and when a
certain threshold is reached, I would send them as a bulk request to CouchDB.
The easy solution is to base the threshold on a number of documents.
The problem is that, with this approach, slow events pile up in my list and
are sent to CouchDB much later. The latency of my application would be very
high; indeed, the latency is unbounded, since nobody can guarantee that the
event which pushes us over the threshold will ever arrive.
The best solution would be to combine a time-based and a quantity-based
threshold, say every 1 s or at most 100 documents. I am certain that
this approach would solve my problem, and probably even increase the maximum
throughput over the 150 docs/s that I am reaching with couchdb-python-curl.
But suddenly a very simple application has become much more complicated:
I have to fire timers and implement a somewhat tricky threshold algorithm.
It is probably the right way to go, but it makes simple applications
suffer unnecessarily from very low throughput, compared to other CouchDB
libraries out there.
Original comment by daniel.g...@wavilon.com
on 5 Aug 2011 at 7:35
I have implemented a bulk insert using a threshold based on a number of
documents (100), and these are my new metrics:
100000 entries, 63.926536 seconds, 1564.295614 entries/s
That is over two orders of magnitude better than my original
implementation.
I still have to solve the latency problem with timers, but the improvement is
impressive!
Original comment by daniel.g...@wavilon.com
on 5 Aug 2011 at 8:04
Yes, I see your problem. 12 docs/s is far too low, so I'd like to
investigate why that happens. As a starting point I've found a Python issue
about the same thing: http://bugs.python.org/issue3766
Original comment by kxepal
on 5 Aug 2011 at 8:07
Ok, adding the following line after conn.connect():
>>> conn.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
improved speed to 23 docs per second for me. Twice as fast, but not so
overwhelming (:
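For completeness, a rough sketch of where that call sits when driving httplib
directly (Python 2; the host and port are illustrative):

import httplib
import socket

conn = httplib.HTTPConnection('localhost', 5984)
conn.connect()
# Must come after connect(), once conn.sock exists; disables Nagle's algorithm.
conn.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)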
Original comment by kxepal
on 5 Aug 2011 at 9:01
I have implemented the solution with the combined timer / document-count
threshold, and it is working fine. I am getting up to 5000 docs/s, depending
on how many documents are buffered. I have found that the best throughput is
reached when buffering around 1000 documents; buffering more just flattens
the curve.
Your mileage may vary, probably depending on the size of the documents you
are using.
I have set the timeout at around 0.5 s, so that latency stays reasonably low.
I think I can live with that, especially with such great throughput.
Original comment by daniel.g...@wavilon.com
on 5 Aug 2011 at 10:57
@daniel
Are you talking about the curl solution? Just to make things clear.
Original comment by kxepal
on 5 Aug 2011 at 11:02
No, I have reverted to using couchdb-python. Now my "create" routine buffers
the documents until the threshold is reached or the timer expires. Find
attached the implementation:
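(The attachment itself is not reproduced here; the following is only a rough
sketch of the approach described, with illustrative class and parameter names.
Documents are buffered and flushed as one bulk request when either the count
threshold or the timer fires.)

import threading

class BufferedWriter(object):
    def __init__(self, db, max_docs=100, max_delay=0.5):
        self.db = db                  # a couchdb.Database instance
        self.max_docs = max_docs
        self.max_delay = max_delay
        self.buffer = []
        self.lock = threading.Lock()
        self.timer = None

    def create(self, doc):
        with self.lock:
            self.buffer.append(doc)
            if len(self.buffer) >= self.max_docs:
                self._flush()
            elif self.timer is None:
                # Start a timer so slow event streams are still written
                # out within max_delay seconds.
                self.timer = threading.Timer(self.max_delay, self.flush)
                self.timer.start()

    def flush(self):
        with self.lock:
            self._flush()

    def _flush(self):
        # Caller must hold self.lock.
        if self.timer is not None:
            self.timer.cancel()
            self.timer = None
        if self.buffer:
            self.db.update(self.buffer)   # single _bulk_docs request
            self.buffer = []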
Original comment by daniel.g...@wavilon.com
on 5 Aug 2011 at 11:07
Attachments:
Hmmm, great results!
Using pure sockets I got a rate of 530 docs per second, but as soon as I add
the HTTP-specific functions and checks (which I would have to), that rate will
drop further and further. In the end this experiment would just produce yet
another httplib with very uncertain prospects.
I suppose all that can be done on the couchdb-python side is to add the
socket.TCP_NODELAY option to improve performance somewhat, provided it doesn't
cause problems on other OSes.
Original comment by kxepal
on 5 Aug 2011 at 11:37
@kxepal: probably it is not advisable to release the changes, unless you are
very comfortable with the implementation. As I understand from your previous
comments, you are hacking a bit at the lower layers of couchdb-python. Maybe
it would be easier to use another library instead of httplib? What about
PyCurl, as couchdb-python-curl is doing? I am no longer using it, since it
seems to be an inactive project, but in the tests I made it did improve my
metrics. I must say it was quite buggy, with obvious errors all over the code.
As you mentioned before, the real solution is to do bulk updates, and the very
low performance of single inserts will force any user to walk the one and only
path. :)
It is nevertheless very frustrating for novices to see such abysmal
performance for single inserts, especially compared with other tools and other
libraries. I got my early performance numbers from ab (Apache Benchmark) and
curl (the binary, in a loop), and both are much more performant than
couchdb-python. I do not know about the internal implementation of ab, but
curl is certainly doing single inserts, since I am spawning a new process for
each request - and even with that overhead, it was beating couchdb-python
easily.
Original comment by daniel.g...@wavilon.com
on 5 Aug 2011 at 2:09
> probably it is not advisable to release the changes, unless you are very
comfortable with the implementation.
But it could produce a nice line in the change log:
- Documents now save twice as fast!
(;
> What about PyCurl, as couchdb-python-curl is doing?
That's an interesting solution, and I, at least, may use it in a very
high-load project, after first refactoring the couchdb.http.Session.request
method. So far I've never suffered from this issue myself, because I use bulk
updates for large numbers of documents and/or a task queue with a pool of
worker processes to handle many data sources. It was just interesting to find
out why things work the way they do and what could be done to change the
situation.
Anyway, the final decision on what to do is up to Matt and Dirkjan (:
Thank you for sharing your experience, @daniel!
Original comment by kxepal
on 5 Aug 2011 at 4:23
Related discussion on the CouchDB user mailing list:
http://thread.gmane.org/gmane.comp.db.couchdb.user/14921/focus=14921
Original comment by kxepal
on 22 Aug 2011 at 8:00
I ran into this same issue. The problem is the Nagle algorithm, but the
correct fix is not to disable it with setsockopt(), as that may have other
consequences for the network and is slightly unportable.
The correct approach is to send the HTTP headers and body in a single packet
when possible. This can be achieved with the following patch:
diff --git a/couchdb/http.py b/couchdb/http.py
--- a/couchdb/http.py
+++ b/couchdb/http.py
@@ -261,22 +261,34 @@
                     time.sleep(delay)
                     conn.close()

+        def _send_headers_and_body(body):
+            # Send the headers and body in a single packet to avoid
+            # slowdown caused by delayed ACK and the Nagle algorithm.
+            # See issue #193.
+            if sys.version_info < (2, 7):
+                conn.endheaders()
+                conn.send(body)
+            else:
+                conn.endheaders(body)
+
         def _try_request():
             try:
                 conn.putrequest(method, path_query, skip_accept_encoding=True)
                 for header in headers:
                     conn.putheader(header, headers[header])
-                conn.endheaders()
                 if body is not None:
                     if isinstance(body, str):
-                        conn.send(body)
+                        _send_headers_and_body(body)
                     else: # assume a file-like object and send in chunks
+                        conn.endheaders()
                         while 1:
                             chunk = body.read(CHUNK_SIZE)
                             if not chunk:
                                 break
                             conn.send(('%x\r\n' % len(chunk)) + chunk + '\r\n')
                         conn.send('0\r\n\r\n')
+                else:
+                    conn.endheaders()
                 return conn.getresponse()
             except BadStatusLine, e:
                 # httplib raises a BadStatusLine when it cannot read the status
The message_body argument to HTTPConnection.endheaders() is undocumented, but I
believe it appeared in Python 2.7. I'll make sure it is added to httplib's
documentation.
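For illustration only (Python 2.7+): a tiny example of passing the body to
endheaders() so httplib can send the headers and body together. The host,
path, and payload below are made up.

import httplib

body = '{"type": "event"}'
conn = httplib.HTTPConnection('localhost', 5984)
conn.putrequest('POST', '/example')
conn.putheader('Content-Type', 'application/json')
conn.putheader('Content-Length', str(len(body)))
conn.endheaders(body)          # headers + body go out in one send on 2.7+
response = conn.getresponse()
print response.status, response.read()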
Original comment by akhern
on 30 Sep 2011 at 7:57
Forgot to say: This approach boosted the performance on my machine by a factor
of 8, from 20 docs/sec to 160 docs/sec.
Original comment by akhern
on 30 Sep 2011 at 8:00
@akhern, nice find!
Results for me (Gentoo Linux 3.0.4, CouchDB 1.1.0) using the test script [1]
for 10000 docs:
Python 2.7:
default options: ~22 dps
default options + patch: ~45 dps
patch + server nodelay: ~230 dps
patch + server nodelay + client nodelay: ~200 dps
Results for Python 2.4:
default options: ~22 dps
default options + patch: ~22 dps
patch + server nodelay: still ~22 dps
patch + server nodelay + client nodelay: ~220 dps (sic!)
PyPy shares Python 2.7 results.
[1] - http://code.google.com/p/couchdb-python/issues/attachmentText?id=193&aid=1930004000&name=test.py&token=30199306482030f1894eecc4e5d831d9
Original comment by kxepal
on 30 Sep 2011 at 9:01
Just for the record, the message_body argument of endheaders() is now properly
documented:
http://docs.python.org/library/httplib.html
Original comment by akhern
on 6 Oct 2011 at 3:04
@akhern thanks for the patch and nice find, I had no idea endheaders took an
optional arg in 2.7+. I've applied a slightly modified version of the patch.
Unfortunately, I forgot to attribute the commit to you - really sorry about
that!
Note: I also added a really simple performance testing script, perftest.py, to
help spot any regressions or just to get a quick overview of performance across
different platforms and versions.
Original comment by matt.goo...@gmail.com
on 9 Oct 2011 at 11:52
@Matt,
Thanks for the test script, I'll try to run tests in various situations now.
But it won't work with Python 2.4, because there a finally clause has to stand
alone (combined try/except/finally only arrived in Python 2.5).
Patch attached.
Original comment by kxepal
on 9 Oct 2011 at 12:09
Attachments:
Using a "slightly" improved perftest.py to add nodelay patch I've got next
results:
C:\Documents and Settings\ash\projects\couchdb-python>python perftest.py -c 10000
sys.version : '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]'
sys.platform : 'win32'
server.version : u'1.1.0'
* [create_bulk_docs_nodelay] Create lots of docs, lots at a time ... 1862.34s (5.37s rps)
* [create_doc] Create lots of docs, one at a time ... 55.31s (180.79s rps)
* [create_doc_nodelay] Create lots of docs, one at a time with setup nodelay ... 57.88s (172.79s rps)
* [create_bulk_docs] Create lots of docs, lots at a time ... 1666.92s (6.00s rps)

kxepal@ashdarh ~/projects/couchdb-python/default $ python2.4 perftest.py -c 10000
sys.version : '2.4.6 (#1, May 26 2011, 00:41:47) \n[GCC 4.4.5]'
sys.platform : 'linux2'
CouchDB : '1.1.0'
* [create_bulk_docs_nodelay] Create lots of docs, lots at a time ... 1404.21s (7.12s dps)
* [create_doc] Create lots of docs, one at a time ... 445.13s (22.47s dps)
* [create_doc_nodelay] Create lots of docs, one at a time with setup nodelay ... 44.08s (226.86s dps)
* [create_bulk_docs] Create lots of docs, lots at a time ... 1385.36s (7.22s rps)

kxepal@ashdarh ~/projects/couchdb-python/default $ pypy-c1.5 perftest.py -c 10000
sys.version : '2.7.1 (?, Aug 03 2011, 16:22:48)\n[PyPy 1.5.0-alpha0 with GCC 4.4.5]'
sys.platform : 'linux2'
server.version : u'1.1.0'
* [create_bulk_docs_nodelay] Create lots of docs, lots at a time ... 1546.17s (6.47s rps)
* [create_doc] Create lots of docs, one at a time ... 36.96s (270.60s rps)
* [create_doc_nodelay] Create lots of docs, one at a time with setup nodelay ... 41.03s (243.72s rps)
* [create_bulk_docs] Create lots of docs, lots at a time ... 1546.21s (6.47s rps)

kxepal@marifarai ~/couchdb-python $ python2.7 perftest.py -c 10000
sys.version : '2.7.2 (default, Sep 25 2011, 18:21:53) \n[GCC 4.5.3]'
sys.platform : 'linux2'
server.version : '1.1.0'
* [create_bulk_docs_nodelay] Create lots of docs, lots at a time ... 771.20s (12.97s rps)
* [create_doc] Create lots of docs, one at a time ... 24.80s (403.18s rps)
* [create_doc_nodelay] Create lots of docs, one at a time with setup nodelay ... 30.85s (324.12s rps)
* [create_bulk_docs] Create lots of docs, lots at a time ... 750.10s (13.33s rps)

kxepal@marifarai ~/couchdb-python $ python2.6 perftest.py -c 10000
sys.version : '2.6.7 (r267:88850, Sep 25 2011, 23:07:39) \n[GCC 4.5.3]'
sys.platform : 'linux3'
server.version : u'1.1.0'
* [create_bulk_docs_nodelay] Create lots of docs, lots at a time ... 1426.12s (7.01s rps)
* [create_doc] Create lots of docs, one at a time ... 427.35s (23.40s rps)
* [create_doc_nodelay] Create lots of docs, one at a time with setup nodelay ... 27.23s (367.28s rps)
* [create_bulk_docs] Create lots of docs, lots at a time ... 1550.66s (6.45s rps)
WARNING: this perftest.py assumes that no socket_options are defined in the
CouchDB config. Test results may differ if they are.
Original comment by kxepal
on 9 Oct 2011 at 5:35
Attachments:
Never mind the python2.4 test result labels; I just forgot to fix the output:
I ran that test first and changed the output strings afterwards.
Original comment by kxepal
on 9 Oct 2011 at 5:40
@kxepal thanks for the Python 2.4 fix. Committed.
Original comment by matt.goo...@gmail.com
on 11 Oct 2011 at 9:48
Just committed a workaround that avoids Nagle's algorithm for supported Pythons
<2.7. It also removes the need for the previous 2.7-specific fix, simplifying
the real code a little.
Original comment by matt.goo...@gmail.com
on 20 Oct 2011 at 1:31
Just in case anyone comes across this issue again ... you still need to set the
nodelay option for the CouchDB server to get good performance, e.g.
[httpd]
socket_options = [{nodelay, true}]
Original comment by matt.goo...@gmail.com
on 20 Oct 2011 at 1:38
Original issue reported on code.google.com by
daniel.g...@wavilon.com
on 4 Aug 2011 at 10:06