lanto03 / couchdb-python

Automatically exported from code.google.com/p/couchdb-python

Some requests are repeated due to a bug in httplib2 #85

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. I tried the example at this link:
http://davidwatson.org/2008/02/python-couchdb-rocks.html
2. And ran it.

What is the expected output? What do you see instead?
Expected 3 entries in the DB, but got 4 entries instead.

What version of the product are you using? On what operating system?
latest couchdb-python, CouchDB 0.9.1, Ubuntu 8.04

Please provide any additional information below.

Original issue reported on code.google.com by sjtir...@gmail.com on 17 Aug 2009 at 7:57

GoogleCodeExporter commented 8 years ago
I forgot to mention: this is the couchdb-python release CouchDB-0.6-py2.5.
Also, the JavaScript map does not work either.

Steve

Original comment by sjtir...@gmail.com on 17 Aug 2009 at 8:04

GoogleCodeExporter commented 8 years ago
The JavaScript map works; I had typed "map" where "emit" belongs.

Original comment by sjtir...@gmail.com on 17 Aug 2009 at 8:28

GoogleCodeExporter commented 8 years ago
I don't know if it's the same issue, but I ran a simple

couchdb.Server("http://localhost:5984/")["testdb"].create({ "test": True })

in the interactive Python shell and it worked as it should: one document was
created. When I did this from within a Pylons controller I ended up with 2
documents (same content, different ids of course).

Original comment by st.schus...@gmail.com on 18 Aug 2009 at 7:28

GoogleCodeExporter commented 8 years ago
I looked into the API code; there is nothing suspicious.
I tried to figure out when it happens.
If I try to create 5 different documents, the first document is created twice
in the DB and the other 4 documents are created only once.
This happens only the first time a document is created after I open the
connection to the DB. If I create another 5 documents after the first batch,
those 5 documents are created normally.

Original comment by sjtir...@gmail.com on 20 Aug 2009 at 8:11

GoogleCodeExporter commented 8 years ago
Here is a summary of what I found to be the problem for me. As I said, I first
thought it was connected to Pylons: it was not! Regardless of using PUT or POST,
and regardless of Pylons or not, creating a document sometimes created it twice,
and when using PUT instead of POST it of course raised a ResourceConflict.

I even got down to analyzing the HTTP stream via Wireshark to find the issue.
It turned out that I could fix it for me by changing 1 line in httplib2, but
since it seems couchdb specific I am posting it here. With the couchdb-python
lib everything is fine: it makes only 1 request to httplib2. httplib2 uses
httplib and tries to be smart by keeping alive and reusing the HTTP connection
it gets from httplib. The problem for me was that on a reused HTTP connection,
conn.getresponse() on line 859 of httplib2 threw the exception
ResponseNotReady, and so httplib2 tried to send the request again. The catch is
that the server actually received and processed both requests.

I was able to solve this for me by adding a conn.close() at line 874 (after 
successfully receiving a response) - 
and therefore not reusing the http connection. I don't know what kind of 
performance impact this might have. 
But at least this works.

I experienced this problem with couchdb 0.9.1 and python 2.4/2.5/2.6 on Mac OS, 
as well as with couchdb 
0.9.1 and python 2.6 on ubuntu 9.04. In all cases my fix worked.

I actually don't know whether this is a couchdb issue, a python httplib issue,
or an httplib2 issue - I only know that this fix worked for me, and I thought I
would share it here.

Original comment by st.schus...@gmail.com on 4 Sep 2009 at 11:39
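
The retry behaviour described in the comment above can be illustrated with a toy model. This is only a sketch, not httplib2's actual code: ToyServer, ToyConnection and conn_request are hypothetical names invented here, and the only real API used is the ResponseNotReady exception (imported from Python 3's http.client, the successor of the httplib module discussed in this thread). It assumes the request bytes reach the server before getresponse() fails, which is what the Wireshark analysis suggests.

```python
from http.client import ResponseNotReady


class ToyServer:
    """Hypothetical stand-in for CouchDB: stores every POST body it receives."""

    def __init__(self):
        self.docs = []

    def handle(self, method, body):
        if method == "POST":
            self.docs.append(body)


class ToyConnection:
    """Hypothetical keep-alive connection; `stale` models a reused connection
    that cannot hand out a new response."""

    def __init__(self, server, stale=False):
        self.server = server
        self.stale = stale

    def request(self, method, body=None):
        # The request goes out and the server acts on it immediately.
        self.server.handle(method, body)

    def getresponse(self):
        if self.stale:
            raise ResponseNotReady()
        return "201 Created"


def conn_request(server, conn, method, body):
    """Send a request, retrying once on a fresh connection if reading fails."""
    for i in range(2):
        conn.request(method, body)
        try:
            return conn.getresponse()
        except ResponseNotReady:
            if i == 0:
                conn = ToyConnection(server)  # "close and reopen"
                continue
            raise


server = ToyServer()
reused = ToyConnection(server, stale=True)
status = conn_request(server, reused, "POST", {"test": True})
print(status, len(server.docs))
```

The caller sees a single successful response, yet the toy server has stored the document twice, because the failed first attempt had already delivered the request.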

GoogleCodeExporter commented 8 years ago
Excellent detective work, st.schuster.

Maybe this is more reason to move to httplib and dump the httplib2 dependency.

Original comment by randall....@gmail.com on 5 Sep 2009 at 6:55

GoogleCodeExporter commented 8 years ago
Although I think httplib2 has issues I think it's a bit early to blame this on 
httplib2 just 
yet ;-).

If couchdb-python replaces httplib2 we're going to want to reuse HTTP 
connections too. It would 
be nice to know it's *definitely* httplib2 and not some weird interaction 
between httplib and 
couchdb. Otherwise, the same problem is likely to crop up again later.

st.schuster, do you have a script that consistently demonstrates this problem? 
If not, please 
can you provide a stack trace at the point httplib2 catches the 
ResponseNotReady exception? I'm 
guessing it's line 855 in httplib2/__init__.py, in the _conn_request method as 
that's the 
method that seems to be doing the retry. If that is the correct bit of code the 
value of 'i' 
would be useful to know too.

However, I suspect the problem is in httplib2. I believe the call to 
"conn.close()" on line 857 
of httplib2/__init__.py should be outside the if block, immediately before it. 
Otherwise a 2nd 
failure will leave an open connection in an inconsistent state.

Original comment by matt.goo...@gmail.com on 7 Sep 2009 at 1:38

GoogleCodeExporter commented 8 years ago
Also, can you maybe try whether this is also a problem on the httplib branch:

  http://couchdb-python.googlecode.com/svn/branches/experimental/httplib/

Original comment by cmlenz on 8 Sep 2009 at 8:57

GoogleCodeExporter commented 8 years ago
@matt.goodall: Yes, that's why I wrote that I don't know whether this is an
httplib, httplib2 or couchdb problem. I just know that this fix worked for me.

A consistent test script? Ha! For me this problem is nearly the default
behaviour, which is also why I'm surprised that nobody else seems to have this
problem. Here is a short sample script I've used which practically always fails:

import couchdb, time
server = couchdb.Server("http://localhost:5984")
db = server["test"]
for i in range(10):
    t = str(time.time())
    print "Creating " + t
    db[t] = {}
    time.sleep(0.5)

Regarding ResponseNotReady: Going by the line numbers in my copy of httplib2,
855 is not right. It's on line 859, which for me is:

try:
    response = conn.getresponse()

This try block actually catches httplib.HTTPException so I've added a more 
specific exception handler above it:

except httplib.ResponseNotReady:
    print "Value of i: " + str(i)
    raise

This is the stack trace:

sschuster$ python couchdbtest.py 
Creating 1252411032.46
Value of i: 0
Traceback (most recent call last):
  File "couchdbtest.py", line 9, in <module>
    db[t] = {}
  File "build/bdist.macosx-10.5-i386/egg/couchdb/client.py", line 323, in __setitem__
  File "build/bdist.macosx-10.5-i386/egg/couchdb/client.py", line 985, in put
  File "build/bdist.macosx-10.5-i386/egg/couchdb/client.py", line 1010, in _request
  File "build/bdist.macosx-10.5-i386/egg/couchdb/client.py", line 1005, in _make_request
  File "/Users/sschuster/devel/spaaze/pylonsEnv/lib/python2.5/site-packages/httplib2/__init__.py", line 1105, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/Users/sschuster/devel/spaaze/pylonsEnv/lib/python2.5/site-packages/httplib2/__init__.py", line 891, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/Users/sschuster/devel/spaaze/pylonsEnv/lib/python2.5/site-packages/httplib2/__init__.py", line 859, in _conn_request
    response = conn.getresponse()
  File "/opt/local/lib/python2.5/httplib.py", line 918, in getresponse
    raise ResponseNotReady()
httplib.ResponseNotReady

So the value of i is always 0 - the second time (after the connection was 
closed and reopened, which initially was the 
clue which made me come up with my "fix") it works.

What confuses me is that I sometimes also get messages like this "1> [error] 
[<0.732.0>] Uncaught error in HTTP 
request: {exit,normal}" in the couchdb log. But only on Mac OS. I would have 
guessed that this is because of the 
suddenly terminated client connection - but who knows.

Regarding connection re-use. In the example output it already failed at the 
first request. Nevertheless it's due to 
connection reuse, since couchdb-python always makes a HEAD call on the 
database. What are these for anyways?

@cmlenz: Trying httplib branch in a minute

Original comment by st.schus...@gmail.com on 8 Sep 2009 at 12:24

GoogleCodeExporter commented 8 years ago
@cmlenz: Running the same test script with the httplib branch actually showed
no errors at all. Cool! But this once again supports the suspicion that
httplib2 was the original culprit.

Original comment by st.schus...@gmail.com on 8 Sep 2009 at 12:38

GoogleCodeExporter commented 8 years ago
OK, I think I know what's going on now.

httplib2 0.5.0 no longer reads the response body for a HEAD request, 
http://code.google.com/p/httplib2/issues/detail?id=56.

The change was to get around a problem with reading "transfer-coding: chunked"
HEAD responses but, unfortunately, it leaves the connection in a state in which
it cannot be reused. (I would have thought they'd spot that sooner, but I guess
the retry code is covering up the problem in a non-visible way for most
people.)

I'll submit an error report to the httplib2 issue tracker but perhaps you could
check the fix I intend to include too? Change httplib2/__init__.py, somewhere
around line 869, from:

                if method != "HEAD":
                    content = response.read()

to:

                if method == "HEAD":
                    response.close()
                else:
                    content = response.read()

A workaround is to create a Database object directly instead of getting it via
a couchdb.Server() instance:

   db = couchdb.Database('http://localhost:5984/test')

That skips the HEAD request that checks whether the database exists, and so
doesn't trigger the error.

Original comment by matt.goo...@gmail.com on 8 Sep 2009 at 3:04
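
The connection state that the proposed fix addresses can be reproduced without a server. The sketch below makes several assumptions: it uses Python 3's http.client rather than the Python 2 httplib from this thread (the check that raises ResponseNotReady works the same way there, as far as I can tell), FakeSocket is a hypothetical in-memory stand-in, and assigning conn.sock directly is a testing trick rather than supported usage. It issues a HEAD followed by a PUT on one connection, with and without closing the HEAD response.

```python
import io
from http.client import HTTPConnection, ResponseNotReady


class FakeSocket:
    """Hypothetical in-memory socket: records sent bytes, replays canned replies."""

    def __init__(self, replies):
        self.replies = list(replies)
        self.sent = []

    def sendall(self, data):
        self.sent.append(data)

    def makefile(self, mode, bufsize=None):
        # Each new response object reads the next canned reply.
        return io.BytesIO(self.replies.pop(0))

    def close(self):
        pass


HEAD_REPLY = b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\n"
PUT_REPLY = b"HTTP/1.1 201 Created\r\nContent-Length: 2\r\n\r\n{}"


def head_then_put(close_head_response):
    """Return (outcome of the PUT, whether the PUT bytes were sent)."""
    sock = FakeSocket([HEAD_REPLY, PUT_REPLY])
    conn = HTTPConnection("localhost", 5984)
    conn.sock = sock  # skip the real TCP connect
    conn.request("HEAD", "/test")
    response = conn.getresponse()
    if close_head_response:
        response.close()  # the fix: release the connection after a HEAD
    conn.request("PUT", "/test/doc1", body=b"{}")
    try:
        outcome = conn.getresponse().status
    except ResponseNotReady:
        outcome = "ResponseNotReady"
    put_sent = any(chunk.startswith(b"PUT ") for chunk in sock.sent)
    return outcome, put_sent


print(head_then_put(close_head_response=False))
print(head_then_put(close_head_response=True))
```

In the first case the PUT has already been written to the socket by the time the exception is raised, which is why a retry on a fresh connection makes the server see it twice. Closing the HEAD response first lets the same connection serve the PUT normally.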

GoogleCodeExporter commented 8 years ago
Yes, this fix works for me, and suddenly it all makes sense. Stupid of me to
overlook the special HEAD handling right at the place where I did my workaround.

Regarding the Database constructor: nice, I overlooked this as well. It could be
part of one of the examples in the documentation. It also means that I have to
create the http object myself to set a timeout.

But now everything works. Thanks for all the help.

Original comment by st.schus...@gmail.com on 8 Sep 2009 at 6:57

GoogleCodeExporter commented 8 years ago
See http://code.google.com/p/httplib2/issues/detail?id=67 for httplib2 bug 
report.

Original comment by matt.goo...@gmail.com on 8 Sep 2009 at 7:05

GoogleCodeExporter commented 8 years ago
Issue 89 has been merged into this issue.

Original comment by matt.goo...@gmail.com on 15 Sep 2009 at 9:56

GoogleCodeExporter commented 8 years ago
Issue 93 has been merged into this issue.

Original comment by matt.goo...@gmail.com on 2 Oct 2009 at 3:41

GoogleCodeExporter commented 8 years ago

Original comment by randall....@gmail.com on 27 Oct 2009 at 10:26

GoogleCodeExporter commented 8 years ago
Issue 95 has been merged into this issue.

Original comment by randall....@gmail.com on 27 Oct 2009 at 10:27

GoogleCodeExporter commented 8 years ago
Issue 97 has been merged into this issue.

Original comment by randall....@gmail.com on 27 Oct 2009 at 10:28

GoogleCodeExporter commented 8 years ago
Looks like my patch for httplib2 was applied earlier today. I have not had a
chance to test it yet.

Original comment by matt.goo...@gmail.com on 16 Nov 2009 at 2:17

GoogleCodeExporter commented 8 years ago
I tried out httplib2 head. Looks fixed to me.

Original comment by dnolen.l...@gmail.com on 16 Nov 2009 at 2:40

GoogleCodeExporter commented 8 years ago
Marking this fixed. Thanks, guys!

Original comment by djc.ochtman on 14 Dec 2009 at 11:05

GoogleCodeExporter commented 8 years ago
Issue 108 has been merged into this issue.

Original comment by djc.ochtman on 15 Dec 2009 at 7:21