Row object repr broken for non existed documents

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. try to get nonexistent document by keys argument via view 
2. write code:
dockeys=['no-such-document']
for row in db.view('_all_docs', keys=dockeys):
  print row

What is the expected output? What do you see instead?
Expected:
<Row key=u'no-such-document', error=u'not_found'>

Instead exception raised:
Traceback (most recent call last):
  File "R:\projects\Coderis\src\coderis\__init__.py", line 34, in <module>
    print row
  File "R:\bin\python\2.6\lib\site-packages\couchdb\client.py", line 1035, in __repr__
    self.value)
  File "R:\bin\python\2.6\lib\site-packages\couchdb\client.py", line 1054, in value
    return self['value']
KeyError: 'value'

What version of the product are you using? On what operating system?
python 2.6.5
couchdb 0.9/0.10/0.11/1.0
couchdb-python 0.7.1@7f84eb87c031

Please provide any additional information below.
Patch attached. I'm not sure about __missing__ method, but it makes behavior 
similar for getattr/getitem calls.

Original issue reported on code.google.com by kxepal on 26 Jul 2010 at 8:21

Attachments:

375.patch

GoogleCodeExporter commented 9 years ago

Thanks for the report & patch.

I'm also unsure about using __missing__ as it changes the behaviour of Row for 
those who like to use Row as if it's a plain Python dict. It also unnecessarily 
forces a Python 2.5 dependency.

If you get the chance, please could you rework the patch so it doesn't use 
__missing__. If not, I'll try to take a look before long.

Original comment by matt.goo...@gmail.com on 2 Aug 2010 at 4:45

Changed title: Row object repr broken for non existed documents

GoogleCodeExporter commented 9 years ago

Ok, fixed it. I forgot about 2.4(:

Still same behavior for row['error'] and row.error if even `error` had not been 
occurred, but explicit declaration of "safe" attributed...something I dont like 
in this solution.

Original comment by kxepal on 2 Aug 2010 at 6:25

Attachments:

375.patch

GoogleCodeExporter commented 9 years ago

Is the __getitem__ method here necessary to the rest of the patch? It's not 
very clear.

Original comment by djc.ochtman on 7 Aug 2010 at 2:36

GoogleCodeExporter commented 9 years ago

If we want always have same behavior in all cases for row['error'] and 
row.error, row['id'] and row.id.

Ok, let's remove __getitem__ and use .get() within properties, but code:
for row in db.view(...):
  print row.error

will be not same as:
for row in db.view(...):
  print row['error']

If there was no error, first case will return None and last case will raise 
KeyError which is not expected.

Actually, we have only one stable keyword - `key`, others may exists and may be 
not and our properties are always available to get while keywords are not.

Original comment by kxepal on 7 Aug 2010 at 2:52

GoogleCodeExporter commented 9 years ago

Okay. IMO it would be nice if you could split your current patch out into a 
refactoring part (maybe even 2) and the actual bugfix (insofar it doesn't fall 
out of refactoring for free).

Original comment by djc.ochtman on 7 Aug 2010 at 2:57

GoogleCodeExporter commented 9 years ago

no problem.
row.fix.patch - solve only repr error.
row.imp.patch - add support for error property and makes same behavior for 
properties and keywords in all cases.

Original comment by kxepal on 7 Aug 2010 at 4:27

Attachments:

GoogleCodeExporter commented 9 years ago

What the status of this issue? Do I need to do something else in this patch?

Original comment by kxepal on 16 Aug 2010 at 2:21

GoogleCodeExporter commented 9 years ago

First, thanks for splitting out the patches, this makes it somewhat clearer.

I'm still thinking they're not exactly what I'd want, I think. This issue is 
about the __repr__() being broken, but you introduce (in your second patch) a 
__getitem__ method to help fix the __repr__(). It seems to me that we wouldn't 
need the __getitem__ here: I'd rather have outside users break instead of 
silently getting None if they're trying to retrieve a key it doesn't have. 
Also, the __repr__() code doesn't really seem to need the __getitem__(), it 
could implemented just as well without it, I think.

Original comment by djc.ochtman on 16 Aug 2010 at 2:49

GoogleCodeExporter commented 9 years ago

No-no-no. __getitem__ is for improvement, fix doesn't require it. 

__getitem__ propose is only to make same behavior for __getattr__ and 
__getattr__ calls, because currently they are different - see 
row.fix.tests.patch:
self.assertRaises(KeyError, lambda: row['id'])
while:
self.assertTrue(row.id is None)

this is unexpected behavior and trick with __getitem__ here to fix it.

> I'd rather have outside users break instead of silently getting None if 
they're
> trying to retrieve a key it doesn't have. Also, the __repr__() code doesn't 
really
> seem to need the __getitem__(), it could implemented just as well without it, 
> I think.

I've thought about it, but it forces to write such code. 
for row in db.view(...):
  try:
    ...
  except KeyError:
    ...

id is missing for reduced and if error occured
doc is missing if not include_docs passed
value is missing if error occurred, but value may be not returned in result
error is missing is all was fine

so each view result processing becomes as walking through minefield. 

Moreover, IMHO, raising exception on property getting is very unexpected 
action, because without looking at sources, I believe that I just getting 
attribute value, which has some value - None by default and None I've expected 
in worst case.

Original comment by kxepal on 16 Aug 2010 at 3:31

GoogleCodeExporter commented 9 years ago

What the status of this issue?

Original comment by kxepal on 13 Sep 2010 at 12:14

GoogleCodeExporter commented 9 years ago

[deleted comment]

GoogleCodeExporter commented 9 years ago

I'm also curious about the status of this issue, but thought I'd chime in - I 
like the fix patch, but I don't like the overriding of __getitem__ as an 
"improvement".  This is completely inconsistent with the standard python dict, 
which raises an exception when a key that does not exist is accessed via square 
brackets.  The get() function instead returns a provided default.  I do feel 
that the attributes should return None rather than error out, but that is 
already the case.

I believe that the best solution would be to not return rows for view keys that 
are requested and do not exist, like other views.  However, since couchdb makes 
_all_docs special, perhaps the best course of action would be to simply fix the 
__repr__ bug (kxepal's row.fix.patch and row.fix.tests.patch), and note in the 
documentation for view() that _all_docs is special, and returns rows with an 
"error" entry and without an "id" entry for documents requested that do not 
exist.

Original comment by woodswal...@gmail.com on 4 Oct 2010 at 10:31

GoogleCodeExporter commented 9 years ago

Just been trying to catch up on this issue. I think the Row class makes the 
decision quite difficult.

To summarise the points so far:

* __repr__ can be fixed without any other changes.
* If each row was just a plain dict then we'd all expect a KeyError.
* We don't expect an error from a Row instance with visible attributes, i.e. 
something that's in dir(row).
* Row's API is inconsistent - row.id and row.doc always return None but row.key 
and row.value may raise a KeyError.
* Row is missing an 'error' attribute, as returned by _all_docs.

I think we should just go ahead and fix __repr__ in the simplest way possible. 
Why not just include the repr of the underlying dict?

We then need to decide how to fix the rest of Row's API.

Personally I don't like the KeyError, but I probably prefer that to returning 
None for something that doesn't exist. Especially when None is a value that may 
appear in the JSON from CouchDB, e.g. a value, the key for a fully reduced 
view, etc.

One *possible* solution is to add instance attributes to the Row instance in 
__init__, but only if the dict has those items. For example:

    class Row(dict):
        """Document Row and the possible keys/attributes"""
        def __init__(self, row):
            dict.__init__(self, row)
            if 'id' in row:
                self.id = row['id']
            # etc

I don't really like that either ;-) but at least it matches the response from 
CouchDB and raises the expected exceptions if used incorrectly.

Original comment by matt.goo...@gmail.com on 5 Oct 2010 at 9:38

GoogleCodeExporter commented 9 years ago

@woodswalben I really don't like this overriding too, but is there better way? 

I see nice only one to replace dict to namedtuple with additional __getitem__ 
method which will proxying __getattr__ call. Row result is fixed and doesn't 
required any dynamic changes of it.

Original comment by kxepal on 6 Oct 2010 at 4:53

GoogleCodeExporter commented 9 years ago

@kxepal - What I meant was that leaving them alone and raising the exception is 
a more python-standard solution that overriding keyed items to return None.  
Attributes, on the other hand, should not throw an exception when accessed, 
since they should be more or less guaranteed to exist.  However, your fix patch 
guarantees that.

@matt.goodall - I don't like the idea of properties or attributes that may or 
may not exist depending on the couchdb response...  I know that you can query 
them through dir(), but these are attributes that are normally expected I 
think.  What situations can cause a row error?  Is it just _all_docs not 
finding a requested key?

Original comment by woodswal...@gmail.com on 6 Oct 2010 at 8:30

GoogleCodeExporter commented 9 years ago

@woodswalben Yeah, I think I now agree about the attributes point actually.

I've been trying to come up with scenarios where it matters if an attribute 
returns a None value when there's no data and I think there's only one: an 
_all_docs 'error' row.

Using Row attributes, an _all_docs 'error' row could be misinterpreted as being 
a row with key and value of None - something that is legitimate for a row in a 
view other than _all_docs.

That seems quite unlikely to occur so returning None for missing attributes 
does seem like the right compromise.

Original comment by matt.goo...@gmail.com on 6 Oct 2010 at 9:29

GoogleCodeExporter commented 9 years ago

@matt.goodall - Alright; and I think that anyone looking explicitly for docs in 
_all_docs that don't exist would check for "Row.doc is None" or "Row.error is 
not None".  Could always note that somewhere in the docs of course.

Original comment by woodswal...@gmail.com on 7 Oct 2010 at 3:52

GoogleCodeExporter commented 9 years ago

@woodswalben, @matt.goodall, not only _all_docs, but for all views calls.

Original comment by kxepal on 7 Oct 2010 at 4:05

GoogleCodeExporter commented 9 years ago

@kxepal - You're going to have to be more specific what you mean... Are you 
saying that all view calls can return rows without data, and with errors 
instead?

Original comment by woodswal...@gmail.com on 7 Oct 2010 at 4:08

GoogleCodeExporter commented 9 years ago

I'm sorry - I was wrong. User views returns nothing for non existing keys.

Original comment by kxepal on 7 Oct 2010 at 7:33

GoogleCodeExporter commented 9 years ago

I think I fixed these issues along the lines of consensus here, in 
r040d3403f867 and ra1c96a9a8dfd.

Original comment by djc.ochtman on 22 Dec 2010 at 10:06

Changed state: Fixed

jur9526 / couchdb-python

Row object repr broken for non existed documents #143