chris-l / mock-couch

A node.js module designed to mock a CouchDB server, mostly for unit testing purposes.
http://chris-l.github.io/mock-couch/
67 stars 44 forks source link

View collation bug with array keys #47

Closed lucasyvas closed 3 years ago

lucasyvas commented 8 years ago

Edit: The first two comments are more me trying to figure out what's happening. It's good reading, but I think I've come closer to nailing down part of the problem in the third post (as well as the referenced commit to my fork just above it)

Having a bit of a problem with the order some documents are spit out versus what they do on a real couch-db instance. It's going to not look super clear, but I hope to illustrate the issue I'm having. I can try to provide sample data tomorrow if necessary. Here is the map function:

function (doc) {
          if (doc.docType === 'foo') {
            key = [];
            key.push(doc._id);
            emit(key);
          }
          if (doc.docType === 'rel:bar') {
            key = [];
            key.push(doc.fooId);
            key.push(doc.barId);
            emit(key, { _id: doc.barId });
          }
          if ((doc.docType === 'rel:baz') && (doc.path.length === 2)) {
            key = [];
            key.push(doc.fooId);
            key.push(doc.barId);
            key.push(doc.path);
            emit(key, { _id: doc.bazId });
          }
 }

What this effectively does is spit out a three level hierarchy of some documents we have.

What I want, and what seems to work very consistently on our real instance, is:

foo bar baz

foo bar baz bar baz

etc.

However, in test, foo does not come first. Therefore when we try to construct this hierarchy in flat, descending order, it fails because there was no foo before bar, or baz. It appears the foos come last. I've tried messing with the descending order flags on a view with no luck.

Like I said, it works on our real instance, but now that it doesn't work in our unit tests I'm second guessing the whole thing :/

lucasyvas commented 8 years ago

I think I may have kind of figured out the problem, but I'm still not sure where it lies. Our document ordering by key, with that view, should look like this:

[fooId] [fooId, barId] [fooId, barId, [0, bazId0]] [fooId, barId, [1, bazId1]] [fooId, barId, [1, bazId1, 0, bazIdChild0]]

But what it ends up doing is approximately:

[fooId, barId, [0, bazId0]] [fooId, barId, [1, bazId1]] [fooId, barId, [1, bazId1, 0, bazIdChild0]]

...

[fooId, barId] [fooId]

If I change our map function to be:

function(doc) {
          if (doc.docType === 'foo') {
            key = [];
            key.push(doc._id);
            key.push(''); <----------- Added here
            key.push([0, '']); <------ Added here
            emit(key);
          }
          if (doc.docType === 'rel:bar') {
            key = [];
            key.push(doc.fooId);
            key.push(doc.barId);
            key.push([0, '']); <------ Added here
            emit(key, { _id: doc.barId });
          }
          // Only want one level of baz children
          if ((doc.docType === 'rel:baz') && (doc.path.length === 2)) {
            key = [];
            key.push(doc.fooId);
            key.push(doc.barId);
            key.push(doc.path);
            emit(key, { _id: doc.bazId });
          }
 }

It 'fixes' the issue I was seeing, at least it looks like it does. This would now make the key output:

[fooId, '', [0, '']] [fooId, barId, [0, '']] [fooId, barId, [0, bazId0]] [fooId, barId, [1, bazId1]] [fooId, barId, [1, bazId1, 0, bazIdChild0]]

It seems the absence of a key element is being treated differently than using real Couch. Using the default ascending (descending=false), it seems these missing key elements should result in them being output before keys that have these elements, not after.

Being explicit about the key elements seems to have the ordering apply to the elements correctly.

Edit: Think I found the same issue in a place we cannot band-aid it. As illustrated, their may be an arbitrary number of children of which we do not know the count of when generating the key in the map function. Therefore we cannot anticipate how many dummy [0, ''] entries we need to make the comparison work properly.

lucasyvas commented 8 years ago

Sorry for the numerous posts, but as they were long, it's probably easier to read than one giant one. I attempted to fix this in my own fork with success. However, at this point, I'm not super well-versed with the specifics of how view collation is meant to work for all possible key types (I've linked the spec below). So, I currently can't be totally sure if this is just an incomplete 'fix' for the array case. If you check out the referenced commit, I was able to find that keys that are arrays are caught in the object check in the if statement, which converts them to strings and causes the problem.

It seems that arrays are meant to be compared directly and not as their string representations.

Ex.

Current behaviour (bad):

"[123]" < "[123, 1]" = false

'Fixed' behaviour (good):

[123] < [123, 1] = true

https://wiki.apache.org/couchdb/View_collation#Collation_Specification

Edit: I gave the spec a read. I feel like I could probably get this working, assuming the best place to do so is in the sort function passed to R.sortBy (or by replacing the call to R.sortBy completely - Disclaimer: I haven't read all the related code yet, so don't shoot me if I'm wrong!). I may have missed it, but the spec didn't look super clear on what to do with objects within objects. Seeing as how they're ignored when you pass search keys in, it might be safe to assume they are ignored in this case too?

By the current code, I'm not sure how many of these cases stringifying actually more or less catches. I'll look at it when I get the chance.