roxrook / concurrent-trees

Automatically exported from code.google.com/p/concurrent-trees
0 stars 0 forks source link

InvertedRadixTree.getKeyValuePairsForKeysPrefixing() returning keys that aren't even in the tree #6

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hello, unless I am confused about what to expect, I believe 
getKeyValuePairsForKeysPrefixing() is not returning the correct information.  
In fact, I believe it is returning "keys" that aren't even in the tree.  It is 
returning the values at a node, but not the full key that was stored.

What steps will reproduce the problem?
1. Run the attached TreeTest class.

Here is the output I am getting (using the most recent jar downloaded 
yesterday):

**** Constructing new tree
Added key/value pair: /a/b/ -> 1
Added key/value pair: /a/blob/ -> 2
Added key/value pair: /a/blog/ -> 3
○
└── ○ /a/b
    ├── ○ / (1)
    └── ○ lo
        ├── ○ b/ (2)
        └── ○ g/ (3)
Keys prefixing /: {/, 1} 
Keys prefixing /a/: {/, 1} 
Keys prefixing /a/b/: {/a/b/, 1} 
Keys prefixing /a/bl/: 
Keys prefixing /a/blo/: 
Keys prefixing /a/blob/: {/a/blob/, 2} 
Keys prefixing /a/blog/: {/a/blog/, 3} 
**** Constructing new tree
Added key/value pair: /a/b -> 1
Added key/value pair: /a/blob -> 2
Added key/value pair: /a/blog -> 3
○
└── ○ /a/b (1)
    └── ○ lo
        ├── ○ b (2)
        └── ○ g (3)
Keys prefixing /: 
Keys prefixing /a: 
Keys prefixing /a/b: {/a/b, 1} 
Keys prefixing /a/bl: {/a/b, 1} 
Keys prefixing /a/blo: {/a/b, 1} 
Keys prefixing /a/blob: {/a/b, 1} {/a/blob, 2} 
Keys prefixing /a/blog: {/a/b, 1} {/a/blog, 3} 

It looks to me like the tree structure is correct, but 
getKeyVAluePairsForKeysPrefixing() is returning the incorrect key/value pairs 
for several values.  For example, with first tree in the example above:
Keys prefixing /: {/, 1}   <-  No key "/" stored; this is the node for /a/b/
Keys prefixing /a/: {/, 1}  <- Ditto; no key / was stored

I am using concurrent-trees-2.1.0.jar on Fedora 17.

Original issue reported on code.google.com by bethtir...@gmail.com on 5 Oct 2013 at 7:29

Attachments:

GoogleCodeExporter commented 9 years ago
I have implemented a fix.  The fix is in 
ConcurrentInvertedRadixTree.ConcurrentInvertedRadixTreeImpl.scanForKeysAtStartOf
Input.

Here's a code snippet of the method with the fix wrapped in comments.   Look for
  // beth tirado
for the fix

                        protected KeyValuePair<O> computeNext() {
                            outer_loop: while (charsMatched < documentLength) {
                                Node nextNode = currentNode.getOutgoingEdge(input.charAt(charsMatched));
                                if (nextNode == null) {
                                    // Next node is a dead end...
                                    //noinspection UnnecessaryLabelOnBreakStatement
                                    break outer_loop;
                                }

                                currentNode = nextNode;
                                CharSequence currentNodeEdgeCharacters = currentNode.getIncomingEdge();
                                // beth tirado:  Submitted bug http://code.google.com/p/concurrent-trees/issues/detail?id=6
                                // this is the fix.
                                if (currentNodeEdgeCharacters.length() + charsMatched > documentLength) {
                                    // this node can't be a match becasue it is too long
                                    return endOfData();
                                }
                                // beth tirado: end fix
                                int charsMatchedThisEdge = 0;
                                for (int i = 0, j = Math.min(currentNodeEdgeCharacters.length(), documentLength - charsMatched); i < j; i++) {
                                    if (currentNodeEdgeCharacters.charAt(i) != input.charAt(charsMatched + i)) {
                                        // Found a difference in chars between character in key and a character in current node.
                                        // Current node is the deepest match (inexact match)....
                                        break outer_loop;
                                    }
                                    charsMatchedThisEdge++;
                                }
                                if (charsMatchedThisEdge == currentNodeEdgeCharacters.length()) {
                                    // All characters in the current edge matched, add this number to total chars matched...
                                    charsMatched += charsMatchedThisEdge;

                                    if (currentNode.getValue() != null) {
                                        return new KeyValuePairImpl<O>(CharSequences.toString(input.subSequence(0, charsMatched)), currentNode.getValue());
                                    }
                                }
                            }
                            return endOfData();
                        }
                    };

Original comment by bethtir...@gmail.com on 7 Oct 2013 at 1:33

GoogleCodeExporter commented 9 years ago
Hi Beth, yes you are correct, this was a bug.

Thanks for identifying this and especially for supplying a patch. I have 
applied your patch, which fixes it.

I've released it as concurrent-trees 2.1.1, it should sync to Maven central in 
a few hours.

Original comment by ni...@npgall.com on 7 Oct 2013 at 9:03