luikore / triez

fast, efficient, unicode aware HAT trie with prefix / suffix support for Ruby
MIT License
140 stars 9 forks source link

Segfault on search_with_prefix when nodes >16,384 #3

Closed canadaduane closed 10 years ago

canadaduane commented 10 years ago
require 'triez'

t = Triez.new
16_385.times{ |i| t["a#{i}"] = rand(10)+1 }
t.search_with_prefix("a")

I'm running on Mac OS 10.9.4 with ruby 1.9.3p194. If I run the above with 16_384.times all is well. 16_385.times yields the following:

/Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/gems/1.9.1/gems/triez-1.0.2/lib/triez.rb:61: [BUG] Segmentation fault
ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin12.1.0]

-- Control frame information -----------------------------------------------
c:0005 p:---- s:0023 b:0023 l:000022 d:000022 CFUNC  :_internal_search
c:0004 p:0193 s:0016 b:0016 l:001728 d:001728 METHOD /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/gems/1.9.1/gems/triez-1.0.2/lib/triez.rb:61
c:0003 p:0054 s:0007 b:0007 l:000518 d:0000a0 EVAL   bug.rb:7
c:0002 p:---- s:0004 b:0004 l:000003 d:000003 FINISH
c:0001 p:0000 s:0002 b:0002 l:000518 d:000518 TOP   

-- Ruby level backtrace information ----------------------------------------
bug.rb:7:in `<main>'
/Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/gems/1.9.1/gems/triez-1.0.2/lib/triez.rb:61:in `search_with_prefix'
/Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/gems/1.9.1/gems/triez-1.0.2/lib/triez.rb:61:in `_internal_search'

-- C level backtrace information -------------------------------------------

   See Crash Report log file under ~/Library/Logs/CrashReporter or
   /Library/Logs/CrashReporter, for the more detail of.

-- Other runtime information -----------------------------------------------

* Loaded script: bug.rb

* Loaded features:

    0 enumerator.so
    1 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/x86_64-darwin12.1.0/enc/encdb.bundle
    2 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/x86_64-darwin12.1.0/enc/trans/transdb.bundle
    3 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/rubygems/defaults.rb
    4 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/x86_64-darwin12.1.0/rbconfig.rb
    5 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/rubygems/deprecate.rb
    6 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/rubygems/exceptions.rb
    7 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/rubygems/custom_require.rb
    8 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/rubygems.rb
    9 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/rubygems/version.rb
   10 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/rubygems/requirement.rb
   11 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/rubygems/platform.rb
   12 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/rubygems/specification.rb
   13 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/rubygems/path_support.rb
   14 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/rubygems/dependency.rb
   15 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/gems/1.9.1/gems/triez-1.0.2/ext/triez.bundle
   16 /Users/duane/.rbenv/versions/1.9.3-p194/lib/ruby/gems/1.9.1/gems/triez-1.0.2/lib/triez.rb

[NOTE]
You may have encountered a bug in the Ruby interpreter or extension libraries.
Bug reports are welcome.
For details: http://www.ruby-lang.org/bugreport.html

Abort trap: 6
canadaduane commented 10 years ago

Narrowed the cause down to a null node or stack:

static void hattrie_iter_nextnode(hattrie_iter_t* i)
{
    if (i->stack == NULL) return;

    /* pop the stack */
    node_ptr node;
    hattrie_node_stack_t* next;
    unsigned char   c;
    size_t level;

    node  = i->stack->node;

at this point, node is 0 (NULL).

canadaduane commented 10 years ago

Or rather, node.flag is NULL:

    node  = i->stack->node;
    next  = i->stack->next;
    c     = i->stack->c;
    level = i->stack->level;

    free(i->stack);
    i->stack = next;

    assert(node.flag != NULL);
    if (*node.flag & NODE_TYPE_TRIE) {
Assertion failed: (node.flag != NULL), function hattrie_iter_nextnode, file ../hat-trie/hat-trie.c, line 565.
luikore commented 10 years ago

Thanks for reporting, I'm looking into it.

canadaduane commented 10 years ago

Not sure if this is helpful, but I commented out node.flag = NULL; on line 127 of hat-trie.c and found that the segfault is fixed. However, searching a Triez with more than 16,384 nodes will return zero results (<=16,384 nodes does return results)

canadaduane commented 10 years ago

Any further progress? I'm still trying to track it down. If you have any ideas, I'd be happy to hear and possibly help (I'm still trying to understand how it all works, however).

luikore commented 10 years ago

Sorry for the late response. I got other work to do in the last days... Now I'm looking back this.

The flag field is the first byte in tree_node_t.xs, which tags whether the node has value, and the type of the data chunk. The types are:

The logic inside is quite complex...

Seems hattrie_find(hattrie_t* T, const char **key, size_t *len) has set the flag to NULL due to some unkown reason? I'm still looking...

luikore commented 10 years ago

I got it. It is because I hattrie_find() sets node.flag = NULL for nodes that doesn't contain value, I reused it for prefixed-iteration without aware of that field could be NULL! Thanks very much for finding it out!

The fix: https://github.com/luikore/hat-trie/commit/85bb89464a6a4be10b3f7449a69c75d463f52098

I'm bumping a new version for the gem, plz see if it works :)

If you are compiling from source, plz run rake glob_src again.

canadaduane commented 10 years ago

So this appears to fix the segfault, but it returns incorrect data. i.e. it doesn't return any data for >16,384 nodes:

require 'triez'
t = Triez.new
3.times{ |i| t["a#{i}"] = rand(10)+1 }
p t.search_with_prefix("a")
# [["1", 2], ["2", 3], ["0", 3]]
require 'triez'
t = Triez.new
16_385.times{ |i| t["a#{i}"] = rand(10)+1 }
p t.search_with_prefix("a")
# []
luikore commented 10 years ago

Sorry, my bad for not checking further correctness. Just pushed the fix: https://github.com/luikore/hat-trie/commit/4556cf4e7fcf890c5a53dd7c29b615311cebeead

hope it works :hourglass:

canadaduane commented 10 years ago

Awesome! Looks great here. [SNIP: moving other comments to separate issue]

luikore commented 10 years ago

:)