erickguan / cppjieba_rb

Cppjieba Ruby binding
MIT License

[gcc8]discourse cppjieba failure caused 502 error #2

Closed marguerite closed 6 years ago

marguerite commented 6 years ago
ster worker 0: 8644 [discourse]: ../cppjieba/include/cppjieba/Trie.hpp:150: void cppjieba::Trie::CreateTrie(const std::vector >&, const std::vector&): Assertion `keys.size() == valuePointers.size()' failed.

What does this mean?

I see many such messages in puma.err.log.

Every time I post in Chinese, one such message is triggered.

discourse version 2.2.0.beta2~git106.a530606da7

cppjieba_rb version 0.3.0

erickguan commented 6 years ago

It looks like loading the dictionary fails. Have you tried a rebuild?

marguerite commented 6 years ago

Unfortunately, I am using the openSUSE package maintained by darix...

I added some debug statements to lib/search.rb.

It looks like the data wasn't segmented. Cppjieba's internals fail immediately, no matter what the text is.

marguerite commented 6 years ago

I added another debug statement to lib/cppjieba_rb/segment.rb, which printed:

str: 双系统,重装windows后grub没了 @mode: mix @max_word_length: 8 @hmm: true

The str is a topic title; these puma error messages were produced when I was replying to this topic.

You can test cppjieba_rb with these options.

marguerite commented 6 years ago
ruby: ../cppjieba/include/cppjieba/Trie.hpp:150: void cppjieba::Trie::CreateTrie(const std::vector >&, const std::vector&): Assertion `keys.size() == valuePointers.size()' failed.
Aborted (core dumped)

This is my cppjieba_test.rb:

require 'cppjieba_rb'

str = "双系统,重装windows后grub没了"

p CppjiebaRb.segment(str, mode: :mix)

It looks like openSUSE’s ruby2.5-rubygem-cppjieba_rb has some problems. Here’s its build log:

https://build.opensuse.org/package/live_build_log/home:darix:apps/rubygem-cppjieba_rb/openSUSE_Tumbleweed/x86_64

We are using gcc8 in Tumbleweed now, and I saw some warnings like “implicit declaration of functions”, which in my view cannot be safely ignored.

Can you please take a look at it?

erickguan commented 6 years ago

Thanks for the details.

> We are using gcc8 in tumbleweed now, and I saw some warnings like “implicit declaration of functions” which should not be safely ignored in my view.

gcc8 should be just fine, even with this warning. Is the locale set correctly? Also, can you run the tests when building?

marguerite commented 6 years ago

@fantasticfears

My Discourse's default locale is "中文" (Chinese), and rake test returns the same error.

I debugged the cppjieba_rb build compiled on Tumbleweed with gcc8 using gdb, and here's the backtrace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff6d384e9 in __GI_abort () at abort.c:79
#2  0x00007ffff6d383c1 in __assert_fail_base (fmt=0x7ffff6e9c0f0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x7ffff7e660c0 "keys.size() == valuePointers.size()", 
    file=0x7ffff7e663e8 "/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Trie.hpp", line=150, function=) at assert.c:92
#3  0x00007ffff6d476f2 in __GI___assert_fail (assertion=assertion@entry=0x7ffff7e660c0 "keys.size() == valuePointers.size()", 
    file=file@entry=0x7ffff7e663e8 "/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Trie.hpp", line=line@entry=150, 
    function=function@entry=0x7ffff7e66b60 , std::allocator > > const&, std::vector > const&)::__PRETTY_FUNCTION__> "void cppjieba::Trie::CreateTrie(const std::vector >&, const std::vector&)") at assert.c:101
#4  0x00007ffff7e57e34 in cppjieba::Trie::CreateTrie (valuePointers=std::vector of length 348986, capacity 524288 = {...}, keys=std::vector of length -1599287958651, capacity -1599288096385 = {...}, 
    this=0x78e1f8) at /usr/include/c++/8/ext/new_allocator.h:86
#5  cppjieba::Trie::Trie (valuePointers=std::vector of length 348986, capacity 524288 = {...}, keys=std::vector of length -1599287958651, capacity -1599288096385 = {...}, this=0x78e188)
    at /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Trie.hpp:55
#6  cppjieba::DictTrie::CreateTrie (dictUnits=std::vector of length -1099511627664, capacity 66665360945910383 = {...}, this=0x7fffffffc800)
    at /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/DictTrie.hpp:182
#7  cppjieba::DictTrie::Init (user_word_weight_opt=cppjieba::DictTrie::WordWeightMedian, user_dict_paths="\320\017m\000\000\000\000\000ou/Dev/cppjieba_rb/lib/../ext/cppjieba/dict/user.dict.utf8", 
    dict_path="\000\017w\000\000\000\000\000ou/Dev/cppjieba_rb/lib/../ext/cppjieba/dict/jieba.dict.utf8", this=0x7fffffffc800)
    at /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/DictTrie.hpp:170
#8  cppjieba::DictTrie::DictTrie (user_word_weight_opt=cppjieba::DictTrie::WordWeightMedian, user_dict_paths="\320\017m\000\000\000\000\000ou/Dev/cppjieba_rb/lib/../ext/cppjieba/dict/user.dict.utf8", 
    dict_path="\000\017w\000\000\000\000\000ou/Dev/cppjieba_rb/lib/../ext/cppjieba/dict/jieba.dict.utf8", this=0x7fffffffc800)
    at /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/DictTrie.hpp:36
#9  cppjieba::Jieba::Jieba (stopWordPath="", 
    idfPath="\320\312\006\005\000\000\000\000\325\005\000\000\000\000\000\000@q\a\005\000\000\000\000\377\004\000\000\000\000\000\000\000\000\200?\000\000\000\000\325\005", '\000' , "A\000\000", user_dict_path="\320\017m\000\000\000\000\000ou/Dev/cppjieba_rb/lib/../ext/cppjieba/dict/user.dict.utf8", 
    model_path="\260\277q\000\000\000\000\000ou/Dev/cppjieba_rb/lib/../ext/cppjieba/dict/hmm_model.utf8", dict_path="\000\017w\000\000\000\000\000ou/Dev/cppjieba_rb/lib/../ext/cppjieba/dict/jieba.dict.utf8", 
    this=0x7fffffffc800) at /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Jieba.hpp:23
#10 internal_initialize (self=, dict_path=, model_path=, user_dict_path=, idf_path=, stop_word_path=)
    at ../../../../ext/cppjieba_rb/internal.cc:71
#11 0x00007ffff7cd5b26 in vm_call0_cfunc_with_frame (ci=, cc=0x7ffff7f6cef0, argv=0x7ffff7e6d0d0, calling=, ec=0x407a98) at vm_eval.c:85
#12 vm_call0_cfunc (argv=0x7ffff7e6d0d0, cc=0x7ffff7f6cef0, ci=, calling=0x7fffffffca90, ec=0x407a98) at vm_eval.c:100
#13 vm_call0_body (ec=ec@entry=0x407a98, calling=calling@entry=0x7fffffffcb40, ci=ci@entry=0x7fffffffcb30, cc=cc@entry=0x7fffffffcb60, argv=argv@entry=0x7ffff7e6d0d0) at vm_eval.c:131
#14 0x00007ffff7cd633f in vm_call0 (me=, argv=0x7ffff7e6d0d0, argc=5, id=3057, recv=7559800, ec=0x407a98) at vm_eval.c:58
#15 rb_call0 (ec=0x407a98, recv=recv@entry=7559800, mid=mid@entry=3057, argc=5, argc@entry=3057, argv=0x7ffff7e6d0d0, argv@entry=0x5, scope=scope@entry=CALL_FCALL, self=7621880) at vm_eval.c:296
#16 0x00007ffff7cd69ce in rb_call (scope=CALL_FCALL, argv=0x5, argc=3057, mid=3057, recv=7559800) at vm_eval.c:589
#17 rb_funcallv (recv=recv@entry=7559800, mid=mid@entry=3057, argc=argc@entry=5, argv=argv@entry=0x7ffff7e6d0d0) at vm_eval.c:815
#18 0x00007ffff7b78166 in rb_obj_call_init (obj=obj@entry=7559800, argc=argc@entry=5, argv=argv@entry=0x7ffff7e6d0d0) at eval.c:1589
#19 0x00007ffff7bdbb71 in rb_class_s_new (argc=5, argv=0x7ffff7e6d0d0, klass=) at object.c:2152
#20 0x00007ffff7cc2269 in vm_call_cfunc_with_frame (ci=0x5ea990, cc=, calling=, reg_cfp=0x7ffff7f6cf20, ec=0x407a98) at vm_insnhelper.c:1918
#21 vm_call_cfunc (ec=ec@entry=0x407a98, reg_cfp=reg_cfp@entry=0x7ffff7f6cf20, calling=calling@entry=0x7fffffffce00, ci=ci@entry=0x5ea990, cc=) at vm_insnhelper.c:1934
#22 0x00007ffff7cd398c in vm_call_method_each_type (ec=ec@entry=0x407a98, cfp=cfp@entry=0x7ffff7f6cf20, calling=0x7fffffffce00, ci=0x5ea990, cc=) at vm_insnhelper.c:2232
#23 0x00007ffff7cd3fdb in vm_call_method_each_type (cc=, ci=, calling=, cfp=, ec=) at vm_insnhelper.c:2381
#24 vm_call_method (ec=0x407a98, cfp=0x7ffff7f6cf20, calling=, ci=, cc=) at vm_insnhelper.c:2381
#25 0x00007ffff7ccd346 in vm_exec_core (ec=0x2, ec@entry=0x407a98, initial=140737488339856, initial@entry=0) at insns.def:915
#26 0x00007ffff7cd1bdd in vm_exec (ec=0x407a98) at vm.c:1778
#27 0x00007ffff7cd585b in rb_iseq_eval_main (iseq=iseq@entry=0x7523f8) at vm.c:2026
#28 0x00007ffff7b733e4 in ruby_exec_internal (n=0x7523f8) at eval.c:246
#29 0x00007ffff7b7524d in ruby_exec_node (n=, n@entry=0x7523f8) at eval.c:310
#30 0x00007ffff7b77bce in ruby_run_node (n=0x7523f8) at eval.c:302
#31 0x00000000004010db in main (argc=, argv=) at ./main.c:42

Hope it helps.

marguerite commented 6 years ago

I can confirm it's a gcc8 issue.

I explicitly added:

CONFIG["CC"] = "/usr/bin/gcc-7"
CONFIG["CXX"] = "/usr/bin/g++-7"

in ext/cppjieba_rb/extconf.rb, and rake test then ran successfully.
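When switching compilers through extconf.rb, it can be useful to confirm which compiler actually built the extension. A small sketch, assuming GCC or Clang (both predefine `__VERSION__`); `CompilerDescription` is a hypothetical helper and not part of cppjieba_rb:

```cpp
#include <string>

// Returns a description of the compiler that built this translation unit.
// __VERSION__ is predefined by GCC and Clang; other compilers may lack it,
// in which case a placeholder string is returned instead.
std::string CompilerDescription() {
#if defined(__VERSION__)
    return std::string(__VERSION__);
#else
    return std::string("unknown compiler");
#endif
}
```

Printing this string once at load time would show whether the gcc-7 override took effect.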

Here's gcc7's build log:

mkdir -p tmp/x86_64-linux/cppjieba_rb/2.5.0
cd tmp/x86_64-linux/cppjieba_rb/2.5.0
/home/zhou/.rvm/rubies/ruby-2.5.0/bin/ruby -I. ../../../../ext/cppjieba_rb/extconf.rb
creating Makefile
cd -
cd tmp/x86_64-linux/cppjieba_rb/2.5.0
/usr/bin/gmake
compiling ../../../../ext/cppjieba_rb/cppjieba_rb.c
../../../../ext/cppjieba_rb/cppjieba_rb.c: In function ‘Init_cppjieba_rb’:
../../../../ext/cppjieba_rb/cppjieba_rb.c:9:5: warning: implicit declaration of function ‘Init_internal’; did you mean ‘rb_intern2’? [-Wimplicit-function-declaration]
     Init_internal();
     ^~~~~~~~~~~~~
     rb_intern2
../../../../ext/cppjieba_rb/cppjieba_rb.c: At top level:
cc1: warning: unrecognized command line option ‘-Wno-self-assign’
cc1: warning: unrecognized command line option ‘-Wno-constant-logical-operand’
cc1: warning: unrecognized command line option ‘-Wno-parentheses-equality’
compiling ../../../../ext/cppjieba_rb/internal.cc
cc1plus: warning: command line option ‘-Wimplicit-int’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wdeclaration-after-statement’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wimplicit-function-declaration’ is valid for C/ObjC but not for C++
In file included from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/DictTrie.hpp:13:0,
                 from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/QuerySegment.hpp:8,
                 from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Jieba.hpp:4,
                 from ../../../../ext/cppjieba_rb/internal.cc:8:
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/deps/limonp/StringUtil.hpp: In function ‘std::__cxx11::string limonp::StringFormat(const char*, ...)’:
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/deps/limonp/StringUtil.hpp:35:57: warning: function ‘std::__cxx11::string limonp::StringFormat(const char*, ...)’ might be a candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
     int n = vsnprintf((char *)str.c_str(), size, fmt, ap);
                                                         ^
../../../../ext/cppjieba_rb/internal.cc: In function ‘VALUE internal_initialize(VALUE, VALUE, VALUE, VALUE, VALUE, VALUE)’:
../../../../ext/cppjieba_rb/internal.cc:79:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^
At global scope:
cc1plus: warning: unrecognized command line option ‘-Wno-self-assign’
cc1plus: warning: unrecognized command line option ‘-Wno-constant-logical-operand’
cc1plus: warning: unrecognized command line option ‘-Wno-parentheses-equality’
linking shared-object cppjieba_rb/cppjieba_rb.so

And here's gcc8's:

mkdir -p tmp/x86_64-linux/cppjieba_rb/2.5.0
cd tmp/x86_64-linux/cppjieba_rb/2.5.0
/home/zhou/.rvm/rubies/ruby-2.5.0/bin/ruby -I. ../../../../ext/cppjieba_rb/extconf.rb
creating Makefile
cd -
cd tmp/x86_64-linux/cppjieba_rb/2.5.0
/usr/bin/gmake
compiling ../../../../ext/cppjieba_rb/cppjieba_rb.c
../../../../ext/cppjieba_rb/cppjieba_rb.c: In function ‘Init_cppjieba_rb’:
../../../../ext/cppjieba_rb/cppjieba_rb.c:9:5: warning: implicit declaration of function ‘Init_internal’; did you mean ‘rb_intern2’? [-Wimplicit-function-declaration]
     Init_internal();
     ^~~~~~~~~~~~~
     rb_intern2
../../../../ext/cppjieba_rb/cppjieba_rb.c: At top level:
cc1: warning: unrecognized command line option ‘-Wno-self-assign’
cc1: warning: unrecognized command line option ‘-Wno-constant-logical-operand’
cc1: warning: unrecognized command line option ‘-Wno-parentheses-equality’
compiling ../../../../ext/cppjieba_rb/internal.cc
cc1plus: warning: command line option ‘-Wimplicit-int’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wdeclaration-after-statement’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wimplicit-function-declaration’ is valid for C/ObjC but not for C++
In file included from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/DictTrie.hpp:13,
                 from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/QuerySegment.hpp:8,
                 from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Jieba.hpp:4,
                 from ../../../../ext/cppjieba_rb/internal.cc:8:
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/deps/limonp/StringUtil.hpp: In function ‘std::__cxx11::string limonp::StringFormat(const char*, ...)’:
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/deps/limonp/StringUtil.hpp:35:57: warning: function ‘std::__cxx11::string limonp::StringFormat(const char*, ...)’ might be a candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
     int n = vsnprintf((char *)str.c_str(), size, fmt, ap);
                                                         ^
In file included from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Unicode.hpp:9,
                 from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/DictTrie.hpp:15,
                 from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/QuerySegment.hpp:8,
                 from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Jieba.hpp:4,
                 from ../../../../ext/cppjieba_rb/internal.cc:8:
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/deps/limonp/LocalVector.hpp: In instantiation of ‘void limonp::LocalVector::reserve(size_t) [with T = std::pair; size_t = long unsigned int]’:
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/deps/limonp/LocalVector.hpp:83:7:   required from ‘void limonp::LocalVector::push_back(const T&) [with T = std::pair]’
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Trie.hpp:99:81:   required from here
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/deps/limonp/LocalVector.hpp:95:11: warning: ‘void* memcpy(void*, const void*, size_t)’ writing to an object of type ‘struct std::pair’ with no trivial copy-assignment; use copy-assignment or copy-initialization instead [-Wclass-memaccess]
     memcpy(ptr_, old, sizeof(T) * capacity_);
     ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/8/bits/stl_algobase.h:64,
                 from /usr/include/c++/8/bits/char_traits.h:39,
                 from /usr/include/c++/8/string:40,
                 from ../../../../ext/cppjieba_rb/internal.cc:4:
/usr/include/c++/8/bits/stl_pair.h:198:12: note: ‘struct std::pair’ declared here
     struct pair
            ^~~~
In file included from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Unicode.hpp:9,
                 from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/DictTrie.hpp:15,
                 from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/QuerySegment.hpp:8,
                 from /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Jieba.hpp:4,
                 from ../../../../ext/cppjieba_rb/internal.cc:8:
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/deps/limonp/LocalVector.hpp: In instantiation of ‘limonp::LocalVector& limonp::LocalVector::operator=(const limonp::LocalVector&) [with T = std::pair]’:
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/deps/limonp/LocalVector.hpp:33:11:   required from ‘limonp::LocalVector::LocalVector(const limonp::LocalVector&) [with T = std::pair]’
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Trie.hpp:28:8:   required from ‘void std::_Construct(_T1*, _Args&& ...) [with _T1 = cppjieba::Dag; _Args = {const cppjieba::Dag&}]’
/usr/include/c++/8/bits/stl_uninitialized.h:83:18:   required from ‘static _ForwardIterator std::__uninitialized_copy<_TrivialValueTypes>::__uninit_copy(_InputIterator, _InputIterator, _ForwardIterator) [with _InputIterator = const cppjieba::Dag*; _ForwardIterator = cppjieba::Dag*; bool _TrivialValueTypes = false]’
/usr/include/c++/8/bits/stl_uninitialized.h:134:15:   required from ‘_ForwardIterator std::uninitialized_copy(_InputIterator, _InputIterator, _ForwardIterator) [with _InputIterator = const cppjieba::Dag*; _ForwardIterator = cppjieba::Dag*]’
/usr/include/c++/8/bits/stl_uninitialized.h:289:37:   required from ‘_ForwardIterator std::__uninitialized_copy_a(_InputIterator, _InputIterator, _ForwardIterator, std::allocator<_Tp>&) [with _InputIterator = const cppjieba::Dag*; _ForwardIterator = cppjieba::Dag*; _Tp = cppjieba::Dag]’
/usr/include/c++/8/bits/stl_uninitialized.h:311:2:   required from ‘_ForwardIterator std::__uninitialized_move_if_noexcept_a(_InputIterator, _InputIterator, _ForwardIterator, _Allocator&) [with _InputIterator = cppjieba::Dag*; _ForwardIterator = cppjieba::Dag*; _Allocator = std::allocator]’
/usr/include/c++/8/bits/vector.tcc:611:44:   required from ‘void std::vector<_Tp, _Alloc>::_M_default_append(std::vector<_Tp, _Alloc>::size_type) [with _Tp = cppjieba::Dag; _Alloc = std::allocator; std::vector<_Tp, _Alloc>::size_type = long unsigned int]’
/usr/include/c++/8/bits/stl_vector.h:827:4:   required from ‘void std::vector<_Tp, _Alloc>::resize(std::vector<_Tp, _Alloc>::size_type) [with _Tp = cppjieba::Dag; _Alloc = std::allocator; std::vector<_Tp, _Alloc>::size_type = long unsigned int]’
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Trie.hpp:86:27:   required from here
/home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/deps/limonp/LocalVector.hpp:63:13: warning: ‘void* memcpy(void*, const void*, size_t)’ writing to an object of type ‘struct std::pair’ with no trivial copy-assignment; use copy-assignment or copy-initialization instead [-Wclass-memaccess]
       memcpy(ptr_, vec.ptr_, vec.size() * sizeof(T));
       ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/8/bits/stl_algobase.h:64,
                 from /usr/include/c++/8/bits/char_traits.h:39,
                 from /usr/include/c++/8/string:40,
                 from ../../../../ext/cppjieba_rb/internal.cc:4:
/usr/include/c++/8/bits/stl_pair.h:198:12: note: ‘struct std::pair’ declared here
     struct pair
            ^~~~
../../../../ext/cppjieba_rb/internal.cc: In function ‘VALUE internal_initialize(VALUE, VALUE, VALUE, VALUE, VALUE, VALUE)’:
../../../../ext/cppjieba_rb/internal.cc:73:54: warning: control reaches end of non-void function [-Wreturn-type]
     std::ifstream ifs(StringValueCStr(stop_word_path));
                                                      ^
../../../../ext/cppjieba_rb/internal.cc:59:7: warning: function might be candidate for attribute ‘noreturn’ [-Wsuggest-attribute=noreturn]
 VALUE internal_initialize(VALUE self,
       ^~~~~~~~~~~~~~~~~~~
At global scope:
cc1plus: warning: unrecognized command line option ‘-Wno-self-assign’
cc1plus: warning: unrecognized command line option ‘-Wno-constant-logical-operand’
cc1plus: warning: unrecognized command line option ‘-Wno-parentheses-equality’
linking shared-object cppjieba_rb/cppjieba_rb.so

Hope it helps.

erickguan commented 6 years ago

The build log is identical, and a new build at https://travis-ci.org/fantasticfears/cppjieba_rb has completed with Ruby 2.5.1.

Here is something interesting:

#4  0x00007ffff7e57e34 in cppjieba::Trie::CreateTrie (valuePointers=std::vector of length 348986, capacity 524288 = {...}, keys=std::vector of length -1599287958651, capacity -1599288096385 = {...}, 
    this=0x78e1f8) at /usr/include/c++/8/ext/new_allocator.h:86
#5  cppjieba::Trie::Trie (valuePointers=std::vector of length 348986, capacity 524288 = {...}, keys=std::vector of length -1599287958651, capacity -1599288096385 = {...}, this=0x78e188)
    at /home/zhou/Dev/cppjieba_rb/ext/cppjieba_rb/../cppjieba/include/cppjieba/Trie.hpp:55

The keys vector may have overflowed. So at DictTrie.hpp:95, either the Unicode class wrote out of bounds for some reason, or the shipped STL has a problem. Valgrind might give you more details.
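The assertion that fires compares the sizes of two vectors that are meant to be filled in lockstep, so the negative `keys` length in the backtrace suggests the vector object itself was corrupted rather than filled incorrectly. A simplified sketch of that invariant, using hypothetical names (`DictEntry`, `BuildParallel`, `SizesMatch`) rather than cppjieba's actual types:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified dictionary entry (not cppjieba's DictUnit).
struct DictEntry {
    std::string word;
    double weight;
};

// Stand-in for the precondition that the trie constructor asserts.
bool SizesMatch(const std::vector<std::string>& keys,
                const std::vector<const DictEntry*>& valuePointers) {
    return keys.size() == valuePointers.size();
}

// Fills two parallel vectors: one key and one pointer per entry.
void BuildParallel(const std::vector<DictEntry>& entries,
                   std::vector<std::string>& keys,
                   std::vector<const DictEntry*>& valuePointers) {
    keys.clear();
    valuePointers.clear();
    for (std::size_t i = 0; i < entries.size(); ++i) {
        keys.push_back(entries[i].word);
        valuePointers.push_back(&entries[i]);
    }
    // Both vectors grew by one element per iteration, so this must hold.
    assert(SizesMatch(keys, valuePointers));
}
```

When the vectors are built this way their sizes cannot diverge on their own, which is why a mismatch points toward memory corruption or miscompiled code somewhere before the check.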

marguerite commented 6 years ago

@fantasticfears

I was about to debug it with valgrind, but interestingly, after I changed the CXXFLAGS from O3 to O0, the tests passed again.

Since I can’t reproduce the problem with O0, there’s no way to run it under valgrind.

I played around with other optimization levels: O1 caused a different problem, while O2 and O3 both reproduced our current issue.

Any idea?

marguerite commented 6 years ago

I ran:

valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all --log-file=./valgrind.log --num-callers=15 --track-fds=yes --trace-children=yes rake test

which generated a 19 MB valgrind.log:

https://transfer.sh/QaL2Q/valgrind.log

erickguan commented 6 years ago

There is nothing useful in this log. My guess is that GCC 8 purges the local scope. However, that still doesn't seem reasonable to me. Maybe it's an upstream problem.

marguerite commented 6 years ago

https://transfer.sh/Jw1KQ/valgrind.log

What about this one? I ran valgrind against my cppjieba_test.rb and saw some output for internal_initialize, so it might help.

erickguan commented 6 years ago

Can you try compiling this and running a demo in the same environment? https://github.com/yanyiwu/cppjieba

marguerite commented 6 years ago

@fantasticfears

It’s weird. I compiled and ran the demo successfully without any segfault...

So the problem is not in cppjieba’s codebase...

erickguan commented 6 years ago

That's really good. Now we know the problem is in my implementation. I've fixed the warning that was the likely cause, and it passed the test task with llvm-7.

marguerite commented 6 years ago

https://github.com/fantasticfears/cppjieba_rb/commit/aa6ef94fff1607744fbaf3fed554a53273b089c7

This commit fixed the issue, specifically the added “return self”. I don’t know why openSUSE’s gcc behaves like that, but this is the fix.

erickguan commented 6 years ago

With undefined behavior, the compiler is free to do anything, so there's no need to understand exactly why gcc behaved that way. Thanks for the bug report and your time.
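For readers wondering what the “return self” fix addresses: in C++, flowing off the end of a function with a non-void return type is undefined behavior, which matches the `-Wreturn-type` warning reported for `internal_initialize` in both build logs. A minimal sketch of the pattern follows. `VALUE` is redefined here as a stand-in so the example compiles without ruby.h, and `initialize_fixed` and `g_initialized` are hypothetical names, not the code from the linked commit:

```cpp
#include <cstdint>

// Stand-in for Ruby's VALUE handle (an assumption for this sketch;
// the real extension gets VALUE from the Ruby headers).
typedef std::uintptr_t VALUE;

// Hypothetical stand-in for the state set up by the initializer.
static int g_initialized = 0;

// Broken shape, left as a comment because calling it is undefined behavior:
//
//   VALUE initialize_broken(VALUE self) {
//       g_initialized = 1;
//   }   // no return statement: -Wreturn-type warns here
//
// Fixed shape: every path through the function returns a value.
VALUE initialize_fixed(VALUE self) {
    g_initialized = 1;
    return self;
}
```

Building with `-Werror=return-type` is one way to turn this class of warning into a hard error so it cannot slip through a packaging build.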