reverbrain / elliptics

Distributed storage for medium and large objects with streaming support
http://reverbrain.com/elliptics
GNU Lesser General Public License v3.0

[CentOS 6] SIGSEGV, Segmentation fault on node start #689

Closed agend closed 8 years ago

agend commented 8 years ago

It looks like some variable isn't initialized. Environment: CentOS 6.5 x64, Elliptics version 2.26.10.1.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffeabae5700 (LWP 14022)]
WriteString (this=0x7ffeabadcf30, handler=...) at /usr/include/handystats/rapidjson/writer.h:198
198                             const Ch c = is.Peek();
(gdb) list 181,198
181             void WriteString(const Ch* str, SizeType length)  {
182                     static const char hexDigits[16] = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F' };
183                     static const char escape[256] = {
184     #define Z16 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
185                             //0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
186                             'u', 'u', 'u', 'u', 'u', 'u', 'u', 'u', 'b', 't', 'n', 'u', 'f', 'r', 'u', 'u', // 00
187                             'u', 'u', 'u', 'u', 'u', 'u', 'u', 'u', 'u', 'u', 'u', 'u', 'u', 'u', 'u', 'u', // 10
188                               0,   0, '"',   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0, // 20
189                             Z16, Z16,                                                                                                                                       // 30~4F
190                               0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,'\\',   0,   0,   0, // 50
191                             Z16, Z16, Z16, Z16, Z16, Z16, Z16, Z16, Z16, Z16                                                                // 60~FF
192     #undef Z16
193                     };
194
195                     os_.Put('\"');
196                     GenericStringStream<SourceEncoding> is(str);
197                     while (is.Tell() < length) {
198                             const Ch c = is.Peek();
(gdb) print str
$2 = 0xd00007fff <Address 0xd00007fff out of bounds>
(gdb) bt
#0  WriteString (this=0x7ffeabadcf30, handler=...) at /usr/include/handystats/rapidjson/writer.h:198
#1  String (this=0x7ffeabadcf30, handler=...) at /usr/include/handystats/rapidjson/writer.h:54
#2  rapidjson::GenericValue<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> >::Accept<rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<char>, rapidjson::CrtAllocator>, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> > > (this=0x7ffeabadcf30, handler=...) at /usr/include/handystats/rapidjson/document.h:530
#3  0x00007ffff73a3c18 in eblob_json_commit (b=0x7fff780023c0) at /builds/group-d/eblob-original-id-mod/rpmbuild/BUILD/eblob-0.23.12/library/json_stat.cpp:281
#4  0x00007ffff73856ae in eblob_periodic (b=0x7fff780023c0) at /builds/group-d/eblob-original-id-mod/rpmbuild/BUILD/eblob-0.23.12/library/blob.c:2902
#5  0x00007ffff7386370 in eblob_periodic_thread (data=0x7fff780023c0) at /builds/group-d/eblob-original-id-mod/rpmbuild/BUILD/eblob-0.23.12/library/blob.c:2867
#6  0x00007ffff6868851 in start_thread () from /lib64/libpthread.so.0
#7  0x00007ffff512790d in clone () from /lib64/libc.so.6
agend commented 8 years ago

Any news on this issue?

abudnik commented 8 years ago

You can send a patch.

agend commented 8 years ago

I hoped it was a known issue :) I think it's specific to CentOS 6. Have you seen this issue on any other platform?

abudnik commented 8 years ago

No, we haven't seen this issue anywhere else.

bioothod commented 8 years ago
(gdb) print str
$2 = 0xd00007fff <Address 0xd00007fff out of bounds>

This looks like an address from the trace shifted by 32 bits, which probably indicates a bug in the rapidjson memory allocator. Do you have the addresses of other structures in higher frames? They might help find out where this strange address came from.
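To make the "shifted by 32 bits" observation concrete, here is a minimal C++ sketch that splits 64-bit values from the trace into their 32-bit halves. The helpers `high_word`, `low_word` and `dump_words` are made up for illustration and are not part of rapidjson or eblob; only the two addresses are taken from the gdb output above.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical helpers (not rapidjson/eblob code): split a 64-bit value
// into its upper and lower 32-bit words.
constexpr std::uint32_t high_word(std::uint64_t v) {
    return static_cast<std::uint32_t>(v >> 32);
}
constexpr std::uint32_t low_word(std::uint64_t v) {
    return static_cast<std::uint32_t>(v & 0xffffffffULL);
}

// Values copied from the gdb session above.
constexpr std::uint64_t kBadStr   = 0xd00007fffULL;     // result of `print str`
constexpr std::uint64_t kValidPtr = 0x7fff780023c0ULL;  // `b` in eblob_json_commit

// Print both halves of a value so the pattern is easy to compare by eye.
inline void dump_words(const char* label, std::uint64_t v) {
    std::printf("%-10s high=0x%08x low=0x%08x\n", label,
                static_cast<unsigned>(high_word(v)),
                static_cast<unsigned>(low_word(v)));
}
```

Calling `dump_words("bad", kBadStr)` and `dump_words("valid", kValidPtr)` shows that the low word of the bad pointer (0x00007fff) matches the high word of a valid pointer from the same trace. That is the pattern one would expect if an 8-byte pointer were read 4 bytes off, although it does not by itself say where the misread happens.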

agend commented 8 years ago

@bioothod sure.

Backtrace:

(gdb) bt
#0  WriteString (this=0x7ffeabadcf30, handler=...) at /usr/include/handystats/rapidjson/writer.h:198
#1  String (this=0x7ffeabadcf30, handler=...) at /usr/include/handystats/rapidjson/writer.h:54
#2  rapidjson::GenericValue<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> >::Accept<rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<char>, rapidjson::CrtAllocator>, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> > > (this=0x7ffeabadcf30, handler=...) at /usr/include/handystats/rapidjson/document.h:530
#3  0x00007ffff73a3c18 in eblob_json_commit (b=0x7fff780023c0) at /builds/group-d/eblob-original-id-mod/rpmbuild/BUILD/eblob-0.23.12/library/json_stat.cpp:281
#4  0x00007ffff73856ae in eblob_periodic (b=0x7fff780023c0) at /builds/group-d/eblob-original-id-mod/rpmbuild/BUILD/eblob-0.23.12/library/blob.c:2902
#5  0x00007ffff7386370 in eblob_periodic_thread (data=0x7fff780023c0) at /builds/group-d/eblob-original-id-mod/rpmbuild/BUILD/eblob-0.23.12/library/blob.c:2867
#6  0x00007ffff6868851 in start_thread () from /lib64/libpthread.so.0
#7  0x00007ffff512790d in clone () from /lib64/libc.so.6

Frame 0:

#0  WriteString (this=0x7ffeabadcf30, handler=...) at /usr/include/handystats/rapidjson/writer.h:198
198                             const Ch c = is.Peek();
(gdb) list
193                     };
194
195                     os_.Put('\"');
196                     GenericStringStream<SourceEncoding> is(str);
197                     while (is.Tell() < length) {
198                             const Ch c = is.Peek();
199                             if ((sizeof(Ch) == 1 || (unsigned)c < 256) && escape[(unsigned char)c])  {
200                                     is.Take();
201                                     os_.Put('\\');
202                                     os_.Put(escape[(unsigned char)c]);

Frame 1:

(gdb) frame 1
#1  String (this=0x7ffeabadcf30, handler=...) at /usr/include/handystats/rapidjson/writer.h:54
54                      WriteString(str, length);
(gdb) list
49              Writer& Double(double d)                { Prefix(kNumberType); WriteDouble(d);          return *this; }
50
51              Writer& String(const Ch* str, SizeType length, bool copy = false) {
52                      (void)copy;
53                      Prefix(kStringType);
54                      WriteString(str, length);
55                      return *this;
56              }
57
58              Writer& StartObject() {

Frame 2:

(gdb) frame 2
#2  rapidjson::GenericValue<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> >::Accept<rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<char>, rapidjson::CrtAllocator>, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> > > (this=0x7ffeabadcf30, handler=...) at /usr/include/handystats/rapidjson/document.h:530
530                                     handler.String(m->name.data_.s.str, m->name.data_.s.length, false);
(gdb) list
525                     case kTrueType:         handler.Bool(true); break;
526
527                     case kObjectType:
528                             handler.StartObject();
529                             for (Member* m = data_.o.members; m != data_.o.members + data_.o.size; ++m) {
530                                     handler.String(m->name.data_.s.str, m->name.data_.s.length, false);
531                                     m->value.Accept(handler);
532                             }
533                             handler.EndObject(data_.o.size);
534                             break;

The problem seems to be in the last item of the data_.o.members array:

(gdb) print *data_.o.members
$14 = {name = {static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7ffff73afb75 "global_stats", length = 12, hashcode = 0}, n = {i = {i = -147129483, padding = "\377\177\000"}, u = {u = 4147837813,
          padding2 = "\377\177\000"}, i64 = 140737341225845, u64 = 140737341225845, d = 6.9533485386727003e-310}, o = {members = 0x7ffff73afb75, size = 12, capacity = 0}, a = {elements = 0x7ffff73afb75, size = 12, capacity = 0}}, flags_ = 1048581}, value = {
    static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0xc00007fff <Address 0xc00007fff out of bounds>, length = 16, hashcode = 3}, n = {i = {i = 32767, padding = "\f\000\000"}, u = {u = 32767,
          padding2 = "\f\000\000"}, i64 = 51539640319, u64 = 51539640319, d = 2.5463965680632285e-313}, o = {members = 0xc00007fff, size = 16, capacity = 3}, a = {elements = 0xc00007fff, size = 16, capacity = 3}}, flags_ = 4147837826}}
bioothod commented 8 years ago

It is not only the last member, it is the whole object. The eblob code adds quite a lot of members there, but here there is only 'global_stats' and this broken pointer.

agend commented 8 years ago

@bioothod Is it possible that the memory is freed after being passed to the Value constructor? https://github.com/reverbrain/eblob/blob/master/library/json_stat.cpp#L39 https://github.com/yandex/handystats/blob/master/include/handystats/rapidjson/document.h#L119

bioothod commented 8 years ago

We use the API calls that take an allocator, which end up in this rapidjson call: https://github.com/yandex/handystats/blob/master/include/handystats/rapidjson/document.h#L669

In this case the memory will not be freed until the object itself is destroyed. In any case, the object we are talking about, which is created in eblob_json_commit(), contains many values, not just two named global_stats and something else. Even if that memory were somehow freed, data_.o.members above should contain more than 2 entries.

agend commented 8 years ago

@bioothod The size of data_.o is 7. More info about it:

(gdb) list
525                     case kTrueType:         handler.Bool(true); break;
526
527                     case kObjectType:
528                             handler.StartObject();
529                             for (Member* m = data_.o.members; m != data_.o.members + data_.o.size; ++m) {
530                                     handler.String(m->name.data_.s.str, m->name.data_.s.length, false);
531                                     m->value.Accept(handler);
532                             }
533                             handler.EndObject(data_.o.size);
534                             break;
(gdb) data_.o.size
Undefined command: "data_".  Try "help".
(gdb) list
535
536                     case kArrayType:
537                             handler.StartArray();
538                             for (GenericValue* v = data_.a.elements; v != data_.a.elements + data_.a.size; ++v)
539                                     v->Accept(handler);
540                             handler.EndArray(data_.a.size);
541                             break;
542
543                     case kStringType:
544                             handler.String(data_.s.str, data_.s.length, false);
(gdb) print data_.o.size
$22 = 7
(gdb) print *data_.o.members@7
$23 = {{name = {static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7ffff73afb75 "global_stats", length = 12,
hashcode = 0}, n = {i = {i = -147129483, padding = "\377\177\000"}, u = {u = 4147837813,
            padding2 = "\377\177\000"}, i64 = 140737341225845, u64 = 140737341225845, d = 6.9533485386727003e-310}, o = {members = 0x7ffff73afb75, size = 12, capacity = 0}, a =
{elements = 0x7ffff73afb75, size = 12, capacity = 0}}, flags_ = 1048581}, value = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0xc00007fff <Address 0xc00007fff out of bounds>, length = 16, hashcode = 3}, n = {i = {i = 32767, padding = "\f\000\000"}, u = {u = 32767,
            padding2 = "\f\000\000"}, i64 = 51539640319, u64 = 51539640319, d = 2.5463965680632285e-313}, o = {members = 0xc00007fff, size = 16, capacity = 3}, a = {elements = 0xc00007fff, size = 16, capacity = 3}}, flags_ = 4147837826}}, {name = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7fff0000000d "", length = 1048581, hashcode = 1879056680}, n = {i = {i = 13, padding = "\377\177\000"}, u = {u = 13, padding2 = "\377\177\000"},
          i64 = 140733193388045, u64 = 140733193388045, d = 6.9531436082565501e-310}, o = {members = 0x7fff0000000d, size = 1048581, capacity = 1879056680}, a = {elements = 0x7fff0000000d, size = 1048581, capacity = 1879056680}}, flags_ = 32767}, value = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x300000010 <Address 0x300000010 out of bounds>, length = 4147837840, hashcode = 32767}, n = {i = {i = 16, padding = "\003\000\000"}, u = {u = 16,
            padding2 = "\003\000\000"}, i64 = 12884901904, u64 = 12884901904, d = 6.3659873808008673e-314}, o = {members = 0x300000010, size = 4147837840, capacity = 32767}, a = {elements = 0x300000010, size = 4147837840, capacity = 32767}}, flags_ = 10}}, {name = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x100005 <Address 0x100005 out of bounds>, length = 0, hashcode = 0}, n = {i = {i = 1048581, padding = "\000\000\000"}, u = {u = 1048581,
            padding2 = "\000\000\000"}, i64 = 1048581, u64 = 1048581, d = 5.1806784898186014e-318}, o = {members = 0x100005, size = 0, capacity = 0}, a = {elements = 0x100005, size = 0, capacity = 0}}, flags_ = 0}, value = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7ffff73afc40 "config", length = 6, hashcode = 0}, n = {i = {i = -147129280, padding = "\377\177\000"}, u = {u = 4147838016,
            padding2 = "\377\177\000"}, i64 = 140737341226048, u64 = 140737341226048, d = 6.9533485386827298e-310}, o = {members = 0x7ffff73afc40, size = 6, capacity = 0}, a = {elements = 0x7ffff73afc40, size = 6, capacity = 0}}, flags_ = 1048581}}, {name = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0xd00007fff <Address 0xd00007fff out of bounds>, length = 16, hashcode = 3}, n = {i = {i = 32767, padding = "\r\000\000"}, u = {u = 32767,
            padding2 = "\r\000\000"}, i64 = 55834607615, u64 = 55834607615, d = 2.7585961471597557e-313}, o = {members = 0xd00007fff, size = 16, capacity = 3}, a = {elements = 0xd00007fff, size = 16, capacity = 3}}, flags_ = 4147838023}, value = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x3 <Address 0x3 out of bounds>, length = 1048581, hashcode = 1879058612}, n = {i = {i = 3, padding = "\000\000\000"}, u = {u = 3,
            padding2 = "\000\000\000"}, i64 = 3, u64 = 3, d = 1.4821969375237396e-323}, o = {members = 0x3, size = 1048581, capacity = 1879058612}, a = {elements = 0x3, size = 1048581, capacity = 1879058612}}, flags_ = 32767}}, {name = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x300000010 <Address 0x300000010 out of bounds>, length = 1879056816, hashcode = 32767}, n = {i = {i = 16, padding = "\003\000\000"}, u = {u = 16,
            padding2 = "\003\000\000"}, i64 = 12884901904, u64 = 12884901904, d = 6.3659873808008673e-314}, o = {members = 0x300000010, size = 1879056816, capacity = 32767}, a = {elements = 0x300000010, size = 1879056816, capacity = 32767}}, flags_ = 1879056962},
    value = {static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7fff700041b0 "", length = 0, hashcode = 0}, n = {i = {i = 1879065008, padding = "\377\177\000"}, u = {u = 1879065008,
            padding2 = "\377\177\000"}, i64 = 140735072453040, u64 = 140735072453040, d = 6.9532364464025833e-310}, o = {members = 0x7fff700041b0, size = 0, capacity = 0}, a = {elements = 0x7fff700041b0, size = 0, capacity = 0}}, flags_ = 1879065008}}, {name = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7ffff73afc4b "dstat", length = 5, hashcode = 0}, n = {i = {i = -147129269, padding = "\377\177\000"}, u = {u = 4147838027,
            padding2 = "\377\177\000"}, i64 = 140737341226059, u64 = 140737341226059, d = 6.9533485386832733e-310}, o = {members = 0x7ffff73afc4b, size = 5, capacity = 0}, a = {elements = 0x7ffff73afc4b, size = 5, capacity = 0}}, flags_ = 1048581}, value = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7fff70002e40 "4.", length = 12, hashcode = 16}, n = {i = {i = 1879060032, padding = "\377\177\000"}, u = {u = 1879060032,
            padding2 = "\377\177\000"}, i64 = 140735072448064, u64 = 140735072448064, d = 6.9532364461567363e-310}, o = {members = 0x7fff70002e40, size = 12, capacity = 16}, a = {elements = 0x7fff70002e40, size = 12, capacity = 16}}, flags_ = 3}}, {name = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7fff700033c0 "timestamp", length = 9, hashcode = 32767}, n = {i = {i = 1879061440, padding = "\377\177\000"}, u = {u = 1879061440,
            padding2 = "\377\177\000"}, i64 = 140735072449472, u64 = 140735072449472, d = 6.9532364462263007e-310}, o = {members = 0x7fff700033c0, size = 9, capacity = 32767}, a = {elements = 0x7fff700033c0, size = 9, capacity = 32767}}, flags_ = 3145733}, value = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7fff70003140 "\365\371:\367\377\177", length = 2, hashcode = 16}, n = {i = {i = 1879060800, padding = "\377\177\000"}, u = {u = 1879060800,
            padding2 = "\377\177\000"}, i64 = 140735072448832, u64 = 140735072448832, d = 6.9532364461946805e-310}, o = {members = 0x7fff70003140, size = 2, capacity = 16}, a = {elements = 0x7fff70003140, size = 2, capacity = 16}}, flags_ = 3}}}
bioothod commented 8 years ago

Having 7 members looks much better; that should be right.

Some of them are correct, others are weird. It looks like memory corruption, either in the allocator or in the system. Is it reproducible?

agend commented 8 years ago

@bioothod Yes, it's reproducible every time. I also found these lines in the log:

...
[New Thread 0x7ffee9f4f700 (LWP 19846)]
2016-02-11 18:45:32.113287 0000000000000000/19846/19628 NOTICE: started io thread: #14, nonblocking: 1, backend: -1
...
[New Thread 0x7ffe8cab9700 (LWP 19994)]
2016-02-11 18:45:35.552400 0000000000000000/19994/19628 NOTICE: started io thread: #39, nonblocking: 0, backend: ...

bioothod commented 8 years ago

Can you start elliptics under valgrind to find out the reason for the memory corruption?

agend commented 8 years ago

no luck with valgrind:

==20102== Command: dnet_ioserv -c /etc/elliptics/elliptics.conf
==20102==
vex amd64->IR: unhandled instruction bytes: 0xF 0x1 0xF9 0x48 0xC1 0xE2 0x20 0x48
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==20102== valgrind: Unrecognised instruction at address 0x65d0663.
...
bioothod commented 8 years ago

Is this some kind of virtual machine?

abudnik commented 8 years ago

no luck with valgrind

Set the HANDY_CLOCK_SOURCE=CLOCK_MONOTONIC environment variable.
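For context, the instruction bytes that valgrind rejected above (0x0F 0x01 0xF9) encode `rdtscp`. Switching the clock source presumably makes handystats read time through `clock_gettime` instead; that is an assumption about handystats, not something confirmed in this thread. A minimal sketch of such a monotonic timestamp read on a POSIX system follows; `monotonic_ns` is a hypothetical helper, not the handystats implementation.

```cpp
#include <cstdint>
#include <time.h>

// Hypothetical helper (not handystats code): read CLOCK_MONOTONIC through
// clock_gettime() rather than issuing an rdtscp instruction directly.
// Returns the timestamp in nanoseconds, or 0 if the clock cannot be read.
inline std::uint64_t monotonic_ns() {
    timespec ts{};
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
        return 0;
    }
    return static_cast<std::uint64_t>(ts.tv_sec) * 1000000000ULL
         + static_cast<std::uint64_t>(ts.tv_nsec);
}
```

Two consecutive calls should return non-decreasing values, which is the property a statistics library needs from its clock. On older glibc versions linking may require `-lrt`.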

agend commented 8 years ago

HANDY_CLOCK_SOURCE=CLOCK_MONOTONIC valgrind --tool=memcheck --leak-check=yes --show-reachable=yes --num-callers=20 --track-fds=yes --log-file=elliptics_valgrind.log dnet_ioserv -c /etc/elliptics/elliptics.conf

The full elliptics_valgrind.log is here: https://gist.github.com/agend/023a9d6185d007997cf8

I have also built elliptics and eblob with these cmake options:
elliptics: %{cmake} -DHAVE_MODULE_BACKEND_SUPPORT=no -DWITH_COCAINE=no -DWITH_STATS=no .
eblob: +%{cmake} -DWITH_STATS=no .

bioothod commented 8 years ago

There is still a fair number of handystats calls in the traces, btw.

Please describe this setup in more detail: hardware (is this some kind of virtual machine?), architecture, number of backends, config, and how long it takes to crash after start. Can you also disable all backends and then enable them one at a time until the crash happens?

agend commented 8 years ago

It fails right after start. It's a virtual machine under the Xen hypervisor (paravirtualized), with a single backend in the config. If I leave no backends in the config, the server starts successfully.

Config:

{
        "logger": {
                "level": "debug",
                "frontends": [
                        {
                                "formatter": {
                                        "type": "string",
                                        "pattern": "%(timestamp)s %(request_id)s/%(lwp)s/%(pid)s %(severity)s: %(message)s %(...L)s"
                                },
                                "sink": {
                                        "type": "files",
                                        "path": "/dev/stdout",
                                        "autoflush": true,
                                        "rotation": {
                                                "move": 0
                                        }
                                }
                        }
                ]
        },
        "options": {
                "join": true,
                "flags": 20,
                "remote": [
                ],
                "address": [
                        "10.5.6.55:1025:2-0"
                ],
                "wait_timeout": 60,
                "check_timeout": 60,
                "io_thread_num": 2,
                "stall_count": 3,
                "nonblocking_io_thread_num": 2,
                "net_thread_num": 2,
                "daemon": false,
                "parallel": true,
                "auth_cookie": "testcluster",
                "bg_ionice_class": 3,
                "bg_ionice_prio": 0,
                "server_net_prio": 1,
                "client_net_prio": 6,
                "cache": {
                        "size": 68719476736
                },
                "indexes_shard_count": 2,
                "monitor": {
                        "port":20000
                }
        },
        "backends": [
                {
                        "backend_id": 1,
                        "type": "blob",
                        "group": 1,
                        "history": "/opt/storage/1/kdb",
                        "data": "/opt/storage/1/data",
                        "sync": "30",
                        "blob_flags": "0",
                        "blob_size": "200M",
                        "records_in_blob": "10000000",
                        "periodic_timeout": 15,
                        "read_only": false,
                        "datasort_dir": "/opt/elliptics/defrag/"
                }
        ]
}
shindo commented 8 years ago

Maybe this is related to the problem: https://github.com/reverbrain/eblob/blob/master/library/json_stat.cpp#L62

Here you use AddMember(const char*, Value, Allocator). This method assumes the first argument is a string literal, which is not the case.

You should use AddMember(const char*, Allocator, Value, Allocator) instead, which copies the string pointed to by the first argument.
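The difference between a non-copying and a copying name can be illustrated without rapidjson. The sketch below is a simplified model, not the rapidjson API: `RefMember` keeps only the caller's pointer (as a non-copying overload would), while `OwnedMember` copies the characters. All type and function names here are hypothetical.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Simplified model, NOT rapidjson types.
struct RefMember   { const char* name; };  // stores the caller's pointer only
struct OwnedMember { std::string name; };  // owns a copy of the characters

// Format member names into one reusable buffer, the way code that builds
// names at run time (rather than passing string literals) might do.
inline void build(std::vector<RefMember>& refs,
                  std::vector<OwnedMember>& owned,
                  char (&buf)[16]) {
    for (int i = 0; i < 3; ++i) {
        std::snprintf(buf, sizeof(buf), "stat_%d", i);
        refs.push_back(RefMember{buf});                  // every entry aliases buf
        owned.push_back(OwnedMember{std::string(buf)});  // every entry has its own copy
    }
}
```

After `build` runs, every `RefMember` reads as "stat_2" because all of them share the same buffer, while the `OwnedMember` entries keep "stat_0", "stat_1" and "stat_2". If the buffer were a temporary that had already been destroyed, the stored pointers would dangle, which is the kind of failure a copying overload avoids.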

shaitan commented 8 years ago

I've made https://github.com/reverbrain/eblob/pull/169 which provides this fix.

@agend, you can try to use it and check whether it fixes the issue.

agend commented 8 years ago

@shaitan No luck :(

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffca1fc700 (LWP 1791)]
WriteString (this=0x7fffca1f3f30, handler=...)
    at /usr/include/handystats/rapidjson/writer.h:198
198                             const Ch c = is.Peek();
Missing separate debuginfos, use: debuginfo-install elliptics-2.26.10.1-1.oid_mod.el6.x86_64
(gdb) bt
#0  WriteString (this=0x7fffca1f3f30, handler=...)
    at /usr/include/handystats/rapidjson/writer.h:198
#1  String (this=0x7fffca1f3f30, handler=...)
    at /usr/include/handystats/rapidjson/writer.h:54
#2  rapidjson::GenericValue<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> >::Accept<rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<char>, rapidjson::CrtAllocator>, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> > > (this=0x7fffca1f3f30, handler=...)
    at /usr/include/handystats/rapidjson/document.h:530
#3  0x00007ffff73abd42 in eblob_json_commit (b=0x7fffe4001360)
    at /builds/group-d/eblob-original-id-mod/rpmbuild/BUILD/eblob-0.23.12/library/json_stat.cpp:281
#4  0x00007ffff738d6ce in eblob_periodic (b=0x7fffe4001360)
    at /builds/group-d/eblob-original-id-mod/rpmbuild/BUILD/eblob-0.23.12/library/blob.c:2902
#5  0x00007ffff738e390 in eblob_periodic_thread (data=0x7fffe4001360)
    at /builds/group-d/eblob-original-id-mod/rpmbuild/BUILD/eblob-0.23.12/library/blob.c:2867
#6  0x00007ffff6870851 in start_thread () from /lib64/libpthread.so.0
#7  0x00007ffff53cd90d in clone () from /lib64/libc.so.6
(gdb) print *data_.o.members@7
$8 = {{name = {static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7ffff73b7c55 "global_stats", length = 12,
          hashcode = 0}, n = {i = {i = -147096491, padding = "\377\177\000"}, u = {u = 4147870805, padding2 = "\377\177\000"}, i64 = 140737341258837, u64 = 140737341258837,
          d = 6.9533485403027216e-310}, o = {members = 0x7ffff73b7c55, size = 12, capacity = 0}, a = {elements = 0x7ffff73b7c55, size = 12, capacity = 0}}, flags_ = 1048581},
    value = {static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {
          str = 0xc00007fff <Address 0xc00007fff out of bounds>, length = 16, hashcode = 3}, n = {i = {i = 32767, padding = "\f\000\000"}, u = {u = 32767,
            padding2 = "\f\000\000"}, i64 = 51539640319, u64 = 51539640319, d = 2.5463965680632285e-313}, o = {members = 0xc00007fff, size = 16, capacity = 3}, a = {
          elements = 0xc00007fff, size = 16, capacity = 3}}, flags_ = 4147870818}}, {name = {static kDefaultArrayCapacity = <optimized out>,
      static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7fff0000000d <Address 0x7fff0000000d out of bounds>, length = 1048581, hashcode = 1610618296},
        n = {i = {i = 13, padding = "\377\177\000"}, u = {u = 13, padding2 = "\377\177\000"}, i64 = 140733193388045, u64 = 140733193388045, d = 6.9531436082565501e-310}, o = {
          members = 0x7fff0000000d, size = 1048581, capacity = 1610618296}, a = {elements = 0x7fff0000000d, size = 1048581, capacity = 1610618296}}, flags_ = 32767}, value = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x300000010 <Address 0x300000010 out of bounds>,
          length = 4147870832, hashcode = 32767}, n = {i = {i = 16, padding = "\003\000\000"}, u = {u = 16, padding2 = "\003\000\000"}, i64 = 12884901904, u64 = 12884901904,
          d = 6.3659873808008673e-314}, o = {members = 0x300000010, size = 4147870832, capacity = 32767}, a = {elements = 0x300000010, size = 4147870832, capacity = 32767}},
      flags_ = 10}}, {name = {static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {
          str = 0x100005 <Address 0x100005 out of bounds>, length = 0, hashcode = 0}, n = {i = {i = 1048581, padding = "\000\000\000"}, u = {u = 1048581,
            padding2 = "\000\000\000"}, i64 = 1048581, u64 = 1048581, d = 5.1806784898186014e-318}, o = {members = 0x100005, size = 0, capacity = 0}, a = {elements = 0x100005,
          size = 0, capacity = 0}}, flags_ = 0}, value = {static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {
          str = 0x7ffff73b7d20 "config", length = 6, hashcode = 0}, n = {i = {i = -147096288, padding = "\377\177\000"}, u = {u = 4147871008, padding2 = "\377\177\000"},
          i64 = 140737341259040, u64 = 140737341259040, d = 6.9533485403127512e-310}, o = {members = 0x7ffff73b7d20, size = 6, capacity = 0}, a = {elements = 0x7ffff73b7d20,
          size = 6, capacity = 0}}, flags_ = 1048581}}, {name = {static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {
          str = 0xd00007fff <Address 0xd00007fff out of bounds>, length = 16, hashcode = 3}, n = {i = {i = 32767, padding = "\r\000\000"}, u = {u = 32767,
            padding2 = "\r\000\000"}, i64 = 55834607615, u64 = 55834607615, d = 2.7585961471597557e-313}, o = {members = 0xd00007fff, size = 16, capacity = 3}, a = {
          elements = 0xd00007fff, size = 16, capacity = 3}}, flags_ = 0}, value = {static kDefaultArrayCapacity = <optimized out>,
      static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x0, length = 0, hashcode = 0}, n = {i = {i = 0, padding = "\000\000\000"}, u = {u = 0,
            padding2 = "\000\000\000"}, i64 = 0, u64 = 0, d = 0}, o = {members = 0x0, size = 0, capacity = 0}, a = {elements = 0x0, size = 0, capacity = 0}}, flags_ = 0}}, {
    name = {static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7ffff73b7d27 "vfs", length = 4147871019,
          hashcode = 32767}, n = {i = {i = -147096281, padding = "\377\177\000"}, u = {u = 4147871015, padding2 = "\377\177\000"}, i64 = 140737341259047,
          u64 = 140737341259047, d = 6.953348540313097e-310}, o = {members = 0x7ffff73b7d27, size = 4147871019, capacity = 32767}, a = {elements = 0x7ffff73b7d27,
          size = 4147871019, capacity = 32767}}, flags_ = 5}, value = {static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>,
      data_ = {s = {str = 0x600022d000100005 <Address 0x600022d000100005 out of bounds>, length = 32767, hashcode = 12}, n = {i = {i = 1048581, padding = "\320\"\000`"}, u = {
            u = 1048581, padding2 = "\320\"\000`"}, i64 = 6917567304390672389, u64 = 6917567304390672389, d = 2.704352568720349e+154}, o = {members = 0x600022d000100005,
          size = 32767, capacity = 12}, a = {elements = 0x600022d000100005, size = 32767, capacity = 12}}, flags_ = 16}}, {name = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x0, length = 0, hashcode = 0}, n = {i = {i = 0,
            padding = "\000\000\000"}, u = {u = 0, padding2 = "\000\000\000"}, i64 = 0, u64 = 0, d = 0}, o = {members = 0x0, size = 0, capacity = 0}, a = {elements = 0x0,
          size = 0, capacity = 0}}, flags_ = 0}, value = {static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {
          str = 0x0, length = 0, hashcode = 0}, n = {i = {i = 0, padding = "\000\000\000"}, u = {u = 0, padding2 = "\000\000\000"}, i64 = 0, u64 = 0, d = 0}, o = {
          members = 0x0, size = 0, capacity = 0}, a = {elements = 0x0, size = 0, capacity = 0}}, flags_ = 0}}, {name = {static kDefaultArrayCapacity = <optimized out>,
      static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7fff60002850 "timestamp", length = 9, hashcode = 0}, n = {i = {i = 1610623056,
            padding = "\377\177\000"}, u = {u = 1610623056, padding2 = "\377\177\000"}, i64 = 140734804011088, u64 = 140734804011088, d = 6.9532231836079447e-310}, o = {
          members = 0x7fff60002850, size = 9, capacity = 0}, a = {elements = 0x7fff60002850, size = 9, capacity = 0}}, flags_ = 3145733}, value = {
      static kDefaultArrayCapacity = <optimized out>, static kDefaultObjectCapacity = <optimized out>, data_ = {s = {str = 0x7fff600025d0 "\325z;\367\377\177", length = 2,
          hashcode = 16}, n = {i = {i = 1610622416, padding = "\377\177\000"}, u = {u = 1610622416, padding2 = "\377\177\000"}, i64 = 140734804010448, u64 = 140734804010448,
          d = 6.9532231835763245e-310}, o = {members = 0x7fff600025d0, size = 2, capacity = 16}, a = {elements = 0x7fff600025d0, size = 2, capacity = 16}}, flags_ = 3}}}
shaitan commented 8 years ago

Could you give us backtraces of all threads?

agend commented 8 years ago

Here you go: https://gist.github.com/agend/a5ec86413aef2b09c371

shaitan commented 8 years ago

Do you remember what preceded the first crash? Was it an update of elliptics/eblob? If so, could you check that the previous version does not crash in this way?

agend commented 8 years ago

@shaitan It wasn't an update. We installed it on a fresh new server. We have also tested it on CentOS 7, and the issue did not reproduce there. Did you try it on CentOS 6?

agend commented 8 years ago

@bioothod Do you plan to support CentOS 6?

bioothod commented 8 years ago

I'm not sure that supporting such an old compiler is worth the effort, especially since we do not have systems to run all the tests on. If it crashes this strangely in an area that has not changed for years, I'm pretty sure there will be more bugs like that.

Please upgrade to a newer CentOS and compiler; I believe this is the simplest way forward.