wangxiaowei0303 / rapidjson

Automatically exported from code.google.com/p/rapidjson
MIT License
0 stars 0 forks source link

Segmentation fault - document.h line 618 #11

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
if (name[member->name.data_.s.length] == '\0' && 
memcmp(member->name.data_.s.str, name, member->name.data_.s.length * 
sizeof(Ch)) == 0)
                return member;
}

This code in document.h line 618 generate segmentation fault on the ARM
architecture.

Problem in the member->name.data_.s.str but I don't know why.
CString at member->name.data_.s.str exist.

Original issue reported on code.google.com by archj...@gmail.com on 31 Jan 2012 at 2:09

GoogleCodeExporter commented 9 years ago
any attempt access to the member->name.data generate segmentation fault

Original comment by archj...@gmail.com on 31 Jan 2012 at 3:02

GoogleCodeExporter commented 9 years ago
From e-mail:

const char* json = "{\"message_id\": 3,\"uid\": 55367}"

Parsing code :
// (IwAssert(PARSER, ...) is assert macro)
Document document;

IwAssert(PARSER, !document.Parse<0>(json).HasParseError());
IwAssert(PARSER, document.IsObject());

/* Get Data */
IwAssert(PARSER, (document["message_id"].IsUint())); // error here

I check in the debugger:
name  - contain "message_id" with zero at end
member->name.data_.s.str - contain "message_id" with zero at end
member->name.data_.s.length - equal to 10
sizeof(Ch) - equal to 1

but any attempt access to the member->name.data generate segmentation fault.

Original comment by milo...@gmail.com on 31 Jan 2012 at 3:16

Attachments:

GoogleCodeExporter commented 9 years ago
in document.h :618 

name[member->name.data_.s.length] == '\0' <-- might access out of boundary when 
the "name[]" is shorter than any member

reverse the order of memcmp and the zero tail check might be better
From:
name[member->name.data_.s.length] == '\0' && memcmp(member->name.data_.s.str, 
name, member->name.data_.s.length * sizeof(Ch)) == 0
To:
memcmp(member->name.data_.s.str, name, member->name.data_.s.length * 
sizeof(Ch)) == 0
&& name[member->name.data_.s.length] == '\0'

Original comment by jasoncha...@gmail.com on 31 Jan 2012 at 6:18

GoogleCodeExporter commented 9 years ago
The cause of the problem in that platform is unaligned memory access.

It is found that, it is due to MemoryPoolAllocator::Malloc()/Realloc() did not 
ensure 4-byte alignment in the return value.

Original comment by milo...@gmail.com on 31 Jan 2012 at 6:20

GoogleCodeExporter commented 9 years ago
@jasonchan35

It seems reversing the order will still possible to access out of boundary.

I think that a safer approach should first compute strlen(name) and compares it 
with the member's, and then do memcmp().

But thank you for pointing out that.

Original comment by milo...@gmail.com on 31 Jan 2012 at 6:26

GoogleCodeExporter commented 9 years ago
Let me start by saying that this library is absolutely beautifully written!  I 
really want to use this library.

A similar problem occurs in document.h with ARM architecture here:

    //! Assignment without calling destructor
    void RawAssign(GenericValue& rhs) {
        memcpy(this, &rhs, sizeof(GenericValue)); // <<-- here
        rhs.flags_ = kNullFlag;
    }

I am running tutorial.cpp on ARM v7 little Endian.  Using a debugger, I find 
the following:

this and rhs are instances of GenericDocument, which are derived from 
GenericValue.  sizeof(GenericValue) is 24; the data portion of GenericValue all 
fits into 20, but there is 4 bytes of padding to align on 8 byte boundaries, 
apparently.  So, sizeof operator gives the size when an instance of 
GenericValue is created.

When the actual instance is a derived class, the derived portion does not start 
at 24, but at 20, which is the end of the data portion of GenericValue.  That 
is, the 4 bytes padding is not used in this case.  The result is that memcpy 
clobbers the pointer "allocator_" at the beginning of GenericDocument's data.

Also, there is comment like this in document.h for the struct Data:

    // 12 bytes in 32-bit mode, 16 bytes in 64-bit mode

This is apparently not true for ARM.  It is 32-bit, but the sizeof in memory is 
padded to 16 bytes.  Also, even though the struct is a value type within an 
instance of GenericValue, the struct is still padded.  This is, I guess, 
controlled by specification, i.e., sizeof gives the size within the instance.  
This is a different situation from the inheritance, where the padding is 
dropped for GenericValue when appending the derived class's data portion.  C++ 
object layout is not controlled by specification.

In general, since the layout of objects is not controlled by specification, 
would it not be better to avoid low level memory operations on C++ instances?

Original comment by jabhodg...@gmail.com on 8 Feb 2012 at 10:05

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r53.

Original comment by milo...@gmail.com on 19 Feb 2012 at 2:30

GoogleCodeExporter commented 9 years ago
Hi Milo,

Again, a great library.  I really like it.

Regarding issue 11, I am still having problems.  It still segfaults, owing
apparently to assumptions about alignment of derived class GenericDocument
with respect to its base class, GenericValue.

Stack trace:

rapidjson (1) [BlackBerry Tablet OS Postmortem Debugging]
    QNX GDB Debugger (12-02-21 10:25 AM) (Suspended)
        Thread [1] (Suspended)
            5
rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>::Realloc()
/home/johodgson/rapidjson/include/rapidjson/allocators.h:171 0x0010504c
            4 rapidjson::GenericValue<rapidjson::UTF8<char>,
rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> >::Reserve()
/home/johodgson/rapidjson/include/rapidjson/document.h:377 0x00104e58
            3 rapidjson::GenericValue<rapidjson::UTF8<char>,
rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> >::PushBack()
/home/johodgson/rapidjson/include/rapidjson/document.h:393 0x00103a20
            2 rapidjson::GenericValue<rapidjson::UTF8<char>,
rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> >::PushBack<int>()
/home/johodgson/rapidjson/include/rapidjson/document.h:401 0x00102a64
            1 main()
/home/johodgson/rapidjson/example/tutorial/tutorial.cpp:88 0x001016cc
    /opt/bbndk-2.0.0/host/linux/x86/usr/bin/ntoarm-gdb (12-02-21 10:25
AM)
    <terminated, exit value:
0>/home/johodgson/rapidjson/Device-Debug/rapidjson (12-02-21 10:25 AM)

Variables:

this    0x00000000
originalPtr    0x0019042c
originalSize    96
newSize    192
__func__    0x0010b7a4
newBuffer    0x001905c4

Allocator (this) is NULL, because it gets clobbered with a memcpy using
sizeof(GenericValue) (=24 bytes).  I checked it out using View Memory in my
QNX Momentics IDE.  Layout of derived class GenericDocument object is
immediately following GenericValue (starting at byte 20), without the extra
four bytes of padding at the end of GenericValue.  The compiler puts in the
extra padding (I don't know why), which is what brings it out to the sizeof
24 bytes, but the compiler does not use this padding when laying out the
derived class.

My compiler is QCC:

qcc -o example/tutorial/tutorial.o ../example/tutorial/tutorial.cpp
-V4.4.2,gcc_ntoarmv7le_cpp -w1
-I/opt/bbndk-2.0.0/target/qnx6/../target-override/usr/include
-I/home/johodgson/rapidjson/include
-I/home/johodgson/rapidjson/thirdparty/gtest/include -D_FORTIFY_SOURCE=2 -c
-g -fstack-protector-all
qcc -o rapidjson example/tutorial/tutorial.o -V4.4.2,gcc_ntoarmv7le_cpp -w1
-lang-c++ -g -Wl,-z,relro -Wl,-z,now
-L/opt/bbndk-2.0.0/target/qnx6/../target-override/armle-v7/lib
-L/opt/bbndk-2.0.0/target/qnx6/../target-override/armle-v7/usr/lib

This change at the bottom of document.h fixes it for me, because it simply
moves GenericDocument data farther down, out of the range of clobbering.

     static const size_t kDefaultStackCapacity = 1024;
+    char padding_[4];
     internal::Stack<Allocator> stack_;
     const char* parseError_;
     size_t errorOffset_;

I think that this is a safe change for 32-bit architectures, but may not
work on a 64-bit architecture (may need to add more padding, e.g., 8 bytes).

I'm working on an ARM v7 on RIM PlayBook, which is 32-bit.  The new ARM v8
will be 64-bit, so stuff compiled as 64-bit may not work.

Your library is a really nice template approach.  The inheritance
GenericValue <- GenericDocument may be out-of-character.  Maybe there is a
way to move out the document-building code and avoid inheritance.  That may
be a better way of fixing the issue.

Please don't hesitate to contact me if I may be of any assistance, e.g., in
testing.

Kind regards, John.

Original comment by jabhodg...@gmail.com on 21 Feb 2012 at 4:16

GoogleCodeExporter commented 9 years ago

Original comment by milo...@gmail.com on 22 Feb 2012 at 8:08

GoogleCodeExporter commented 9 years ago
Issue 43 has been merged into this issue.

Original comment by milo...@gmail.com on 14 Nov 2012 at 3:49

GoogleCodeExporter commented 9 years ago
I also try to use rapidjson on an arm platform.

The latest version 0.11 (rev98) still has the same alignment problem on ARM. 
But John's workaround works. But I think Jonh's workaround is just a quick fix. 
I'd be happy to test any official patches.

Original comment by damjan.l...@gmail.com on 11 Jan 2013 at 1:38

GoogleCodeExporter commented 9 years ago
Confirm the issue still exists with latest version and John's workaround works 
as well. Tested on HTC Droid DNA.

Original comment by freesam...@gmail.com on 12 Mar 2013 at 3:38

GoogleCodeExporter commented 9 years ago
I have pushed a fix for this issue to my GitHub fork at
https://github.com/pah/rapidjson/commit/7475a969

Original comment by philipp....@gmail.com on 1 Feb 2014 at 6:46

GoogleCodeExporter commented 9 years ago
https://github.com/miloyip/rapidjson/issues/8

Original comment by milo...@gmail.com on 20 Jun 2014 at 8:25