gcode-mirror / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License

Tcmalloc crashes when process adds an mmap block close to the top of the heap #688

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I consistently see this seemingly random crash. My app calls mmap with a
randomized pointer value. I narrowed down the cause of the crash to the memory
range 0x01000000-0x05000000. When mmap is called while the process is in the
middle of allocating a few megabytes from this address range, a crash follows
soon after on some other allocation of ~32 KB. An mmap at the specific address
0x43f0000 makes the crash repeatable. The crash occurs in src/thread_cache.h on
an invalid pointer list_ (it doesn't point to accessible memory).

TCMALLOC_SKIP_SBRK=1 eliminates the crash.

It is of course possible that my app itself causes some memory corruption. But
there are no other indications of anything strange going on: the application
runs long term, memory usage goes up to 10GB and down to 500MB, and it never
crashes.

Original issue reported on code.google.com by yuriv...@gmail.com on 11 May 2015 at 10:26

GoogleCodeExporter commented 9 years ago
In order to make any further progress on this we'll need at least some evidence.

Ideally, a small program that reproduces the crash.

Right now, unfortunately, there is not enough information for me to help.

Original comment by alkondratenko on 17 May 2015 at 6:57

GoogleCodeExporter commented 9 years ago
I know, I couldn't reproduce it myself with a small program. But moving away
from the heap was the factor that eliminated the crash for sure.

I think the easiest way to reproduce it would be, for testing purposes, to add
such an mmap to tcmalloc itself, leave it in place, and then run various
programs with it.

Original comment by yuriv...@gmail.com on 21 May 2015 at 7:57

GoogleCodeExporter commented 9 years ago
Yes, I could reproduce it with the qbittorrent process and a patched tcmalloc;
see the attached patch.
The exact mmap address that triggers it may vary depending on the sizes of the
executable and libs on your system.

$ XXXMMAPXXX=$((0x1000000)) LD_PRELOAD=/usr/local/lib/libtcmalloc_minimal.so /usr/local/bin/qbittorrent
got XXXMMAPXXX->0x1000000
Segmentation fault

It still crashes with 0x1200000 and 0x1300000, and doesn't with 0x1400000 and 
higher.

google-perftools-2.4 on FreeBSD 10.1 amd64

Original comment by yuriv...@gmail.com on 21 May 2015 at 11:27

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks for the update.

But please note that I don't have a FreeBSD instance readily available to me.
Can I at least get backtraces?

Original comment by alkondratenko on 21 May 2015 at 11:29

GoogleCodeExporter commented 9 years ago
The backtrace doesn't seem relevant:
(gdb) bt
#0  0x0000000803e22bfe in ?? () from /lib/libthr.so.3
#1  0x0000000804154d7c in ?? () from /lib/libc.so.7
#2  0x0000000804154ed4 in __cxa_atexit () from /lib/libc.so.7
#3  0x00000008031bd9e8 in ?? () from /usr/local/lib/qt4/libQtCore.so.4
#4  0x00000008031c03e0 in QTextCodec::codecForLocale() () from /usr/local/lib/qt4/libQtCore.so.4
#5  0x00000008030dc7b2 in QString::toLocal8Bit() const () from /usr/local/lib/qt4/libQtCore.so.4
#6  0x000000080311a87e in ?? () from /usr/local/lib/qt4/libQtCore.so.4
#7  0x000000080311b22f in QFile::encodeName(QString const&) () from /usr/local/lib/qt4/libQtCore.so.4
#8  0x00000008031692d5 in ?? () from /usr/local/lib/qt4/libQtCore.so.4
#9  0x0000000803128fed in QProcess::start(QString const&, QStringList const&, QFlags<QIODevice::OpenModeFlag>) () from /usr/local/lib/qt4/libQtCore.so.4
#10 0x00000000004d3b2d in misc::pythonVersion() ()

It is a result of some memory corruption that occurred earlier. 

Original comment by yuriv...@gmail.com on 21 May 2015 at 11:52

GoogleCodeExporter commented 9 years ago
You can easily install FreeBSD in a VM. But you might be able to reproduce it
on Linux too.

Original comment by yuriv...@gmail.com on 21 May 2015 at 11:55

GoogleCodeExporter commented 9 years ago
Sure I can. But keep in mind that the time I have for this project is limited,
and currently there are more important sub-projects that I'm working on. So I
cannot promise you a fast turn-around on this ticket.

Original comment by alkondratenko on 23 May 2015 at 6:34

GoogleCodeExporter commented 9 years ago
No problem! I am not dependent on this now that I have the workaround.

Original comment by yuriv...@gmail.com on 24 May 2015 at 10:19

GoogleCodeExporter commented 9 years ago
Somebody will need to take a deeper look at this. For now I'm going back to
other subprojects.

So I've checked, and sbrk on both FreeBSD and GNU/Linux (and I've even tried
GNU/kFreeBSD) actually fails with ENOMEM when the "break" tries to cross an
mmap-ed area.
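
A minimal sketch of that kind of check, written for illustration only (it is
not the test program linked in this comment, and the helper name
sbrk_stops_at_mapping is made up). It places a one-page mapping slightly above
the current break using a plain address hint, so nothing is overwritten, and
then asks sbrk to grow past it:

```c
#define _DEFAULT_SOURCE  /* expose sbrk() and MAP_ANONYMOUS on glibc */
#include <errno.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread).
 *
 * Returns  1 if sbrk() refused to cross the mapping and set ENOMEM,
 *          0 if sbrk() succeeded (the break is restored afterwards),
 *         -1 if the test mapping could not be placed where requested.
 */
int sbrk_stops_at_mapping(void)
{
    long page = sysconf(_SC_PAGESIZE);
    uintptr_t brk_now = (uintptr_t)sbrk(0);
    uintptr_t aligned = (brk_now + (uintptr_t)page - 1) & ~((uintptr_t)page - 1);
    void *want = (void *)(aligned + 16 * (uintptr_t)page);

    /* No MAP_FIXED: the address is only a hint, so nothing gets replaced. */
    void *got = mmap(want, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED)
        return -1;
    if (got != want) {              /* kernel chose another address */
        munmap(got, (size_t)page);
        return -1;
    }

    int result;
    errno = 0;
    void *old = sbrk(64 * page);    /* would have to run through the mapping */
    if (old == (void *)-1) {
        result = (errno == ENOMEM) ? 1 : -1;
    } else {
        sbrk(-64 * page);           /* undo the growth */
        result = 0;
    }
    munmap(got, (size_t)page);
    return result;
}
```

A return value of 1 matches the behaviour described above; the 16-page gap and
64-page growth are arbitrary choices for the sketch.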

I'm inclined to think that maybe your program is requesting a MAP_FIXED area,
is not checking for errors, and ends up overwriting the heap. Is that possible?

Here is my test program btw: http://paste.debian.net/288232/

Original comment by alkondratenko on 1 Aug 2015 at 7:29

GoogleCodeExporter commented 9 years ago
Yes, I used MAP_FIXED, but I certainly checked the error code returned by mmap.

Also, the patch I attached earlier simulates the failure in an unrelated app,
qbittorrent, by adding a similar mmapped area.

Original comment by yuriv...@gmail.com on 2 Aug 2015 at 6:48

GoogleCodeExporter commented 9 years ago
I keep forgetting about the weirdness of MAP_FIXED.

Quoting from the manpage:

MAP_FIXED
              Don't interpret addr as a hint: place the mapping at exactly
              that address.  addr must be a multiple of the page size.  If
              the memory region specified by addr and len overlaps pages of
              any existing mapping(s), then the overlapped part of the
              existing mapping(s) will be discarded.  If the specified
              address cannot be used, mmap() will fail.  Because requiring a
              fixed address for a mapping is less portable, the use of this
              option is discouraged.

The really important part is "the overlapped part of the existing mapping(s)
will be discarded".

So MAP_FIXED will corrupt the sbrk-managed part of the heap without returning
an error.
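
The manpage wording quoted above can be checked with a small sketch. This is an
illustration, not code from the thread: the function name is hypothetical, and
an ordinary anonymous mapping stands in for heap memory so that nothing real
gets damaged.

```c
#define _DEFAULT_SOURCE  /* expose MAP_ANONYMOUS on glibc */
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo (not from the thread).
 *
 * Returns  1 if a MAP_FIXED request over an existing mapping succeeded
 *            and the old contents were discarded,
 *          0 if the old contents survived,
 *         -1 if one of the mmap() calls failed.
 */
int map_fixed_discards_existing(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);

    /* Stand-in for memory that an allocator has already handed out. */
    unsigned char *victim = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (victim == MAP_FAILED)
        return -1;
    memset(victim, 0xAB, len);

    /* Same address with MAP_FIXED: the kernel replaces the old pages. */
    void *again = mmap(victim, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (again == MAP_FAILED) {
        munmap(victim, len);
        return -1;
    }

    /* mmap() reported success, yet the 0xAB bytes are gone:
     * fresh anonymous pages read back as zero. */
    int discarded = (again == (void *)victim)
                    && victim[0] == 0 && victim[len - 1] == 0;
    munmap(again, len);
    return discarded;
}
```

The point of the sketch is that checking mmap's return value is not enough:
the second call succeeds, so the caller sees no error while the earlier data is
silently lost.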

So there is apparently nothing to fix in tcmalloc itself.

Original comment by alkondratenko on 2 Aug 2015 at 6:54

GoogleCodeExporter commented 9 years ago
I see. So this would explain the problem. I didn't know about this aspect of 
MAP_FIXED before.

For the record, on BSD there is an additional option MAP_EXCL:

                        In contrast, if MAP_EXCL is specified, the request
                        will fail if a mapping already exists within the
                        range.

So MAP_FIXED is dangerous in this way, but MAP_FIXED|MAP_EXCL isn't.
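
A hedged sketch of a wrapper built on that idea. The function name
map_exactly_or_fail is made up for illustration. MAP_FIXED_NOREPLACE is not
mentioned in this thread; it is the comparable Linux flag (kernel 4.17 and
later) and is used here only when MAP_EXCL is unavailable. Older Linux kernels
treat the address as a hint when they don't know the flag, so the returned
address is checked as well.

```c
#define _DEFAULT_SOURCE  /* expose MAP_ANONYMOUS and newer flags on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Pick whichever "do not replace" flag the platform offers. */
#if defined(MAP_EXCL)
#  define NOREPLACE_FLAGS (MAP_FIXED | MAP_EXCL)   /* FreeBSD */
#elif defined(MAP_FIXED_NOREPLACE)
#  define NOREPLACE_FLAGS MAP_FIXED_NOREPLACE      /* Linux 4.17+ */
#else
#  define NOREPLACE_FLAGS 0                        /* plain hint only */
#endif

/*
 * Hypothetical helper (not from the thread): map len bytes of anonymous
 * memory at exactly addr, or return NULL without touching whatever is
 * already mapped there.
 */
void *map_exactly_or_fail(void *addr, size_t len)
{
    void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | NOREPLACE_FLAGS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (p != addr) {        /* flag unknown to the kernel, or hint not honored */
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

With a wrapper like this, a request that lands on an existing mapping fails
visibly instead of discarding the pages underneath it.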

Close this bug then.

Thank you for finding an explanation!

Original comment by yuriv...@gmail.com on 3 Aug 2015 at 9:22