plasma-umass / Mesh

A memory allocator that automatically reduces the memory footprint of C/C++ applications.
Apache License 2.0

Segfault when starting Firefox #25

Closed jc00ke closed 5 years ago

jc00ke commented 5 years ago

Using 667bb69 compiled on Manjaro with gcc 8.2.1 20181127. Any other details I can provide that would be helpful?

 ~/src > git clone --recurse-submodules https://github.com/plasma-umass/mesh                                                                   Tue 19 Feb 2019 10:01:51 AM PST
Cloning into 'mesh'...
remote: Enumerating objects: 93, done.
remote: Counting objects: 100% (93/93), done.
remote: Compressing objects: 100% (76/76), done.
remote: Total 5119 (delta 33), reused 39 (delta 14), pack-reused 5026
Receiving objects: 100% (5119/5119), 5.12 MiB | 9.78 MiB/s, done.
Resolving deltas: 100% (3483/3483), done.
Submodule 'Heap-Layers' (https://github.com/emeryberger/Heap-Layers) registered for path 'src/vendor/Heap-Layers'
Submodule 'src/vendor/googletest' (https://github.com/google/googletest.git) registered for path 'src/vendor/googletest'
Cloning into '/home/jesse/src/mesh/src/vendor/Heap-Layers'...
remote: Enumerating objects: 14, done.
remote: Counting objects: 100% (14/14), done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 1706 (delta 4), reused 7 (delta 1), pack-reused 1692
Receiving objects: 100% (1706/1706), 405.11 KiB | 5.33 MiB/s, done.
Resolving deltas: 100% (1117/1117), done.
Cloning into '/home/jesse/src/mesh/src/vendor/googletest'...
remote: Enumerating objects: 2, done.
remote: Counting objects: 100% (2/2), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 16274 (delta 0), reused 0 (delta 0), pack-reused 16272
Receiving objects: 100% (16274/16274), 5.65 MiB | 6.85 MiB/s, done.
Resolving deltas: 100% (11985/11985), done.
Submodule path 'src/vendor/Heap-Layers': checked out 'af8961599772d1f33ac33178f86fd92cd67e8cf0'
Submodule path 'src/vendor/googletest': checked out '529c2c6f4af29dadb8ee5cddf6a7919caa5ca5f6'

 ~/src > cd mesh/                                                                                                                     3259ms  Tue 19 Feb 2019 10:01:55 AM PST
 ~/s/mesh > ./configure                                                                                                                    Tue 19 Feb 2019 10:01:57 AM PST
 ~/s/mesh > make                                                                                                                           Tue 19 Feb 2019 10:01:58 AM PST
  CXX   build/src/unit/bitmap_test.o
  CXX   build/src/unit/mesh_test.o
  CXX   build/src/unit/alignment.o
  CXX   build/src/unit/binned_tracker_test.o
  CXX   build/src/unit/triple_mesh_test.o
  CXX   build/src/unit/rng_test.o
  CXX   build/src/unit/concurrent_mesh_test.o
  CXX   build/src/unit/size_class_test.o
  CXX   build/src/vendor/googletest/googletest/src/gtest-all.o
  CXX   build/src/vendor/googletest/googletest/src/gtest_main.o
  CXX   build/src/thread_local_heap.o
  CXX   build/src/global_heap.o
  CXX   build/src/runtime.o
  CXX   build/src/real.o
  CXX   build/src/meshable_arena.o
  CXX   build/src/d_assert.o
  CXX   build/src/measure_rss.o
  LD    unit.test

Running main() from src/vendor/googletest/googletest/src/gtest_main.cc
[==========] Running 22 tests from 8 test cases.
[----------] Global test environment set-up.
[----------] 3 tests from SizeClass
[ RUN      ] SizeClass.MinObjectSize
[       OK ] SizeClass.MinObjectSize (0 ms)
[ RUN      ] SizeClass.SmallClasses
[       OK ] SizeClass.SmallClasses (0 ms)
[ RUN      ] SizeClass.PowerOfTwo
[       OK ] SizeClass.PowerOfTwo (0 ms)
[----------] 3 tests from SizeClass (0 ms total)

[----------] 2 tests from ConcurrentMeshTest
[ RUN      ] ConcurrentMeshTest.TryMesh
[       OK ] ConcurrentMeshTest.TryMesh (1 ms)
[ RUN      ] ConcurrentMeshTest.TryMeshInverse
[       OK ] ConcurrentMeshTest.TryMeshInverse (1 ms)
[----------] 2 tests from ConcurrentMeshTest (3 ms total)

[----------] 1 test from RNG
[ RUN      ] RNG.MWCRange
[       OK ] RNG.MWCRange (0 ms)
[----------] 1 test from RNG (0 ms total)

[----------] 1 test from TripleMeshTest
[ RUN      ] TripleMeshTest.MeshAll
[       OK ] TripleMeshTest.MeshAll (29 ms)
[----------] 1 test from TripleMeshTest (29 ms total)

[----------] 1 test from BinnedTracker
[ RUN      ] BinnedTracker.Tests
[       OK ] BinnedTracker.Tests (0 ms)
[----------] 1 test from BinnedTracker (0 ms total)
[----------] 1 test from Alignment
[ RUN      ] Alignment.NaturalAlignment
[       OK ] Alignment.NaturalAlignment (442 ms)
[----------] 1 test from Alignment (442 ms total)

[----------] 2 tests from MeshTest
[ RUN      ] MeshTest.TryMesh
[       OK ] MeshTest.TryMesh (1 ms)
[ RUN      ] MeshTest.TryMeshInverse
[       OK ] MeshTest.TryMeshInverse (1 ms)
[----------] 2 tests from MeshTest (2 ms total)

[----------] 11 tests from BitmapTest
[ RUN      ] BitmapTest.RepresentationSize
[       OK ] BitmapTest.RepresentationSize (0 ms)
[ RUN      ] BitmapTest.LowestSetBitAt
[       OK ] BitmapTest.LowestSetBitAt (0 ms)
[ RUN      ] BitmapTest.HighestSetBitAt
[       OK ] BitmapTest.HighestSetBitAt (0 ms)
[ RUN      ] BitmapTest.SetAndExchangeAll
[       OK ] BitmapTest.SetAndExchangeAll (0 ms)
[ RUN      ] BitmapTest.SetAll
[       OK ] BitmapTest.SetAll (0 ms)
[ RUN      ] BitmapTest.SetGet
[       OK ] BitmapTest.SetGet (16 ms)
[ RUN      ] BitmapTest.SetGetRelaxed
[       OK ] BitmapTest.SetGetRelaxed (222 ms)
[ RUN      ] BitmapTest.Builtins
[       OK ] BitmapTest.Builtins (0 ms)
[ RUN      ] BitmapTest.Iter
[       OK ] BitmapTest.Iter (0 ms)
[ RUN      ] BitmapTest.Iter2
[       OK ] BitmapTest.Iter2 (0 ms)
[ RUN      ] BitmapTest.SetHalf
[       OK ] BitmapTest.SetHalf (1 ms)
[----------] 11 tests from BitmapTest (239 ms total)

[----------] Global test environment tear-down
[==========] 22 tests from 8 test cases ran. (715 ms total)
[  PASSED  ] 22 tests.
  CXX   build/src/libmesh.o
  LD    libmesh.so
  CXX   build/src/fragmenter.o
  LD    fragmenter
 ~/s/mesh > sudo make install                                                                                                      53.5s  Tue 19 Feb 2019 10:02:54 AM PST
[sudo] password for jesse:
 ~/s/mesh > env LD_PRELOAD=libmesh.so git status                                                                                  2684ms > Tue 19 Feb 2019 10:05:28 AM PST
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

 ~ > env LD_PRELOAD=libmesh.so firefox
segfault (1/0x28): in arena? 0
ExceptionHandler::GenerateDump cloned child 24873
ExceptionHandler::SendContinueSignalToChild sent continue signal to child
ExceptionHandler::WaitForContinueSignal waiting for continue signal...
2019-02-19 10:07:36: minidump.cc:1926: ERROR: MinidumpModule has a module problem, 0x
2019-02-19 10:07:36: minidump.cc:2740: ERROR: MinidumpModuleList could not read modul
2019-02-19 10:07:36: minidump.cc:5895: ERROR: GetStream could not read stream type 4
fish: “env LD_PRELOAD=libmesh.so firef…” terminated by signal SIGABRT (Abort)

I also tried running Chromium and got a segfault as well:

∴ LD_PRELOAD=libmesh.so chromium
../../third_party/tcmalloc/gperftools-2.0/chromium/src/tcmalloc.cc:289] Attempt to free invalid pointer 0x7f07e8ef6dc0
segfault (1/0x39): in arena? 0
^C^CAborted (core dumped)

bobby-stripe commented 5 years ago

Thanks for reporting. I'm pretty sure what is happening here is that both Chrome and Firefox bundle their own allocators by default, so overriding malloc/free causes a pointer allocated by Mesh to be freed by, e.g., tcmalloc (or the other way around).

I wonder if we can detect this and bail early, maybe if the RTLD_NEXT malloc symbol isn't from libc?
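
For illustration only, here is a rough sketch of what such a check might look like. It is not code from Mesh. The helper names `path_looks_like_libc` and `next_malloc_is_libc` are made up for this example, and matching on the file name `libc.` / `libc-` is an assumption that fits glibc naming but not every C library. It asks the dynamic linker which shared object provides the next `malloc` after the caller, using `dlsym(RTLD_NEXT, ...)` and `dladdr`.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // RTLD_NEXT and dladdr are GNU extensions on glibc
#endif
#include <dlfcn.h>
#include <cstring>

// Hypothetical helper (not part of Mesh): true if the file name of a shared
// object looks like the C library, e.g. "libc.so.6" or "libc-2.28.so".
// Assumption: glibc-style naming; other libcs may use different names.
static bool path_looks_like_libc(const char *path) {
  if (path == nullptr) return false;
  const char *slash = std::strrchr(path, '/');
  const char *base = slash ? slash + 1 : path;
  return std::strncmp(base, "libc.", 5) == 0 ||
         std::strncmp(base, "libc-", 5) == 0;
}

// Hypothetical helper: looks up the next `malloc` in symbol search order
// after the calling object and reports where it comes from.
// Returns 1 if it appears to come from libc, 0 if it comes from some other
// object, and -1 if the owner could not be determined.
static int next_malloc_is_libc() {
  void *sym = dlsym(RTLD_NEXT, "malloc");
  if (sym == nullptr) return -1;

  Dl_info info;
  if (dladdr(sym, &info) == 0 || info.dli_fname == nullptr) return -1;

  return path_looks_like_libc(info.dli_fname) ? 1 : 0;
}
```

A preloaded allocator could call something like `next_malloc_is_libc()` during initialization and warn or fall back when the result is 0. One caveat: `RTLD_NEXT` only searches objects that come after the caller in load order, so an allocator linked statically into the main executable may not be visible this way. On older glibc versions the program also needs to be linked with `-ldl`.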

jc00ke commented 5 years ago

Interesting. Maybe I missed it in the paper, but did you compile Firefox with mesh as the allocator? I was under the impression that the LD_PRELOAD env var would swap out allocators.

What I'm really hoping for is LD_PRELOAD=libmesh.so slack :wink:

bobby-stripe commented 5 years ago

@jc00ke we compiled Firefox with the --no-jemalloc option; otherwise Firefox (for performance reasons) uses jemalloc in such a way that it cannot be overridden via LD_PRELOAD. I suspect Slack will have a similar problem, since it is built on Chromium (via Electron), which defaults to something similar.

jc00ke commented 5 years ago

@bobby-stripe Gotcha, thanks for the clarification. I guess it makes sense to close this for now.