Closed springmeyer closed 7 years ago
further idea: write a function that uses v8 objects and does not have a handlescope and leaks therefore. Ensure that the leak checker run on travis catches the leak.
This ticket is blue sky and most valuable for thinking about what is possible. However @mapsam over in https://github.com/mapbox/hpp-skel/pull/17#issuecomment-319199473 just did a quick test in hpp-skel to add a bug in a throwaway branch to confirm manually that the sanitizers are working. We should do the same for node-cpp-skel here, just to quickly confirm they are working. @mapsam want to pair with @GretaCB on this in a free moment?
@springmeyer I created a throwaway branch and removed the Handlescopes to trigger undefined behaviour within the threadpool. Looks like Travis successfully failed with the undefined sanitizer π
Looks like Travis successfully failed with the undefined sanitizer π
π Per chat, this proves that things are working. Which is great. So, going forward we'll be confident that if we make a coding mistake in node-cpp-skel (or a project based on it) we should have help from the address and undefined sanitizer.
Note: The error that was thrown in your branch actually is one that was suppressed in the master branch. It is coming from the vptr
part of the undefined
behavior sanitizer and arose when you added -faddress=undefined
since that overrode the -fno-sanitize=vptr,function
at https://github.com/mapbox/node-cpp-skel/blob/72092b408dae95f12d067395e42176eb601ed8fb/scripts/setup.sh#L94. So, this still proves things are working, but did not actually detect a problem due to the code changes (only your .travis.yml change: https://github.com/mapbox/node-cpp-skel/compare/sanitize#diff-354f30a63fb0907d4ad57269548329e3R79).
@GretaCB - noticed your last commit definitely triggered the use-after-free error, as we hoped! π
The sanitizer output is verbose, but the key thing to note is the type of error heap-use-after-free
and the line it was encountered: hello_async.cpp:130:17
. That means line 130
and 17
characters in, which points exactly to the bug x[5];
at https://github.com/mapbox/node-cpp-skel/compare/663df40bb5ee...4df31d7a9ec5#diff-032fe0b90870cc3a55a6326685d9adbcR130
=================================================================
==4355==ERROR: AddressSanitizer: heap-use-after-free on address 0x60700000a825 at pc 0x7fb32b9d0926 bp 0x7fb329d38b50 sp 0x7fb329d38b48
READ of size 1 at 0x60700000a825 thread T8
#0 0x7fb32b9d0925 in object_async::do_expensive_work(bool, std::string const&) /home/travis/build/mapbox/node-cpp-skel/build/../src/object_async/hello_async.cpp:130:17
at https://travis-ci.org/mapbox/node-cpp-skel/jobs/262424521#L664
Gave a couple more bugs a try in the sanitizer:
double free: attempting to free (deallocate) memory that's already been free'd. This threw errors in the tests both with and without the sanitizer, but the sanitizer caught it as...
AddressSanitizer: attempting free on address which was not malloc()-ed: 0x60700000a821 in thread T8
memory leak: not explicitly deallocating memory, and allowing it to "leak" or remain "reserved" or unable to be used by the program. This was caught by the sanitizer...
LeakSanitizer: detected memory leaks
Per chat - this ^^ is great news. I now propose pausing on this quest. There are ideas in the description of even more types of problems we could try to trigger (to see if the sanitizers catch the errors). But for the sake of time, probably best to use node-cpp-skel as a place to build-back checks when we hit real world problems. And not put more effort now into trying to simulate errors. I'll therefore close this and re-open new issues if/when we:
Context
Currently the scope of skel is best practices through working examples.
However in https://github.com/mapbox/node-cpp-skel/pull/29 we are experimenting with writing sample code that exhibits "common performance problems" like thread contention along with code that demonstrates idealized cpu usage for async code. The motivation is that:
thread contention
is quietly sinister: you can't see it, there is no obvious crash or error. Just a slow program.So, we plan to document the tell-tale signs of things like
thread contention
when seen via a profiler.Demonstrating diagnostics to catch hard to see bugs
Similar to
thread contention
that may be difficult to diagnose without specialized methods like function level profiling, there is a class of dangerous bugs that can cause silent memory corruption and are not often detectable without advanced methods.Therefore, we could consider adding example code (perhaps first in an advanced branch) that contains intentional dangerous bugs that corrupt memory. Then write documentation for how to detect them.
Examples/ideas:
node_g
(debug node build) and leverage the advanced checks in V8 that catch programming errors and abort: https://github.com/nodejs/node/blob/73ae2d1895c2c0d1d4eeaddd284c58d776d2be87/deps/v8/src/base/logging.hnode --verify-heap
flag to catch the problem. This could be tricky, but we can learn from how the v8 engineers try to do this: https://bugs.chromium.org/p/v8/issues/detail?id=2120