Open petterreinholdtsen opened 8 years ago
You may be right about coz missing thread creation, but the one other possibility is that openalpr is fork
ing a new process. Could you verify that alpr
is running in entirely in a single process?
One other possibility is that something in a module-level constructor is accessing thread states before coz is fully initialized. I will look into this.
[Charlie Curtsinger]
You may be right about coz missing thread creation, but the one other possibility is that openalpr is
fork
ing a new process. Could you verify thatalpr
is running in entirely in a single process?
Not quite sure how to verify it, but running 'ltrace alpr ... 2>&1 | grep fork' did not find any fork()s.
One other possibility is that something in a module-level constructor is accessing thread states before coz is fully initialized. I will look into this.
Great. You can find the openalpr code using <URL: https://github.com/openalpr/openalpr.git >.
Happy hacking Petter Reinholdtsen
I investigated some more, and suspect the issue at hand is that threads are created before main() is called, in shared library initializers. Here is the gdb backtrace for the first call to pthread_create():
(gdb) bt
#0 __pthread_create_2_1 (newthread=0x7fffe6666538, attr=0x0, start_routine=0x7fffe645df50,
arg=0x0) at pthread_create.c:461
#1 0x00007fffe645e16f in ?? () from /lib/x86_64-linux-gnu/libusb-1.0.so.0
#2 0x00007fffe645c3c5 in ?? () from /lib/x86_64-linux-gnu/libusb-1.0.so.0
#3 0x00007fffe6454f5f in libusb_init () from /lib/x86_64-linux-gnu/libusb-1.0.so.0
#4 0x00007fffecb3fa4c in ?? () from /usr/lib/x86_64-linux-gnu/libdc1394.so.22
#5 0x00007fffecb2e015 in dc1394_new () from /usr/lib/x86_64-linux-gnu/libdc1394.so.22
#6 0x00007ffff57b51d9 in CvDC1394::CvDC1394() ()
from /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4
#7 0x00007ffff579fc40 in ?? () from /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4
#8 0x00007ffff7dea9ba in call_init (l=<optimized out>, argc=argc@entry=2,
argv=argv@entry=0x7fffffffdf48, env=env@entry=0x7fffffffdf60) at dl-init.c:78
#9 0x00007ffff7deaaa3 in call_init (env=0x7fffffffdf60, argv=0x7fffffffdf48, argc=2,
l=<optimized out>) at dl-init.c:36
#10 _dl_init (main_map=0x7ffff7ffe1a8, argc=2, argv=0x7fffffffdf48, env=0x7fffffffdf60)
at dl-init.c:126
#11 0x00007ffff7ddd1ca in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#12 0x0000000000000002 in ?? ()
#13 0x00007fffffffe289 in ?? ()
#14 0x00007fffffffe297 in ?? ()
#15 0x0000000000000000 in ?? ()
(gdb)
I guess coz need to hook into the program startup code a bit earlier than wrapped_main() is doing it.
I was able to avoid the crash using the following patch:
diff --git a/libcoz/profiler.h b/libcoz/profiler.h
index 515d998..032e2ea 100644
--- a/libcoz/profiler.h
+++ b/libcoz/profiler.h
@@ -108,6 +108,8 @@ public:
thread_start_arg* new_arg;
thread_state* state = get_thread_state();
+ if (NULL == state)
+ state = add_thread();
REQUIRE(state) << "Thread state not found";
// Allocate a struct to pass as an argument to the new thread
The idea is to not depend on wrapped_main() being called to set up the state pointer. I suspect it is not a safe solution, but just wanted to let you know that it got rid of the crash. Unfortunately I did not get much information in the profile.coz file:
startup time=1468705859216684133
runtime time=223636086700
I suspect this is because I patched the wrong function in openalpr.
Hi. Is there more debugging information I can provide to help you with this?
I'm not sure there's a good way to fix this issue at the moment. My attempts to initialize the profiler in the replaced __libc_start_main
lead to segfaults (probably because static initializers have not yet run), but deferring initialization until after static initializers run means we'll get the same error you're seeing. We could silently ignore missing thread state, but that means the profiler wouldn't be collecting samples for the thread created before main
runs.
I'll have to put this issue on hold for a bit.
Implementing #60 should fix this issue, although this feature is in very early stages.
To test coz-profiler, I decided to try to profile the openalpr command line tool, available from http://www.openalpr.com/ . It is a image processing system written in C++, used for extracting car license plates from images and video.
I modified the code like this to try benchmark how quickly it could process images.
After building, I ran it with a random MP4 vidoe file I had lying around after generating a simple openalpr.conf file:
Instead of getting a useful profiling run done, I got the error message "[./profiler.h:111] Thread state not found" instead. Am I using coz the wrong way?
Looking at the code, I suspect the problem is that openalpr is bootstraping its threads the wrong way, but I am unable to pinpoint exactly what is wrong. Also, I suspect coz should handle any code, and thus decided to report the problem as an issue.