mchbani / google-breakpad

Automatically exported from code.google.com/p/google-breakpad
0 stars 0 forks source link

linked_ptr segfaults oin some cases #269

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Download the attach file
2. Tun minidump_stackwalk using the given dump with the given symbols

What is the expected output? What do you see instead?
It should print a correct analyzed stack but it segfaults.

What version of the product are you using? On what operating system?
SVN revision 281.

Please provide any additional information below.
I attached a patch that correct the problem but because it is a little
complicated, I'll explain what I did.

When the bug occurred, I looked for the problem and found it was coming
from the implementation of the smart pointers. They were using a circular
list to know if we could delete the pointer or not, the whole list
containing an instance for every reference to the value.

I got into several bugs using a same report :
- an infinite loop
- a segfault

When correcting the infinite loop was complicated, getting rid of the
segfault was easiest so I looked for the problem. The problem happened when
we browsed the circular list. At some point, it caused a segfault.

I tried to understand why such a circular list was used for. Since they
were no outside access to it, I deduced it was only to know when there is
no more reference on our value to delete it when necessary. I thought that
a simple pointer on an integer replacing the list in order to count how
many references might be a better implementation to achieve the same goals.
It would be more efficient, and gaining even tens of milliseconds in this
executable is important, and less likely to crash.

I started to change the linked_ptr implementation to match that idea. Even
if their name is now not really telling how they are working, it seems that
they are working a lot better. The test suite run smoothly and I don't
encounter my segfaults anymore.

Moreover, my infinite loop is also gone which is a good point.

So here is the patch getting from a circular list implementation to a
simple int* version. It might improve overall performance of everything so
if there is no risk to break anything using that implementation, I suggest
to use that.

Comments are not fixed in the file, I just give this patch to discuss this
issue. If it is to be applied, I would be pleased to produce a version that
fixes them too.

Original issue reported on code.google.com by raynaudq...@gmail.com on 7 Jul 2008 at 5:30

Attachments:

GoogleCodeExporter commented 9 years ago
New patch:
- Suppress some leaks (now pass a valgrind test)
- Prevent some useless allocations

Original comment by raynaudq...@gmail.com on 7 Jul 2008 at 7:03

Attachments:

GoogleCodeExporter commented 9 years ago
What is boring me in this story is that replacing the pointer implementation 
actually
gets rid of a potential segfault and a potential infinite loop but it makes me 
crazy
to be unable to see why the actual version is causing those. I'll maybe take a 
close
look about that in a near future, if I find time for it at least.

Original comment by raynaudq...@gmail.com on 9 Aug 2008 at 10:06

GoogleCodeExporter commented 9 years ago
I bet running this through valgrind would prove illuminating. I've seen a 
similar
crash, but only when I compile at -O2 with some local changes of mine. Kind of 
scary!

Original comment by ted.mielczarek on 25 Nov 2009 at 7:16

GoogleCodeExporter commented 9 years ago
Well, I thought that this bug was somehow forgotten. Well, my patch has been in
production for a year and it is still working which might be a good indicator. I
really don't understand why the actual version is buggy but at least I know 
that my
version is much better for perfs so it might be a good idea to take it if only 
for
that reason...

Original comment by raynaudq...@gmail.com on 25 Nov 2009 at 9:09

GoogleCodeExporter commented 9 years ago
I haven't received any feedback yet on the content of this patch which is now 
over 2 years old. Is there something that I've missed, like some what to 
request review?

If the patch is now bitrotted, I would be willing to do the work to update it, 
but not before receiving at least some feedback indicating that the approach is 
correct and the patch is likely to be accepted.

Now it feels as if my patch has fallen in a void, which is quite disappointing 
given that it seemed to me it could provide a significant performance win in 
addition to the improved stability.

Original comment by raynaudq...@gmail.com on 11 Jul 2010 at 5:00

GoogleCodeExporter commented 9 years ago
The patch is still very applicable.  All of our breakpad instances are 
practically unusable without this patch, so applying it to the source is a 
required first step.  I'd love to see it get merged so we didn't have to apply 
it every time.

Original comment by brandon%...@gtempaccount.com on 15 Jul 2010 at 6:51

GoogleCodeExporter commented 9 years ago
I'm happy to see I'm not the only one that has experienced this issue. I was 
starting to believe noone had the problem :).

Did you eperienced both the segfault and the infinite loop without the patch or 
only one of those effects?

Original comment by raynaudq...@gmail.com on 15 Jul 2010 at 10:37

GoogleCodeExporter commented 9 years ago
I experience the segfault. I'll be trying to patch right now

Original comment by delfin.r...@gmail.com on 23 Aug 2010 at 7:38

GoogleCodeExporter commented 9 years ago
Works!

Original comment by delfin.r...@gmail.com on 23 Aug 2010 at 8:00