Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Pointer alignment assumptions incorrect for User*, causing bad behaviour in Use #4585

Closed Quuxplusone closed 14 years ago

Quuxplusone commented 15 years ago
Bugzilla Link PR4119
Status RESOLVED FIXED
Importance P normal
Reported by Stefanus Du Toit (sdt@rapidmind.com)
Reported on 2009-05-01 18:08:32 -0700
Last modified on 2010-03-06 13:58:20 -0800
Version trunk
Hardware PC Windows XP
CC anton@korobeynikov.info, foldr@codedgers.com, llvm-bugs@lists.llvm.org
Fixed by commit(s)
Attachments pointer_traits_workaround.diff (979 bytes, text/plain)
Blocks
Blocked by
See also
Created attachment 2917
Workaround that sets the default available bits to 2 instead of 3

This is a bug that's technically been around for a while, but was uncovered by
r67979. It causes various random crashes under Windows.

From an email to llvm-dev:

The issue is with PointerLikeTypeTraits<T*>::NumLowBitsAvailable. This is set
to 3, which basically assumes that unless the traits are specialized for a
particular pointer type, objects of that type are allocated with malloc() and
aligned to 8 bytes.

While PointerLikeTypeTraits is overloaded for Use*, it is not overloaded for
User*. lib/VMCore/Use.cpp uses a PointerIntPair<User*, 1, Tag>, and things go
bad. Users are typically allocated in an array after a bunch of Uses, which are
12 bytes in size, so they may actually only be aligned to 4 bytes.

The attached patch (which I don't intend to commit, it's just a workaround)
works around this simply by dropping this down to 2 bits, which gets us past
our problem on Windows.

To actually solve this properly I see two main options:

(1) we could specialize PointerLikeTypeTraits for User*, and leave the default
value of NumLowBitsAvailable at 3.
(2) we could drop the default NumLowBitsAvailable to 2 (or even use _alignof
and similar compiler-specific extensions to determine it), and allow classes
that assert that they are only ever allocated through malloc to relax it back
up to 3.

My preference would be for option (2), due to the difficulty of tracking this
bug down, and the risk of future similar bugs happening. However, I'd
appreciate some feedback from the LLVM developer community on this. Please let
me know what you think, and I'll be happy to prepare a patch.

I'm still wondering why this wasn't an issue on other platforms. One
possibility is that Use is being bumped up to 16 bytes in size, and thus Users
never get placed at less than an 8-byte boundary. However, in that case, the
whole point of the "use diet" optimization is lost! I'll investigate and try to
find out.

I'm also still not sure why the assertion in PointerIntPair::setPointer() did
not get triggered by the User::allocHungOffUses() implementation in Use.cpp.
Perhaps the assertion is wrong (it looks reasonable, though) or perhaps there
is something else going on I haven't seen yet. I'll look into this some more as
well.
Quuxplusone commented 15 years ago

Attached pointer_traits_workaround.diff (979 bytes, text/plain): Workaround that sets the default available bits to 2 instead of 3

Quuxplusone commented 15 years ago

Why doesn't this cause an assertion?

Quuxplusone commented 15 years ago
That's what I was wondering too (see above). I'm planning to look into it.

One hypothetical possibility is the following:
 - a ptrintpair gets constructed with an "opaque value"
 - the tag value then gets read
 - a decision gets made from the tag value

This might be happening in User::getUser() (implemented in Use.cpp:141) but as
far as I can tell the ptrintpairs aren't initialized in this manner.
Quuxplusone commented 15 years ago
I dropped the default to 2 bits:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090427/076954.html

And added a patch to clang to compensate for this:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20090427/016609.html

I'm still curious why no assertion happens, it could be a miscompilation by
your host compiler or something.
Quuxplusone commented 15 years ago

OK. I still plan on looking into this. I also want to know why this didn't cause problems on Linux, since theoretically, it should have.

Note that the comment above that enum is now out-of-date.

Quuxplusone commented 15 years ago

For an explanation of what happened, see this thread:

http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-May/022056.html