qchbai / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

preamble_patcher fails on x64 platform (Windows Vista) #359

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Compile tcmalloc_minimal for release x64 target in VS 2008.
2. Link application against libtcmalloc_minimal.
3. Run application and receive the following error message:

Check failed: sidestep::SIDESTEP_SUCCESS == 
PreamblePatcher::Patch(windows_fn_[i], perftools_fn_[i], &origstub_fn_[i])

What is the expected output? What do you see instead?
I expect the application to patch the libc calls, but I get the above error 
message. I noticed in preamble_patcher.h:144

template <class T>
  static SideStepError Patch(T target_function,
                             T replacement_function,
                             T* original_function_stub) {
    // NOTE: casting from a function to a pointer is contra the C++
    //       spec.  It's not safe on IA64, but is on i386.  We use
    //       a C-style cast here to emphasize this is not legal C++.

My suspicion is that though the 1.8 release supports Windows x64 architectures, 
the subsequent casts to void pointers will not work with 64-bit addresses.

What version of the product are you using? On what operating system?
1.8, compiled for x64 architecture on Windows Vista using Visual Studio 2008.

Please provide any additional information below.
After entering PatchWindowsFunctions():1038, I checked the values of libc_info 
passed into PatchOneModuleLocked(libc_info);

-       libc_info   {windows_fn_=0x000000000012f270 is_valid_=false 
module_base_address_=0x000007fefdb10000 ...}    `anonymous-namespace'::LibcInfo
-       windows_fn_ 0x000000000012f270  void [14](void)*
        [0] 0x000007fefdb111ec  void (void)*
        [1] 0x000007fefdb111b8  void (void)*
        [2] 0x000007fefdb143c0  void (void)*
        [3] 0x000007fefdb12824  void (void)*
        [4] 0x0000000000000000  void (void)*
        [5] 0x0000000000000000  void (void)*
        [6] 0x0000000000000000  void (void)*
        [7] 0x0000000000000000  void (void)*
        [8] 0x0000000000000000  void (void)*
        [9] 0x0000000000000000  void (void)*
        [10]    0x0000000000000000  void (void)*
        [11]    0x0000000000000000  void (void)*
        [12]    0x000007fefdb11700  void (void)*
        [13]    0x000007fefdb45270  void (void)*
        is_valid_   false   bool
        module_base_address_    0x000007fefdb10000  const void *
        module_base_size_   638976  unsigned __int64

From there, the call stack moves into patch_functions.cc:624 and fails on i = 1 
in the for loop. Here is the dump of me_info:

-       me_info {windows_fn_=0x000000000012f270 is_valid_=false 
module_base_address_=0x000007fefdb10000 ...}    const 
`anonymous-namespace'::LibcInfo &
-       windows_fn_ 0x000000000012f270  void [14](void)*
        [0] 0x000007fefdb111ec  void (void)*
        [1] 0x000007fefdb111b8  void (void)*
        [2] 0x000007fefdb143c0  void (void)*
        [3] 0x000007fefdb12824  void (void)*
        [4] 0x0000000000000000  void (void)*
        [5] 0x0000000000000000  void (void)*
        [6] 0x0000000000000000  void (void)*
        [7] 0x0000000000000000  void (void)*
        [8] 0x0000000000000000  void (void)*
        [9] 0x0000000000000000  void (void)*
        [10]    0x0000000000000000  void (void)*
        [11]    0x0000000000000000  void (void)*
        [12]    0x000007fefdb11700  void (void)*
        [13]    0x000007fefdb45270  void (void)*
        is_valid_   false   bool
        module_base_address_    0x000007fefdb10000  const void *
        module_base_size_   638976  unsigned __int64

Finally, at pagemap.h:252, the value of root_ is NULL, so the following line 
throws a null pointer exception:

root_->ptrs[i1] == NULL || root_->ptrs[i1]->ptrs[i2] == NULL) {

I suspect that a double free has occurred somewhere prior to this, but I cannot 
be sure. Either way, something is going wrong. I appreciate any time you can 
devote to looking into this if it is indeed a bug. Thanks! I apologize for the 
inconvenience, but I cannot post screenshots of the debugger :(

Original issue reported on code.google.com by james.de...@remcom.com on 1 Aug 2011 at 7:00

GoogleCodeExporter commented 9 years ago
For further clarification, here are the relevant compiler/linker flags:

QMAKE_LFLAGS += /INCLUDE:"__tcmalloc"
PERFTOOLS_LIB_DIR = $${PERFTOOLS_DIR}/release
LIBS += -L$${PERFTOOLS_LIB_DIR} -llibtcmalloc_minimal

These are identical between 32-bit and 64-bit builds on Windows, the only 
difference being the actual .lib file for each target. The 32-bit builds do not 
exhibit this behavior with 1.7 or 1.8.

Original comment by james.de...@remcom.com on 1 Aug 2011 at 7:05

GoogleCodeExporter commented 9 years ago
Actually, I don't know if 1.8 works with 64-bit windows -- I just know we 
applied some patches that are supposed to help (I don't have a 64-bit windows 
machine to test on).

What I do know -- the same as you've discovered -- is that the patching won't 
work in 64-bit mode; we don't have a x86_64 disassembler.  You'll have to do 
the 'replacing libc routines' trick, as mentioned in the README_windows.txt 
file.u

I'm closing as WontFix, but really it's more like 
WontFixUnlessSomeoneWantsToSubmitAPatch.  I expect most people who want this 
will choose to go with the replacing-libc solution.

Original comment by csilv...@gmail.com on 1 Aug 2011 at 8:22

GoogleCodeExporter commented 9 years ago
Reopening -- it looks like this may be coming soon!

Original comment by csilv...@gmail.com on 21 Sep 2011 at 3:27

GoogleCodeExporter commented 9 years ago
I've completed a port of TCMalloc to Windows x64.

The major work done:

• Updated the disassembler to support REX prefixes which identify x64
instructions.
• Updated the patching engine to support patching x64 functions.  I
also added support for patching some JMP and CALL instructions in the
function preamble.

The port was developed against TCMalloc 1.6.  I have patched the
latest available version (1.8.3) with my changes and it successfully
builds and the test drivers run, but I haven't tested it as thoroughly
as it has been tested with 1.6.  Rather than updating the existing
solution I decided to create a new solution for 64-bit builds that is
compatible with Visual Studio 2005.  The solution contains x64
configurations for all the projects that are in the 32-bit build of
TCMalloc.  To test the x64 patching I added a new test driver called
patcher_test.  Some of this code has been pulled from whatever project
the disassembler/patcher originally came from (can't remember now).
This driver requires ml64.exe to build and is only in the 64-bit
solution.

The patch is fairly well commented and I've tried to follow the coding
style used throughout the TCMalloc source.  I've only built this using
Visual Studio 2005 and I'm not sure if it will compile/run with newer
versions of Visual Studio.  As far as testing goes, it has been run on
Windows 2003 x64 and Windows 2008 x64.  I would suspect that the code
should work on Windows 7. 

Original comment by scott.fr...@ca.ibm.com on 21 Sep 2011 at 4:40

Attachments:

GoogleCodeExporter commented 9 years ago
There's actually a small bug in the patch provided above.  I was using a static 
to initialize the new static members I added in PreamblePatcher.  Because 
static objects are initialized randomly between modules, this could result in 
the PreamblePatcher statics not being properly initialized when 
TCMallocGuard::TCMallocGuard() is executed.

The new patch fixes this.

Original comment by scott.fr...@ca.ibm.com on 21 Sep 2011 at 7:20

Attachments:

GoogleCodeExporter commented 9 years ago
I took a brief look at this.  I'm not really qualified to judge the code, but 
it looks impressive to me!

A few notes:

1) I see a few of the files are copyright Chromium.  Did you take them from 
chromium, or did you just put that copyright/license line on yourself?  If the 
latter, we'll want to replace it with the license text we use for the rest of 
perftools.  Otherwise, we'll have to make sure the relevant licenses are 
compatible, and/or rewrite the files.  I'm hoping to avoid all that work. :-}

2) We'd like to keep to a bare minimum the number of #ifdefs.  A lot of places 
you ifdef the code for X86.  Is it possible to define the stuff 
unconditionally, even if it's not used in 32-bit mode?  I'm thinking of things 
like ImmediateOperandSize (maybe rename to ImmediateOperandSizeForX86), 
operand_is_64_bits_, AllocPageNear, etc.  I'm also wondering if we can change 
all code to call, for instance, FreePreambleBlock (which may have an #ifdef 
inside it, if needed), rather than doing an ifdef to decide whether to call 
FreePreambleBlock or just delete[].  Or can we do the _alloca() unconditionally 
-- would it hurt anything to do so?  Basically, get rid of as many #ifdefs as 
possible, and to the extent possible, keep all the ones we have clustered in 
one place.

3) Is it possible to break this patch up a bit?  I'm thinking of at least two 
patches, one for the preamble-patching code, and one for everything else.  Or 
is it safe for me to just apply the patch to the preamble-patching files and 
ignore the rest (including patch_functions.cc)?  I'm thinking at least minor 
changes will be needed to patch_functions.cc.

I say this because the preamble patcher has a different set of authors than the 
rest of perftools, so it will be easier to break up the reviews that way.

Looking forward to seeing the next version of this patch!

Original comment by csilv...@gmail.com on 21 Sep 2011 at 9:11

GoogleCodeExporter commented 9 years ago
Thanks for the feedback!  Sorry for the delay, but I've been busy working on 
other projects and haven't been able to give this my full attention.

1) I did take those files from Chromium, unfortunately.  I don't know if it 
makes a difference, but those files are used only for the patcher_test project 
and are not needed for the TCMalloc DLL.  The license indicates that it's a 
BSD-style license which hopefully should be compatible with perftools.

2)  I took your advice and tried to remove a bunch of #ifdefs.  I was able to 
remove some of the #ifdef's in mini_disassembler.  I left the #ifdef's in 
ia32_opcode_map.  I could say, create a new file called amd64_opcode_map, but I 
thought it was reasonable to just leave the two #ifdef's in there to deal with 
the REX prefixes and MOV changes rather than duplicate the entire file.  There 
were some #ifdefs that I left in only because I believe it makes the code 
easier to understand.  For example, I've left the #ifdef's in 
PreamblePatcher::RawPatchWithStub() so you can clearly see what code is used 
for x64 patching.  I decided to use AllocPreambleBlock/FreePreambleBlock on 
both x86 and x64, which simplifies things a bit.  Overall, it's a fair bit 
cleaner.

3)  I've broken up the patch as you suggested.  I put any of the preamble code 
that I changed into one patch, and everything else into the other.

Original comment by scott.fr...@ca.ibm.com on 28 Sep 2011 at 12:56

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks! -- I'll look at this when I can.  (The next two days are a holiday for 
me, and then there's catching up to do after that...)

I also think the licenses match -- the text we use is identical to the text at
   http://code.google.com/chromium/terms.html
So I think it's fine to just replace the 'see LICENSE file' text at the top of 
the chromium files with the actual content of the license, which you can take 
from the above page, or from any existing perftools file.

I'll ask the original sidestep authors to take a look at this.  In the 
meantime, a few style-nits I noticed:

1) It looks like this file does 'int *a', not 'int * a' as your patches do.
2) When you add parameters like 'stop_before_trampoline' to a function, can you 
comment what they mean?
3) You do 'cond ? true : false' in a few places.  I think many of them you 
don't need the '? true : false' -- cond itself is already a boolean.  Can just 
remove it.
4) Really minor nit: you often have two-line statements ('return foo 
&&\nbar;').  They're indented a bit inconsistently.  I usually use parens 
('return (foo &&\nbar);') to make sure my editor always lines these up nicely.  
Not a big deal.
5) Seeing '#ifdef _M_X64\n}' makes me wig out a little: I understand what's 
going on in here, but it seems awkward.  Can you rewrite this bit so the ifdef 
is cleaner, perhaps by splitting this up into more functions?

Finally, I'm trying something new with reviewing this patch.  I've uploaded the 
patch to reitveld:
   http://codereview.appspot.com/5150041

I'll ask folks to comment on it there.  I'm not very familiar with the tool, 
but I think it should be possible for you to upload updated versions of the 
patch as well.  When that's all done, I'll give you commit access to the 
repository so you can just commit it.

And one contentful comment:
} +#ifdef _M_X64
} +// Let's give it a little more for 64-bit mode.
} +#define MAX_PREAMBLE_STUB_SIZE    (64)
} +#else
}  #define MAX_PREAMBLE_STUB_SIZE    (32)
} +#endif

Why do we need more in 64-bit mode?  Looking at the comment about why this is 
the size it is, it seems like it should be the same in 32-bit and 64-bit.  Is 
that not so?  If not, can you comment a bit more about why more space is needed 
in 64-bit mode?

Original comment by csilv...@gmail.com on 28 Sep 2011 at 2:15

GoogleCodeExporter commented 9 years ago
Oh, just to be clear: I think what you need to do is go to 
http://codereview.appspot.com/5150041 and star it (the star is right next to 
the 'Issue 5150041').  That *should* cause you to get an email comment every 
time someone comments on the patch (I think).

Also, you may not have permission to upload new versions of the patch, since I 
uploaded the original.  If not, feel free just to upload your next new version 
as a 'new issue', and mark the new coderview url here.

Original comment by csilv...@gmail.com on 28 Sep 2011 at 2:17

GoogleCodeExporter commented 9 years ago
I've added the latest patch to http://codereview.appspot.com/5164044/

Original comment by scott.fr...@ca.ibm.com on 30 Sep 2011 at 7:39

GoogleCodeExporter commented 9 years ago
It looks like the code review has finished up at 
http://codereview.appspot.com/5164044/
.  Many thanks to Joi for his help!  Can you attach the latest version of the 
patch here (as a patch file)?  I'll look to apply it.

My feeling was that I'd like to apply this first, and then do the vcproj 
changes in a separate commit.  Will that work, or will this commit break the 
vcproj's?  I'm hoping not to have to have a separate x64 solution, since it's 
hard enough to maintain the one I have.  But I don't want that discussion to 
gate getting this checked in.

Original comment by csilv...@gmail.com on 7 Oct 2011 at 12:22

GoogleCodeExporter commented 9 years ago
You should be able to apply the preamble patch without the vcproj changes.  I 
didn't have to make any changes to the existing vcproj files, so I believe 
everything should still build correctly in x86 mode.

I guess it's up to you on how to configure the x64 solution.  The reason I 
added a separate solution is because VS2003 doesn't support x64 and I didn't 
want to update the existing projects to VS2005.

I've included the latest patch SideStep patch - you'll still have to apply the 
earlier patch for the non-SideStep files to actually build an x64 solution.

Original comment by scott.fr...@ca.ibm.com on 7 Oct 2011 at 2:59

Attachments:

GoogleCodeExporter commented 9 years ago
Drat!  I didn't realize vs2003 didn't support x64.  I will have to think about 
this.  It may make more sense to just update to VS2005.  Or to keep the two 
versions like your patch does.  Or just move to cmake. :-)

Anyway, I'll go check this in right now.

Original comment by csilv...@gmail.com on 7 Oct 2011 at 8:59

GoogleCodeExporter commented 9 years ago
Oh, one question before I do: have you signed the CLA, at
   http://code.google.com/legal/individual-cla-v1.0.html

I'll need that before I can accept the patch.

Thanks!
craig

Original comment by csilv...@gmail.com on 7 Oct 2011 at 9:04

GoogleCodeExporter commented 9 years ago
I have signed and submitted the corporate CLA.

Original comment by sc...@planetscott.ca on 7 Oct 2011 at 9:21

GoogleCodeExporter commented 9 years ago
Thanks!  (Is this the CLA for IBM, or for planetscott, or something else?)

For corporate CLAs, we require folks to mail in a PDF I think -- is that what 
you did?  If so, it will take a few days I'm guessing before the record shows 
up internally here.  Once that's done, I'll submit.  (btw, was this corporate 
CLA for IBM, planetscott, or something else?)

Original comment by csilv...@gmail.com on 7 Oct 2011 at 9:31

GoogleCodeExporter commented 9 years ago
Sorry, I replied with my wrong account from home!  The CLA was for IBM.

I did not mail in the PDF, but emailed a signed copy back - this was a couple 
of weeks ago.  Is that good enough?  Maybe check and see if we're registered in 
the system... let me know.

Original comment by scott.fr...@ca.ibm.com on 9 Oct 2011 at 8:45

GoogleCodeExporter commented 9 years ago
I'll look into it.  Thanks!

Original comment by csilv...@gmail.com on 10 Oct 2011 at 6:43

GoogleCodeExporter commented 9 years ago
OK, our guess here is that you don't have rights to sign a corporate CLA on 
behalf of IBM.  You'd need to talk to lawyers on the IBM side to see what the 
right approach is.

Alternately, if appropriate you could sign the individual CLA.  I'm not a 
lawyer and can't say under what situations the individual CLA is appropriate, 
and in what situations the corporate CLA is appropriate.

Original comment by csilv...@gmail.com on 10 Oct 2011 at 8:29

GoogleCodeExporter commented 9 years ago
Ok, I'll investigate on this side.  We've had to run the legal gamut just to 
get internal approval to submit the code back to you guys.  

On a somewhat related note, would you be interested in patches to support 
Solaris SPARC, AIX, HPUX Itanium and z/Linux? 

Original comment by scott.fr...@ca.ibm.com on 11 Oct 2011 at 12:00

GoogleCodeExporter commented 9 years ago
Wow, a plethora of porting!  I always love it when perftools can run in more 
places.

That said, we're moving towards a policy of not doing too much to support 
less-used platforms, unless the change only trivially impacts the codebase 
(either because it's small changes, or mostly new files that most platforms can 
ignore).  So for platforms where that's the case, we'd be glad to accept 
patches.

The other potential issue I see is that I don't have the ability to test any of 
these, so it's very possible they could regress in a future release.  I'm not 
sure what to do about that, if anything.  For some, it may be better to 
maintain them as a separate fork.

Anyway, if you're up for putting together some patches, I'm happy to take a 
look at them.  We can figure out what makes sense.

Original comment by csilv...@gmail.com on 11 Oct 2011 at 6:31

GoogleCodeExporter commented 9 years ago
Hi Craig,

Do you know who would have the rights to sign a corporate CLA on behalf of IBM? 
 Internally, we went through all the hoops to get the approval - I guess it's 
just a matter of finding the right person to sign the CLA.

Original comment by scott.fr...@ca.ibm.com on 12 Oct 2011 at 4:53

GoogleCodeExporter commented 9 years ago
No, I'm sorry, I don't know.  And I'm sorry you have to go through more hoops 
after going through the original ones! -- hopefully these extra ones won't be 
too bad.

Original comment by csilv...@gmail.com on 12 Oct 2011 at 6:28

GoogleCodeExporter commented 9 years ago
Any more word on this?   I'm ready to apply as soon as the legal stuff is taken 
care of!

Original comment by csilv...@gmail.com on 26 Oct 2011 at 12:40

GoogleCodeExporter commented 9 years ago
Pinging this one again...

Original comment by csilv...@gmail.com on 10 Nov 2011 at 11:07

GoogleCodeExporter commented 9 years ago
Anoher month, another ping...

Original comment by csilv...@gmail.com on 5 Dec 2011 at 7:10

GoogleCodeExporter commented 9 years ago
Issue 391 has been merged into this issue.

Original comment by csilv...@gmail.com on 10 Jan 2012 at 3:00

GoogleCodeExporter commented 9 years ago
Great, got the CLA!  This will be in the next release.

Original comment by csilv...@gmail.com on 25 Jan 2012 at 11:42

GoogleCodeExporter commented 9 years ago
I'm trying to compile preamble_patcher_test.cc, which was part of your patch, 
and it has the line:
   #include "auto_testing_hook.h"

But I don't see auto_testing_hook in the patch you gave.  Was that supposed to 
be in there too?  If so, can you attach it here?

Original comment by csilv...@gmail.com on 28 Jan 2012 at 12:26

GoogleCodeExporter commented 9 years ago
Also, how do I compile shortproc.asm in MSVC?  Is there an appropriate build 
rule or command for it?

Original comment by csilv...@gmail.com on 31 Jan 2012 at 12:03

GoogleCodeExporter commented 9 years ago
This patch is in perftools 1.10, just released!  I did not enable 
preamble_patcher_test.cc by default in MSVC, since it seems to be 64-bit only, 
and I had to guess at the right assembler command for shortproc.asm.  I took 
auto_testing_hook.h from another project I found, but it seems to compile with 
the version I have, so I'm guessing it's ok.  Anyway, I think this is good to 
build on!

Original comment by csilv...@gmail.com on 31 Jan 2012 at 7:17

GoogleCodeExporter commented 9 years ago

Original comment by csilv...@gmail.com on 31 Jan 2012 at 7:17