google / syzygy

Syzygy Transformation Toolchain
Apache License 2.0

ReactOS binaries can't be instrumented #3

Open sebmarchand opened 9 years ago

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

What steps will reproduce the problem?

1. Compile any ReactOS module.
2. Try to ASAN instrument it by following the instructions in the wiki.

What is the expected output? What do you see instead?

The instrumenter outputs errors about COFF, the SEH table, etc.

Trying with /SAFESEH fixes the SEH table issue, but the COFF errors remain, 
leading to no output file(s). Please note that we don't use /SAFESEH, we just 
tried it to test further.

What version of the product are you using? On what operating system?

We used the binaries provided in your repo.

Please provide any additional information below.

We're trying the ASAN instrumentation on ReactOS binaries as we aim for binary 
compatibility with Windows.

Original issue reported on code.google.com by amine.kh...@reactos.org on 11 Sep 2014 at 11:02

Copied from original issue: sebmarchand/syzygy#1

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

Sorry, I've just seen this. Have you made any progress on this?

Original comment by sebmarchand@chromium.org on 2 Dec 2014 at 4:15

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

The COFF support is very early and experimental, so instrumenting compilands is 
not really expected to work right now. Is that what you're doing?

Original comment by chri...@chromium.org on 2 Dec 2014 at 4:25

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

I'm sorry, by COFF support do you mean PE-COFF? When we found the project, 
and especially the description "Syzygy is a suite of tools to perform post-link 
instrumentation and optimization of 32-bit Windows binaries.", we had the 
impression that it works on binaries compiled with the MSVC compiler toolchain, 
which we (ReactOS) use along with GCC. What do you folks have in mind with 
"32-bit Windows binaries"?

Original comment by amine.kh...@reactos.org on 2 Dec 2014 at 4:29

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

It should work for final PE binaries compiled by MSVC (i.e., .dll and .exe 
files). It *sometimes* works for COFF files (i.e., .obj files), but that's 
experimental code.

Original comment by chri...@chromium.org on 2 Dec 2014 at 4:32

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

Ah, COFF files, I see. We used PE binaries compiled by MSVC indeed, not COFF 
files, and we had the issues mentioned above. We can try with any test builds 
you may provide in order to give you better details about the issues.

Original comment by amine.kh...@reactos.org on 2 Dec 2014 at 4:34

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

Here's a simple test using calc.exe (calculator)

e:\syzygy_bin\exe>instrument.exe --mode=asan --input-image=calc.exe --output-image=calc_output.exe
[1202/175124:INFO:application_impl.h(45)] Syzygy Instrumenter Version 0.7.16.0 (2153).
[1202/175124:INFO:application_impl.h(47)] Copyright (c) Google Inc. All rights reserved.
[1202/175124:INFO:instrumenter_with_agent.cc(100)] Default agent DLL for asan mode is "syzyasan_rtl.dll".
[1202/175124:INFO:pe_relinker_util.cc(335)] Input PDB not specified, searching for it.
[1202/175124:INFO:pe_relinker_util.cc(361)] Using default output PDB path: e:\syzygy_bin\exe\calc_output.exe.pdb
[1202/175124:INFO:pe_relinker.cc(138)] Input module : e:\syzygy_bin\exe\calc.exe
[1202/175124:INFO:pe_relinker.cc(139)] Input PDB    : e:\testing_v3_msvc\reactos\msvc_pdb\calc.pdb
[1202/175124:INFO:pe_relinker.cc(140)] Output module: e:\syzygy_bin\exe\calc_output.exe
[1202/175124:INFO:pe_relinker.cc(141)] Output PDB   : e:\syzygy_bin\exe\calc_output.exe.pdb
[1202/175124:INFO:pe_relinker.cc(57)] Decomposing module: e:\syzygy_bin\exe\calc.exe
[1202/175124:ERROR:decomposer.cc(2091)] Unable to add block "Bracketed COFF group: .tls" at Relative(0x0000D000) with size 26.
[1202/175124:ERROR:decomposer.cc(1610)] Failed to create bracketed COFF group ".tls".
[1202/175124:ERROR:pe_relinker.cc(66)] Unable to decompose module: e:\syzygy_bin\exe\calc.exe
[1202/175124:ERROR:instrumenter_with_agent.cc(133)] Failed to initialize relinker.

The attached calc.7z contains both the executable and its PDB file.

Original comment by amine.kh...@reactos.org on 2 Dec 2014 at 4:54

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

There is indeed a bug in our decomposer, which will be fixed soon. However, I 
can't get any further with those binaries because they're not built with 
profiling info. The next error you'll run into is:

PDB file does not contain a FIXUP stream. Module must be linked with '/PROFILE' or '/DEBUGINFO:FIXUP' flag.

Can you rebuild with those flags and I'll see how far I can get?

Original comment by chri...@chromium.org on 3 Dec 2014 at 4:35

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

Sure, I'll attach a new version in a moment.

Original comment by amine.kh...@reactos.org on 3 Dec 2014 at 4:37

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

Original comment by amine.kh...@reactos.org on 3 Dec 2014 at 4:40

Attachments:

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

Okay, that bug was the only problem and I can now instrument and run calc.exe 
with no problems. I'll put together a fix shortly.

Original comment by chri...@chromium.org on 3 Dec 2014 at 5:07

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

Please provide us with binaries when you do so. We'd like to instrument many 
more modules after this, and report back if we hit any other issues. Thanks!

Original comment by amine.kh...@reactos.org on 3 Dec 2014 at 5:11

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

So we're seeing something a little weird. Normally, we see the TLS data 
directory (an IMAGE_TLS_DIRECTORY struct, referred to by the PE headers) live in 
the .rdata section (as you'll see it declared in tlssup.c of the CRT). The 
initializers end up folding into the .tls section, and are pointed to by the 
TLS data directory.

In the image you provided us, the TLS directory lives in the .tls section. While 
this isn't strictly an error, it goes against anything we've ever seen produced 
in any MSVC-built binary. Are you using a custom ReactOS CRT? What are the 
contents of the tlssup.c source file, or whatever source file(s) declare 
_tls_used, _tls_start and _tls_end?

Original comment by chri...@chromium.org on 3 Dec 2014 at 6:32
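
For illustration, a minimal Python sketch of the check described in this comment: given the RVA of the TLS data directory and a section table, find which section holds it. The section layout and the RVAs below are made-up values, not read from the attached calc.exe, and `Section` and `section_containing` are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Section(NamedTuple):
    name: str
    virtual_address: int  # RVA where the section starts
    virtual_size: int     # size of the section in memory


def section_containing(rva: int, sections: List[Section]) -> Optional[str]:
    """Return the name of the section whose RVA range contains rva, if any."""
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.virtual_size:
            return s.name
    return None


# Hypothetical section table, for illustration only.
sections = [
    Section(".text", 0x1000, 0x8000),
    Section(".rdata", 0x9000, 0x3000),
    Section(".data", 0xC000, 0x1000),
    Section(".tls", 0xD000, 0x1000),
]

# Assumed RVAs: the first mimics the usual MSVC CRT layout, the second
# mimics the layout described above (directory inside .tls).
typical_tls_dir_rva = 0x9A40
unusual_tls_dir_rva = 0xD000

print(section_containing(typical_tls_dir_rva, sections))  # .rdata
print(section_containing(unusual_tls_dir_rva, sections))  # .tls
```

In a real image the directory RVA would come from the TLS entry of the PE data directories rather than from hard-coded constants.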

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

https://git.reactos.org/?p=reactos.git;a=blob;f=reactos/lib/sdk/crt/startup/tlssup.c;hb=093746813fd0185c68c8d451349cd5e0133da836

We have our own CRT indeed, as we can't afford to rely on anything (we're an 
OS). We also have our native/MSVC/GCC/Clang compatible headers set.

Original comment by amine.kh...@reactos.org on 3 Dec 2014 at 7:51

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

Okay, that explains the issue. The MSVS CRT looks more like the following:

https://code.google.com/p/reactos-mirror/source/browse/branches/ros-amd64-bringup/reactos/lib/3rdparty/mingw/tlssup.c?r=38477

Note that _tls_used goes in .rdata. Also note that the 'bracketing' symbols are 
.tls and .tls$ZZZ, not .tls$AAA and .tls$ZZZ. Our toolchain is *very* intimate 
with the details of the MSVS toolchain, including section names, variable 
names, etc. This was causing us to try to decompose two things at coincident 
locations and hence explode when they collided. We can fix this from our end 
(there's actually a bit of a logical error in our code), but if you make your 
tlssup.c look more like the actual CRT one, this problem would also go away.

Original comment by chri...@chromium.org on 3 Dec 2014 at 8:09
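
The collision described here can be pictured with a toy address space that refuses overlapping blocks. This is only a sketch of the general idea, not Syzygy's decomposer: `AddressSpace` and `add_block` are hypothetical names, and the 24-byte size for the first block is an assumption (the size of a 32-bit IMAGE_TLS_DIRECTORY). The address 0xD000 and size 26 come from the error log earlier in the thread.

```python
class AddressSpace:
    """Toy address space that rejects any block overlapping an existing one."""

    def __init__(self):
        self._blocks = []  # list of (start, size, name) tuples

    def add_block(self, start, size, name):
        """Insert a block; return False if it overlaps a block already present."""
        end = start + size
        for other_start, other_size, _other_name in self._blocks:
            other_end = other_start + other_size
            # Two half-open ranges overlap when each starts before the other ends.
            if start < other_end and other_start < end:
                return False
        self._blocks.append((start, size, name))
        return True


space = AddressSpace()

# First, the TLS directory is placed at the start of .tls (assumed 24 bytes).
print(space.add_block(0xD000, 24, "_tls_used"))  # True

# Then the bracketed .tls group is requested at the same address and fails.
print(space.add_block(0xD000, 26, "Bracketed COFF group: .tls"))  # False
```

With the directory in .rdata instead, the two blocks would sit at different addresses and both insertions would succeed.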

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

Here is another build of calc, built with the change in http://pastebin.com/snZnmCbT.

Original comment by amine.kh...@reactos.org on 3 Dec 2014 at 10:43

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

With that pastebin CRT fix you should be able to instrument calc yourself, 
without any changes from us. If you upstream that fix, then this problem 
should go away entirely. Have you tried?

Original comment by chri...@chromium.org on 5 Dec 2014 at 2:47

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

I confirm, and I will commit this fix.

I wrote a quick use-after-free test and ran it under windbg, but I couldn't see 
a report similar to 
https://code.google.com/p/syzygy/wiki/SyzyASanHowTo#Interpret_the_error_report.

Instead I got: http://pastebin.com/KU3iBF5F from which you cannot even get the 
line that triggered the issue.

Is it possible to get the report style mentioned above in windbg, so that we 
could run the full ReactOS test suite and actually get results we can inspect?

Please note that using agent_logger.exe is not an option for us because we 
target 2k3 SP1 (not Vista+).

Original comment by amine.kh...@reactos.org on 6 Dec 2014 at 12:31

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

If you can't use agent_logger then you'll have to run the code under windbg as 
discussed in the same section:

https://code.google.com/p/syzygy/wiki/SyzyASanHowTo#Interpret_the_error_report

We don't do symbolization in the RTL because it needs to work with Chrome, 
which runs in a sand-boxed process. Hence the RPC mechanism and the 
agent_logger.

I'm not familiar enough with 2k3; why won't the agent logger run there?

Original comment by chri...@chromium.org on 8 Dec 2014 at 4:08

sebmarchand commented 9 years ago

From @GoogleCodeExporter on March 19, 2015 16:27

I confirm that the ASAN commands give some insight into the issues. As for the 
logger, its required OS and subsystem versions are both 6, so it won't work on 
2k3. Also, it may not run on ReactOS for compatibility reasons (we're not there 
yet) and that's why I preferred having a working WinDBG solution.

I'll try to instrument some modules + tests and run them in ReactOS while 
attaching WinDBG, and report back.

Original comment by amine.kh...@reactos.org on 10 Dec 2014 at 1:05
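
The version requirement mentioned in this comment is recorded in the PE optional header, so it can be read without running the binary. Below is a minimal Python sketch using only the standard library. `pe_required_versions`, `fake_image` and `runs_on` are hypothetical helper names; `fake_image` builds a synthetic header for demonstration rather than a loadable executable; and `runs_on` is a deliberate simplification that compares major versions only, whereas the real loader's checks are more nuanced.

```python
import struct


def pe_required_versions(image: bytes):
    """Return (major OS version, major subsystem version) from a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ image")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = pe_offset + 4 + 20  # skip the signature and the 20-byte COFF header
    (major_os,) = struct.unpack_from("<H", image, opt + 40)
    (major_subsys,) = struct.unpack_from("<H", image, opt + 48)
    return major_os, major_subsys


def fake_image(major_os: int, major_subsys: int) -> bytes:
    """Build a minimal synthetic PE header (illustration only, not loadable)."""
    pe_offset = 0x40
    buf = bytearray(pe_offset + 4 + 20 + 96)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, pe_offset)
    buf[pe_offset:pe_offset + 4] = b"PE\0\0"
    opt = pe_offset + 4 + 20
    struct.pack_into("<H", buf, opt, 0x10B)  # PE32 optional-header magic
    struct.pack_into("<H", buf, opt + 40, major_os)
    struct.pack_into("<H", buf, opt + 48, major_subsys)
    return bytes(buf)


def runs_on(host_major: int, image: bytes) -> bool:
    """Simplified check: the host major version must meet both requirements."""
    major_os, major_subsys = pe_required_versions(image)
    return max(major_os, major_subsys) <= host_major


logger_like = fake_image(6, 6)  # stands in for a binary requiring version 6
print(pe_required_versions(logger_like))  # (6, 6)
print(runs_on(5, logger_like))            # False: 2k3 is NT 5.x
print(runs_on(6, logger_like))            # True
```

To inspect a real file, pass `open(path, "rb").read()` to `pe_required_versions` instead of the synthetic image.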