m4b / goblin

An impish, cross-platform binary parsing crate, written in Rust
MIT License
1.19k stars 160 forks source link

PE: Add tests for TLS parser and characteristics constants #426

Open kkent030315 opened 3 weeks ago

kkent030315 commented 3 weeks ago

[^1]: This is something odd; I could not repricate the same on my local dev env. Perhaps its 32 v. 64 host binary issue. Solved in here.

kkent030315 commented 3 weeks ago

Stuck at #428, need to solve that issue for parse_special_import_fowarder_tls test right here.

kkent030315 commented 3 weeks ago

Will soon be fixed in #429.

kkent030315 commented 4 days ago

@m4b Hi there, I'd like to get review for this PR as well, it's ready to go. :)

kkent030315 commented 3 days ago

Decreased size of test binaries (~9KB), here's the practical considerations for further reproduce:

# .cargo/config.toml

[target.x86_64-pc-windows-msvc]
# Required by `build-std-features=panic_immediate_abort`
rustflags = ["-C", "panic=abort"]
# LLD links much smarter, and without POGO/rich header
linker = "rust-lld.exe"

[unstable]
# Do not use precompiled, fully compile them to apply our optimizations
build-std = ["core", "std", "panic_abort"]
# Entiely strip format from panic handlers
build-std-features = ["panic_immediate_abort"]
# Cargo.toml

[profile.release]
panic = "abort"
lto = true
opt-level = "s"
codegen-units = 1
strip = true
// build.rs

fn main() {
    // `no_main` requires this
    println!("cargo:rustc-link-arg=/SUBSYSTEM:CONSOLE");
    // Do not include default manifest (no resource)
    println!("cargo:rustc-link-arg=/MANIFEST:NO");
    // Do not generate base relocations (DO NOT do this for production)
    println!("cargo:rustc-link-arg=/FIXED");
    // No debug directory (strip=true do but in case)
    println!("cargo:rustc-link-arg=/DEBUG:NONE");
    // No unwind info (DO NOT do this for production; without `.pdata`)
    println!("cargo:rustc-link-arg=/SAFESEH");
}
// main.rs

#![no_std]
#![no_main]

use core::panic::PanicInfo;

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

// Force marker symbol to be included
// Tell the linker that TLS is being linked to the binary
// Exclusive for x86-64, but `_tls_used` is intentional, rather than `__tls_used`
#[link_section = ".drectve"]
#[used]
static DIRECTIVE: [u8; 19] = *b"/INCLUDE:_tls_used ";

#[link_section = ".CRT$XLB"]
#[used]
pub static TLS_CALLBACK: unsafe extern "system" fn(*mut i8, u32, *mut i8) = tls_callback;

#[inline(never)]
pub extern "system" fn tls_callback(_: *mut i8, _: u32, _: *mut i8) {}

// Required by `no_main`.
// This is the direct entry point. Do not let Rust generate CRT entry for this binary.
#[no_mangle]
extern "system" fn main() {}
kkent030315 commented 3 days ago

C/C++ project can make the binaries much smaller (~1KB), here's yet another practical considerations:

// main.cc

#include <Windows.h>

EXTERN_C unsigned int _tls_index{};

static void NTAPI tls_callback(PVOID, DWORD, PVOID) {}

#pragma comment(linker, "/INCLUDE:_tls_used")
#pragma comment(linker, "/INCLUDE:_tls_callback")

#pragma data_seg(".tls")   
int _tls_start = 0;
#pragma const_seg()

#pragma data_seg(".tls$ZZZ")   
int _tls_end = 0;
#pragma const_seg()

#pragma data_seg(".CRT$XLA")   
int __xl_a = 0;
#pragma const_seg()

#pragma data_seg(".CRT$XLZ")   
int __xl_z = 0;
#pragma const_seg()

#pragma const_seg(".CRT$XLB")
EXTERN_C const PIMAGE_TLS_CALLBACK _tls_callback[] = { &tls_callback, 0 };
#pragma const_seg()

EXTERN_C IMAGE_TLS_DIRECTORY _tls_used = { (ULONG64)&_tls_start, (ULONG64)&_tls_end, (ULONG64)&_tls_index, (ULONG64)&_tls_callback, 0, (ULONG32)0 };

int main() { return 0; }

Compiler and linker flags (use Clang/LLD):

<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
    <ClCompile>
      <WarningLevel>Level3</WarningLevel>
      <FunctionLevelLinking>true</FunctionLevelLinking>
      <IntrinsicFunctions>true</IntrinsicFunctions>
      <SDLCheck>false</SDLCheck>
      <PreprocessorDefinitions>NDEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
      <ConformanceMode>true</ConformanceMode>
      <BufferSecurityCheck>false</BufferSecurityCheck>
    </ClCompile>
    <Link>
      <SubSystem>Console</SubSystem>
      <EnableCOMDATFolding>true</EnableCOMDATFolding>
      <OptimizeReferences>true</OptimizeReferences>
      <GenerateDebugInformation>false</GenerateDebugInformation>
      <IgnoreAllDefaultLibraries>true</IgnoreAllDefaultLibraries>
      <EntryPointSymbol>main</EntryPointSymbol>
      <FixedBaseAddress>true</FixedBaseAddress>
      <RandomizedBaseAddress>false</RandomizedBaseAddress>
      <ImageHasSafeExceptionHandlers>true</ImageHasSafeExceptionHandlers>
      <AdditionalOptions>/MERGE:.data=.text /MERGE:.CRT=.text /MERGE:.tls=.text %(AdditionalOptions)</AdditionalOptions>
    </Link>
  </ItemDefinitionGroup>
</Project>
kkent030315 commented 3 days ago

@m4b Thank you very much for awesome review! All chenge requests addressed. Would you able to take a look for my solution?

can we lower the size of some of the binaries? 110KB feels excessive for a simple repro test is it possible to rewrite some of the binaries as simple C files, compile them, and then if necessary manipulate them so they exhibit the behavior required for testing?

Surely I did! Each binaries now has size of less than 1KB.

alternatively, if only a small subsection is being tested, perhaps we can just do &'static [u8] as an inline byte sequence we expect to parse/not parse correctly, and so unit test just that section of PE you're interested in testing?

Unfortunately, it is not feasible for this TLS tests. Since TLS takes virtual addresses, not the relative virtual addresses, it takes dependencies to the PE header not only for sections.

Maybe somehow we can spread out the impls as possible as it can and feed them small pieces of byte slices, if even 1KB is not ideal.

On the matter of testing in goblin, there was some talk a long time ago to add an external binary suite so that they wouldn't have to be checked in, but i'm not sure how feasible that is, etc., and what it looks like in practice.

I thought this as well. lief-project/LIEF do the similar thing and we definitely should have it.

I guess the best way to do that is to have an external repository (something like m4b/goblin-test-bins) and put the binaries there, then take that repository as a submodule in main repository.

That is absolutely worth try out, as well as some third-party multi-binary repositories such as iosifache/DikeDataset to stress tests. I am doing that way on my local before submitting new significant features like #431/#432 and I had good experience doing that way, found multiple hidden bug sometimes caused by human errors.