rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

On Windows x86, the symbol names generated by `raw-dylib+undecorated` are not as expected #124958

Open mingkuang-Chuyu opened 4 months ago

mingkuang-Chuyu commented 4 months ago

I tried this code:

use core::ffi::c_void;

#[link(name = "api-ms-win-core-synch-l1-2-0", kind = "raw-dylib", import_name_type = "undecorated")]
extern "system" {
    pub fn WakeByAddressSingle(address: *const c_void);
}

On Windows x86, the generated symbol name should be __imp__WakeByAddressSingle@4, but the symbol name Rust generates is __imp_WakeByAddressSingle.

When generating the lib, the symbol name should remain __imp__WakeByAddressSingle@4; only the IMPORT_OBJECT_HEADER::NameType field needs to be set to IMPORT_NAME_UNDECORATE. The linker will then automatically convert the name to WakeByAddressSingle.

// Windows SDK(winnt.h)

typedef struct IMPORT_OBJECT_HEADER {
    WORD    Sig1;                       // Must be IMAGE_FILE_MACHINE_UNKNOWN
    WORD    Sig2;                       // Must be IMPORT_OBJECT_HDR_SIG2.
    WORD    Version;
    WORD    Machine;
    DWORD   TimeDateStamp;              // Time/date stamp
    DWORD   SizeOfData;                 // particularly useful for incremental links

    union {
        WORD    Ordinal;                // if grf & IMPORT_OBJECT_ORDINAL
        WORD    Hint;
    } DUMMYUNIONNAME;

    WORD    Type : 2;                   // IMPORT_TYPE
    WORD    NameType : 3;               // IMPORT_NAME_TYPE
    WORD    Reserved : 11;              // Reserved. Must be zero.
} IMPORT_OBJECT_HEADER;
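
As a rough illustration of the conversion described above, the sketch below applies the name-type rules as the PE/COFF specification describes them: NOPREFIX skips one leading `?`, `@`, or `_`, and UNDECORATE additionally truncates at the first `@`. The enum and function names are invented for this example and are not part of rustc, LLVM, or the Windows SDK.

```rust
/// Name types of a short import object (illustrative subset, names are hypothetical).
#[derive(Clone, Copy)]
enum ImportNameType {
    /// IMPORT_NAME: the import name is the public symbol name, unchanged.
    Name,
    /// IMPORT_NAME_NOPREFIX: skip one leading '?', '@', or '_'.
    NoPrefix,
    /// IMPORT_NAME_UNDECORATE: as NoPrefix, then truncate at the first '@'.
    Undecorate,
}

/// Skip a single leading '?', '@', or '_' if present.
fn skip_prefix(symbol: &str) -> &str {
    symbol
        .strip_prefix(|c: char| matches!(c, '?' | '@' | '_'))
        .unwrap_or(symbol)
}

/// Derive the name looked up in the DLL's export table from the public symbol name.
fn import_name(symbol: &str, name_type: ImportNameType) -> &str {
    match name_type {
        ImportNameType::Name => symbol,
        ImportNameType::NoPrefix => skip_prefix(symbol),
        ImportNameType::Undecorate => {
            let stripped = skip_prefix(symbol);
            // Everything from the first '@' onwards is dropped.
            stripped.split('@').next().unwrap_or(stripped)
        }
    }
}

fn main() {
    let symbol = "_WakeByAddressSingle@4";
    for (label, ty) in [
        ("NAME", ImportNameType::Name),
        ("NOPREFIX", ImportNameType::NoPrefix),
        ("UNDECORATE", ImportNameType::Undecorate),
    ] {
        println!("{label}: {symbol} -> {}", import_name(symbol, ty));
    }
}
```

Under these rules the link-time symbol keeps its full decoration, and only the name written into the import table is shortened.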

bjorn3 commented 4 months ago

Why are you using undecorated if you need the @4? import_name_type = "undecorated" tells rustc to not generate any prefixes or suffixes to the symbol. If you want them import_name_type = "decorated" (the default) is what you should use afaik.

mingkuang-Chuyu commented 4 months ago

> Why are you using undecorated if you need the @4? import_name_type = "undecorated" tells rustc to not generate any prefixes or suffixes to the symbol. If you want them import_name_type = "decorated" (the default) is what you should use afaik.

Normally, the import_name_type attribute should not change the symbol name.

You can check the "synchronization.lib" file in the Windows SDK. The information about the WakeByAddressSingle function is as follows:

dumpbin /HEADERS "C:\Program Files (x86)\Windows Kits\10\Lib\10.0.22621.0\um\x86\synchronization.lib"

image

Then we look at the disassembly of code that uses WakeByAddressSingle, and we can see that the symbol name is still __imp__WakeByAddressSingle@4.

image

Rust's behavior is not standard, and there are security issues.

Suppose another lib file happens to contain a function whose symbol name matches. Because of Rust's non-standard behavior, the two functions will have exactly the same symbol name. The linker then cannot distinguish between them and will use only the first symbol it finds.

In the following code, will rust use the _Test function or the Test function? Why is this so? This is because rust's undecorated doesn't follow the C decorated name rule!

// Rust generates the symbol __imp__Test
#[link(name = "DLL A", kind = "raw-dylib", import_name_type = "undecorated")]
extern "system" {
    // extern "C" void  __stdcall _Test();
    pub fn _Test();
}

// DLL A
#pragma comment(linker, "/export:_Test=__Test@0")
extern "C" void  __stdcall _Test()
{
    // in standard, symbol name is `__imp___Test@0` and `__Test@0`
    // in rust(undecorated), symbol name is  `__imp__Test` and `_Test`
}

// DLL B
extern "C" void __cdecl Test()
{
   // in standard, symbol name is `__imp__Test` and  `_Test`
   // Note that the name of the symbol is exactly the same as in the previous rust(undecorated) scene !!!
}
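
The collision described in the comments above can be checked mechanically. The sketch below is only a model: it hard-codes the symbol names quoted in this comment (not names obtained from a real toolchain) and counts how often each linker-visible symbol is defined. The `Import` type and the helper functions are hypothetical.

```rust
use std::collections::HashMap;

/// One import-library entry: the symbol the linker sees and the DLL it resolves to.
struct Import {
    symbol: &'static str,
    dll: &'static str,
}

/// Return, in sorted order, every symbol defined by more than one entry.
fn colliding_symbols(imports: &[Import]) -> Vec<&'static str> {
    let mut count: HashMap<&'static str, usize> = HashMap::new();
    for import in imports {
        *count.entry(import.symbol).or_insert(0) += 1;
    }
    let mut dups: Vec<&'static str> = count
        .into_iter()
        .filter(|&(_, n)| n > 1)
        .map(|(symbol, _)| symbol)
        .collect();
    dups.sort();
    dups
}

/// Symbols as an MSVC-style import library would define them (per the comments above).
fn msvc_style() -> Vec<Import> {
    vec![
        Import { symbol: "__imp___Test@0", dll: "DLL A" }, // __stdcall _Test
        Import { symbol: "__imp__Test", dll: "DLL B" },    // __cdecl Test
    ]
}

/// Symbols when DLL A is imported through raw-dylib + undecorated, as reported in this issue.
fn rust_undecorated_as_reported() -> Vec<Import> {
    vec![
        Import { symbol: "__imp__Test", dll: "DLL A" }, // undecorated _Test
        Import { symbol: "__imp__Test", dll: "DLL B" }, // __cdecl Test
    ]
}

fn main() {
    for (label, imports) in [
        ("MSVC-style", msvc_style()),
        ("raw-dylib + undecorated", rust_undecorated_as_reported()),
    ] {
        let dlls: Vec<&str> = imports.iter().map(|i| i.dll).collect();
        println!("{label}: DLLs {:?}, collisions {:?}", dlls, colliding_symbols(&imports));
    }
}
```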

ChrisDenton commented 4 months ago

cc @dpaoliello for this and https://github.com/rust-lang/rust/issues/124956

dpaoliello commented 4 months ago

> there are security issues

Can you please elaborate on this? If you are linking against a malicious import library or loading a malicious DLL, then name confusion doesn't provide any benefits to the attacker: they already have arbitrary read/write/execute and can use many other tricks to force their code to be executed.

> When generating the lib, the symbol name should remain __imp__WakeByAddressSingle@4; only the IMPORT_OBJECT_HEADER::NameType field needs to be set to IMPORT_NAME_UNDECORATE. The linker will then automatically convert the name to WakeByAddressSingle.

This is a fair criticism, but we'd need to make sure that using NameType produces the same behavior that Rust currently has (i.e., that we can control exactly which function name is loaded at runtime) especially in cases where GCC and MSVC disagree (see the import_name_type MCP). I wouldn't want to change this now without an MCP and a motivating example where the Rust compiler has incorrect behavior.

> Rust's behavior is not standard
>
> This is because rust's undecorated doesn't follow the C decorated name rule!

Can you please point me to documentation for this "standard" or "rule"?

Also, the intent of import_name_type is to opt-out of normal function name decoration and instead to use the modified name decoration that is documented in Rust's docs: https://doc.rust-lang.org/reference/items/external-blocks.html#the-import_name_type-key.

> In the following code, will rust use the _Test function or the Test function? Why is this so?

It calls Test in DLL A.dll because you asked it to call Test without any decorations.

You have shown that Rust produces different import headers than MSVC or Clang does, but you haven't explained why this is an issue. Please remember that the goal of the raw-dylib feature in Rust is that the final binary will load a specific function from a specific DLL without the developer having to provide an import library to the linker. This is a different goal than MSVC and Clang have for their import libraries.

If you can show an example where Rust calls a function that does not match the name per the import_name_type spec in the Rust docs (conflicting symbol definitions? linker depending on SymbolName being set?) then that would be a bug that we could address.

mingkuang-Chuyu commented 4 months ago

@dpaoliello MS decorated-names doc is here https://learn.microsoft.com/cpp/build/reference/decorated-names?view=msvc-170

From the documentation, we can see that raw-dylib+undecorated's symbol name rules are flawed.

For example, how should the following two functions be distinguished in Rust?

extern "C" void  __stdcall _Test()
{
}

extern "C" void __cdecl Test()
{
}
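
For reference, the x86 decoration rules for C functions from the Microsoft document linked above can be sketched as follows. This is a simplified model covering only C (not C++) names; `CallConv`, `decorate_x86`, and `import_pointer_symbol` are hypothetical names, and `arg_bytes` is the total size of the parameters in bytes.

```rust
/// x86 calling conventions that affect C name decoration.
#[derive(Clone, Copy)]
enum CallConv {
    Cdecl,
    Stdcall,
    Fastcall,
    Vectorcall,
}

/// Decorate a C function name for 32-bit x86.
/// `arg_bytes` is ignored for __cdecl, which carries no size suffix.
fn decorate_x86(name: &str, conv: CallConv, arg_bytes: u32) -> String {
    match conv {
        CallConv::Cdecl => format!("_{name}"),
        CallConv::Stdcall => format!("_{name}@{arg_bytes}"),
        CallConv::Fastcall => format!("@{name}@{arg_bytes}"),
        CallConv::Vectorcall => format!("{name}@@{arg_bytes}"),
    }
}

/// The import-pointer symbol that calls through an import library reference.
fn import_pointer_symbol(decorated: &str) -> String {
    format!("__imp_{decorated}")
}

fn main() {
    // The two functions from the question above.
    let a = decorate_x86("_Test", CallConv::Stdcall, 0);
    let b = decorate_x86("Test", CallConv::Cdecl, 0);
    println!("__stdcall _Test -> {a} ({})", import_pointer_symbol(&a));
    println!("__cdecl   Test  -> {b} ({})", import_pointer_symbol(&b));
}
```

With full decoration the two functions end up with different symbols; the ambiguity only appears once the decoration is dropped from the link-time symbol.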

dpaoliello commented 4 months ago

> @dpaoliello MS decorated-names doc is here https://learn.microsoft.com/cpp/build/reference/decorated-names?view=msvc-170
>
> From the documentation, we can see that raw-dylib+undecorated's symbol name rules are flawed.

That document describes how to decorate names, but if you use undecorated then you are explicitly asking to NOT decorate.

We modeled the feature on the "name type" part of the PE spec, although we implemented this in Rust instead of relying on LLVM (something that I'd be willing to change, although with caution).

> For example, how should the following two functions be distinguished in Rust?
>
> extern "C" void  __stdcall _Test()
> {
> }
>
> extern "C" void __cdecl Test()
> {
> }

I think you are misunderstanding the purpose of this feature. It is not intended to be used all the time for all imports; it is very specifically designed to import functions where the exported name isn't decorated as Rust would normally expect.

So, if you can tell me how those two functions are exported, then I can tell you what #[link] attribute to use.

Examples like this and the original one in your issue are not useful - you need to show where this feature does not meet the behavior documented in Rust's docs, not where it doesn't meet your expectations. If you believe the documented behavior is incorrect, please file an MCP with the behavior that you'd recommend.

jieyouxu commented 4 months ago

Triage: changing this to C-discussion for now, if the behavior is indeed a problem then feel free to change it back to C-bug.

ChrisDenton commented 1 week ago

Given the issue linked directly above it seems like this is an issue, at least for Firefox.

dpaoliello commented 1 week ago

> Given the issue linked directly above it seems like this is an issue, at least for Firefox.

Working on a fix...

bjorn3 commented 1 week ago

If we're at it, would it make sense to mangle the symbol name using the regular rust symbol mangling too? We do this for wasm imports too (https://github.com/rust-lang/rust/blob/749f80ab051aa0b3724b464130440b0e70a975ac/compiler/rustc_symbol_mangling/src/lib.rs#L191-L204) and it would ensure that if you import the same symbol from two different dylibs, each call will pick the right dylib to call it from rather than picking whichever import library came first in the linker order. Or if you both define a symbol locally and import it you would still be able to call the imported one (https://github.com/rust-lang/rust/issues/113050).
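
To make the linker-order problem concrete, here is a toy model, not rustc's actual symbol mangling: a resolver that takes the first import library defining a symbol, compared with a naming scheme that folds the DLL name into the link-time symbol. `ImportLib`, `resolve`, `plain_symbol`, `unique_symbol`, and the `dll::symbol` format are all assumptions made up for this sketch.

```rust
/// A toy import library: the DLL it targets and the symbols it defines.
struct ImportLib {
    dll: &'static str,
    symbols: Vec<String>,
}

/// Resolve a symbol by scanning libraries in link order; the first match wins.
fn resolve(libs: &[ImportLib], symbol: &str) -> Option<&'static str> {
    libs.iter()
        .find(|lib| lib.symbols.iter().any(|s| s == symbol))
        .map(|lib| lib.dll)
}

/// Plain scheme: the link-time symbol depends only on the function name.
fn plain_symbol(_dll: &str, name: &str) -> String {
    format!("__imp_{name}")
}

/// Hypothetical scheme: the DLL name is part of the link-time symbol.
fn unique_symbol(dll: &str, name: &str) -> String {
    format!("{dll}::__imp_{name}")
}

/// Build import libraries for a.dll and b.dll that both provide `foo`.
fn make_libs(namer: fn(&str, &str) -> String) -> Vec<ImportLib> {
    ["a.dll", "b.dll"]
        .iter()
        .map(|&dll| ImportLib { dll, symbols: vec![namer(dll, "foo")] })
        .collect()
}

fn main() {
    let plain = make_libs(plain_symbol);
    // A call meant for b.dll still resolves to whichever library comes first.
    println!("plain:  {:?}", resolve(&plain, &plain_symbol("b.dll", "foo")));

    let unique = make_libs(unique_symbol);
    // With per-DLL symbols, each call finds its own library.
    println!("unique: {:?}", resolve(&unique, &unique_symbol("b.dll", "foo")));
}
```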