Vector35 / binaryninja-api

Public API, examples, documentation and issues for Binary Ninja
https://binary.ninja/
MIT License

Windows kernel type libraries #2913

Closed netadr closed 10 months ago

netadr commented 2 years ago

What is the feature you'd like to have? Typelibs for Windows kernel mode libraries

Additional Information: Having type information for common kernel mode libraries, such as:

would make it easier to analyze drivers that link to these components.

emesare commented 2 years ago

I think this is solved through the upstream source issue https://github.com/microsoft/wdkmetadata/issues/1

plafosse commented 2 years ago

Yes that's the easiest solution. There are other more difficult solutions.

apekros commented 2 years ago

Sad to see this won't be in 3.2. Considering its relevance, this is a much-needed feature, and I feel a lot of people would appreciate it and are waiting on it!

plafosse commented 2 years ago

I 100% agree with this, and we're very sorry it's not going to make the 3.2 release. There were two ways we could have solved this issue: 1) https://github.com/microsoft/wdkmetadata/issues/1: resolving this issue would allow us to rerun our generation script and produce new type libraries. As of this writing, the issue is still open. 2) Extract type libraries from PDBs. The problem here is that the recent PDB changes landed much later than we anticipated, which did not leave us the time needed to generate these.

This issue is still a high priority for us and we'll be focusing on a bunch of type libraries in the next release.

emesare commented 1 year ago

Upstream WDK metadata has been published https://github.com/microsoft/wdkmetadata

plafosse commented 1 year ago

Yup, already have my eye on it. This ticket is currently slated as high priority for the upcoming release.

hugsy commented 1 year ago

FWIW Since the fantastic new header import GUI I've had great success importing the SDK/DDK from official (MS) and unofficial (ProcessHacker NT Headers) sources. Something like:

[image: screenshot of the header import dialog, not preserved]

IMO this approach is way better than typelibs because it allows us to control through the macros the DDI version we wanna use. Only downside I have currently is the (crazy) long analysis time, but it's a one-time thing (2-time actually, once on load, once after importing the headers but still).

So great work on this to the team, I finally don't need IDA any longer for my own project (and am pushing at work to make the switch) ❤️ 🥷

psifertex commented 1 year ago

You might also consider using "analysis hold" to not do that initial analysis and only let the full analysis continue after you've imported the type lib. That might actually improve the overall time for that workflow? Just use open with options and select analysis hold.

Kharos102 commented 1 year ago

Adding to this that the header import currently has issues with padding.

For example, running the command by hugsy above results in the definition of the IO_STACK_LOCATION structure with padding between the first two fields OutputBufferLength and InputBufferLength, when there actually isn't supposed to be any padding here.

It also says the first field OutputBufferLength is at offset 0x8, when it should be 0x4.

CouleeApps commented 1 year ago

> It also says the first field OutputBufferLength is at offset 0x8, when it should be 0x4.

Looking at the header I have (wdm.h from 10.0.19041.0), I think those offsets are actually correct. Vergilius has it at that offset too, so I think the parser is accurate. Best guess is that the other structures inside the union have alignment 8 (they start with pointers), so the whole union gets aligned to 8 bytes. And unless this specific member has special casing, I'm inclined to believe it is correct as-parsed.

Other best guess is that you were expecting x86 (not x64) alignment, in which case there is no padding and it is at 0x4. If you want x86 parsing of the header, you can swap -D_AMD64_ for -D_X86_ and then you get the behavior you mentioned. Or you've opened an x64 binary and are trying to parse the header as x86, in which case you'll need to explicitly set the target in clang via --target=i386-pc-windows-msvc
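
To make the x86 vs x64 difference concrete, here is a minimal Python sketch of the alignment rule described above. It is not clang's or Binary Ninja's layout code: the helper names (`align_up`, `first_union_field_offset`) are made up for illustration, and it assumes the union's strictest member alignment equals the pointer size.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def first_union_field_offset(pointer_size):
    """Offset of the first field inside the union for a given pointer size."""
    # Four UCHAR fields come first: MajorFunction, MinorFunction, Flags, Control.
    offset_after_header = 4 * 1
    # Assumption: some union member starts with a pointer, so the whole
    # union is aligned to the pointer size.
    union_alignment = pointer_size
    return align_up(offset_after_header, union_alignment)


print(hex(first_union_field_offset(8)))  # x64 target
print(hex(first_union_field_offset(4)))  # x86 target
```

With 8-byte pointers the union starts at 0x8, and with 4-byte pointers it starts at 0x4, which matches the two offsets discussed in this thread.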

Kharos102 commented 1 year ago

You're right.

Actually appears the UI threw me off.

This is how it's shown in the HLIL UI: int32_t IOCTL = pIoStackLocation->Parameters.DeviceIoControl.__offset(0x10).d

This is how it's shown in disassembly: mov eax, dword [rcx+0x18 {_IO_STACK_LOCATION_IOCTL::Parameters.DeviceIoControl.IoControlCode}]

Somehow, the disassembly is much clearer than HLIL about what is being accessed (IoControlCode).

In fact, I read HLIL as saying offset 0x10 from the start of DeviceIoControl is being accessed, but it's not, it's offset 0x18.

The only way I can read 0x10 as being a valid offset, is if I start at the end of the struct (which is offset 0x28) and subtract 0x10 to land at the field IoControlCode.

Am I reading this wrong?

Note: For display purposes I change the IO_STACK_LOCATION type to have the DeviceIoControl union entry at the start, e.g.:

struct _IO_STACK_LOCATION_IOCTL
{
    UCHAR MajorFunction;
    UCHAR MinorFunction;
    UCHAR Flags;
    UCHAR Control;
    union
    {
        struct
        {
            ULONG OutputBufferLength;
            ULONG InputBufferLength;
            ULONG IoControlCode;
            PVOID Type3InputBuffer;
        } DeviceIoControl;

CouleeApps commented 1 year ago

My guess is a lot of the confusion comes from the lack of union type support (see #1013). That is a known limitation which we plan to address eventually, but it has proven significantly more difficult than first expected. You can probably work around it by splitting up the giant union type and using each of the individual sub-members within, but it is not a clean solution and is certainly not as smooth as it should be.

As for your assumption, it's probably giving you the offset from the start of the inner union, and just having difficulty looking up the name of which member specifically that pertains to (IoControlCode is 0x10 from the start of DeviceIoControl, so __offset(0x10) seems to suggest that).
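
The offset arithmetic can be checked with a small Python sketch. This is only an illustration, not how Binary Ninja computes offsets: the `layout` helper is hypothetical, and the per-field alignments are an assumption, namely that InputBufferLength and IoControlCode are pointer-aligned on x64, which is what the 0x10 and 0x18 offsets quoted in this thread imply.

```python
POINTER_SIZE = 8  # x64 target


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(fields):
    """Return {name: offset} for (name, size, alignment) tuples laid out in order."""
    offsets = {}
    offset = 0
    for name, size, alignment in fields:
        offset = align_up(offset, alignment)
        offsets[name] = offset
        offset += size
    return offsets


# Assumption: the two ULONG fields after OutputBufferLength are pointer-aligned.
device_io_control = [
    ("OutputBufferLength", 4, 4),
    ("InputBufferLength", 4, POINTER_SIZE),
    ("IoControlCode", 4, POINTER_SIZE),
    ("Type3InputBuffer", POINTER_SIZE, POINTER_SIZE),
]

UNION_OFFSET = 0x8  # where the union starts inside the struct on x64

inner = layout(device_io_control)
absolute = {name: UNION_OFFSET + off for name, off in inner.items()}

print(hex(inner["IoControlCode"]))     # relative to DeviceIoControl (the HLIL view)
print(hex(absolute["IoControlCode"]))  # relative to the struct start (the disassembly view)
```

Under these assumptions, 0x10 (relative to DeviceIoControl) and 0x18 (relative to the start of the struct) refer to the same field, since the union itself begins at 0x8.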

Kharos102 commented 1 year ago

Ah also correct, it is 0x10 from the start of it.

The union support thing is an interesting note, since it appears the disassembly has no problems resolving the correct name _IO_STACK_LOCATION_IOCTL::Parameters.DeviceIoControl.IoControlCode.

Otherwise yes, I imagine the eventual union support work would cover this. Just wanted to note the information is actually there already and displayed in other views.

CouleeApps commented 1 year ago

That's not too surprising, given that each level of IL has a different way of rendering member access lookups (MLIL and HLIL actually use references to the type directly, vs LLIL and disasm which don't know about types and just annotate guesses by offset). You may get good mileage out of having a disasm pane open at the same time if it is able to resolve in a way that you can use.

plafosse commented 10 months ago

Added in 3.6.4708