m4b / goblin

An impish, cross-platform binary parsing crate, written in Rust
MIT License
1.17k stars 156 forks source link

Write support for PE binaries #361

Closed RaitoBezarius closed 9 months ago

RaitoBezarius commented 1 year ago

Write support for PE binaries in a basic form

This implements reserialization for many in-memory structure with PE semantics. It does not yet enable you to modify a PE and write it because you will need many modifications to the existing structure to avoid breaking the PE.

But you can already modify existing bytes this way.

I learned the hard way that this code may be insufficient to write the worst PE binaries out there (look the Corkami repositories on pocs and look for the PE there), I deem this to be out of scope for now as it is very hard to get all of this right.

Also, I don't think that identity rewrite should be a goal for anything else than "basic" and "simple" (read: nice) PEs, which can be accomplished in this PR with NixOS PE binaries for UEFI (and I tested a hundred of them).

Testing ways:

cargo run --example rewrite_pe /boot/EFI/Linux/some-uefi.efi /tmp/uefi.efi

As a follow-up, I will work on:

TODO:

Out of scope :

Relevant issues/PRs:

https://github.com/m4b/goblin/issues/277 https://github.com/m4b/goblin/issues/357

RaitoBezarius commented 1 year ago

@m4b you did say you didn't want more dependencies in goblin, are you okay with some of them in examples code? (to showcase how to use external crates for example?)

RaitoBezarius commented 1 year ago

FWIW, with the last commit here, if I run rewrite_pe on the NixOS UKI provided, I have the following result:

written as /tmp/nixos-uki.efi
original PE size: 105224 bytes, new PE size: 105792 bytes, delta (new - original): 568 bytes
original signatures: 1, new signatures: 1
RaitoBezarius commented 1 year ago

The delta being: image

RaitoBezarius commented 1 year ago

Okay, this is now fixed.

RaitoBezarius commented 1 year ago

@m4b What do you expect from me for the testing strategy of this PR? Should I add various PE binaries or just some binaries and test rewriting them or test writing binaries from scratch for some features?

I feel like getting this feature to perfect polish will take some hoops and I am wondering if we can merge a "beta" version and stabilize it further.

m4b commented 1 year ago

@m4b What do you expect from me for the testing strategy of this PR? Should I add various PE binaries or just some binaries and test rewriting them or test writing binaries from scratch for some features?

I feel like getting this feature to perfect polish will take some hoops and I am wondering if we can merge a "beta" version and stabilize it further.

I'm ok with merging a beta version. I think I'd like to do a release before that, this way you can work off of master and hopefully have a full featured version ready for a crates release?

m4b commented 1 year ago

And re binaries and testing this has always been a source of tension. For now I'd recommend checking in u8 slices in test code and not binaries as we have in other places.

m4b commented 1 year ago

@RaitoBezarius friendly ping/triage wondering what work is left to take this about of WIP and getting something ready to merge? :)

RaitoBezarius commented 1 year ago

@RaitoBezarius friendly ping/triage wondering what work is left to take this about of WIP and getting something ready to merge? :)

I decided with @baloo to take this PR for a super crash test in https://github.com/RaitoBezarius/goblin-signing/tree/signed-data where we start signing binaries arbitrarily combining everything. I guess it will be a good test if I can wire this down to our integration testing with UEFI.

m4b commented 1 year ago

awesome sounds good to me and let me know how it goes; lots of good work here, don't want it to get stale :)

Lunarequest commented 1 year ago

is there any update on this? whats is left to be done(other then rebasing)

RaitoBezarius commented 1 year ago

Life has been busy and I didn't want to fuck up this by merging too early, all is left is testing two usecases "end to end":

I feel like there should be like a final push of work, one week or something, but I am on vacations and just after I will be busy under conferences, hopefully, I aim to close this before the end of September.

m4b commented 1 year ago

Great thank you for the update !

m4b commented 1 year ago

And looking forward to seeing this merged if wasn’t clear, thanks @Lunarequest for pinging in status !

RaitoBezarius commented 11 months ago

I'm soon there, I want to rework the commit history, look for an easy basic testing strategy and I want to go for the merge then.

RaitoBezarius commented 11 months ago

Planned follow-ups:

RaitoBezarius commented 11 months ago

Simple way to test:

$ cargo run --example rewrite_pe /boot/EFI/Linux/some-uefi.efi /tmp/uefi.efi
RaitoBezarius commented 11 months ago

@m4b We are ready for review, I advise to go commit per commit. :)

RaitoBezarius commented 11 months ago

@m4b I cannot find a small enough PE binary to create a simple identity rewrite test, as I cannot check one, I don't think it's a good idea to check in 118kb of &[u8] I guess… Let me know if this is a blocker.

RaitoBezarius commented 9 months ago

May I bother you again @m4b ? I am all ready for merge and further improvements.

RaitoBezarius commented 9 months ago

thank you for the example! I had some questions I asked, major reason for request changes is i I don't want strtab bytes pub. I also don't understand how the dos header has all those extra fields added before the pe pointer field; what were we parsing before?

I will remove the pub on the strtab for now as I don't want to make this review longer or too annoying for you, we can always revisit how to serialize COFF elements like the strtab I think.

RaitoBezarius commented 9 months ago

I'm also moving the entrypoint of the PE Pwrite implementation to a further PR, I'm not very happy with it and I can revisit the offset breaking change in DataDirectory this way.

m4b commented 9 months ago

This looks good; looks like a CI failure, but otherwise basically ready to merge; I'm going to go to sleep though :)

m4b commented 9 months ago

@baloo did you want to review? Otherwise I'm going to merge in an hour or so

baloo commented 9 months ago

I've tried to help @RaitoBezarius on this (he still deserves 120% of the credit here! But we've been bouncing off ideas on matrix). The code looks good to me.

RaitoBezarius commented 9 months ago

This is really amazing work @RaitoBezarius thank you so much! And thank you for committing the example too; tbh i'd really prefer not to have dev dependencies introduced, but it's fine I guess. I'll let this sit for a few hours, but otherwise I'm ready to merge, great stuff.

Thank you so much, the next PR #381 is going to be also very good. :-) stderrlog is not necessary per se, but I am not aware of an easier way to make the logs of log appear on stdout easily, we can remove it if you prefer.

So as far as "feature completeness", would the PE pwrite implementation allow, e.g., writing out PE object files for a toy compiler backend, similar to what faerie did?

So right now, it only focuses on PE executables, not PE object files (i.e. COFF), technically, if you had a PE object files library, you could write your COFF files and use this to assemble the final PE executable.

In the future, if someone adds COFF tables support (strings, symbols, etc.), you could probably write PE COFF files by filling the structures yourself, though, for an elegant and nice compiler, I recommend following the methodology in #381 where you have a meta structure PEBuilder or COFFBuilder and you assemble the final PE which you can write out via Scroll.

You need a lot of "layout"-ing when building such things from scratch, which can be a PITA when you want to focus on just giving a high level representation of your binary (here's my sections, this one is executable, attach some certificates here, etc.)

I am not planning to do the COFF thing in the near future because it creates a certain higher level of complexity and all I need is the PE executable stuff, but, I may come back later for this as I wonder if LLVM may someday adopt Rust and this library would probably be handy for a real compiler. :)

RaitoBezarius commented 9 months ago

And @baloo more than helped me! He stayed with my ramblings about PE and we fixed a lot of stuff, if anything, he's definitely a co-author for me :).

RaitoBezarius commented 9 months ago

Whew! What a PR, thank you @m4b for your patience. :)

m4b commented 8 months ago

released in 0.8.0, happy new year!