m4b / goblin

An impish, cross-platform binary parsing crate, written in Rust
MIT License
1.19k stars 160 forks source link

PE: parse rich header and refactor DOS stub parser #406

Closed kkent030315 closed 3 weeks ago

kkent030315 commented 7 months ago

This PR adds parsing of Rich headers, as someone opened issue #400.

I took the constant bytes in the test code from mthiesen/link-patcher (MIT). If this can potentially be license incompliance, I am happy to make own specimens for testing.

kkent030315 commented 7 months ago

I'm a bit tired, will fix the CI at the rest of today or tomorrow.

m4b commented 6 months ago

@kkent030315 gentle ping; would definitely like to see this go in, but I had requested some changes :)

kkent030315 commented 1 month ago

@m4b Hey, sorry for being shadow. I had do deal with the my life, you know.. anyways, a little bit changes have been applied to fix the code indeed, and the tests cargo test are passed. Would you like to checking this out?

The current codebase in this PR is what I am used/using as an backend parser helper for my own binary rewriting framework that can partiolize the any PE64 binary into object-by-object (aka struct-by-struct, byte-by-byte etc) and this rich header code works perfectly with the 1000+ unique PE64 binaries over the world (regardless of compiler toolchains, including but not limited to MSVC(link), Clang(LLD), MinGW-GCC, NOTE: Only MSVC linker inserts rich header) as well, so it's quite stable for now I believe, while there are something to deal with the coding style you pointed though.

image

I'm going to deal with another my PRs ASAP. I'd like to contribute more on goblin, as theres too many things to get rid of, and some insufficient features.

m4b commented 1 month ago

I'm sorry I knew there was another PR I needed to check before releasing 9.0, I will give this a review, but probably won't be till weekend. Please feel free to ping me then if you haven't gotten any feedback. And thanks for your contributions! :)

m4b commented 1 month ago

Also I hope everything in your life is good; don't feel need to push yourself either, your life comes first :)

kkent030315 commented 4 weeks ago

@m4b Thanks for taking attention on this! I think theres lots of things to discuess here. But I think breaking changes are inevitable for this PR, perhaps how about introducing this feature on 0.10?

kkent030315 commented 3 weeks ago

@m4b Thank you very much for the review! I think it's almost perfect now. Would you able to take a look into the fixes?

kkent030315 commented 3 weeks ago

Maybe something to discuss here https://github.com/m4b/goblin/pull/406#discussion_r1817859854:

We can explicitly isolate DOS stub and Rich header datas, with some mroe complex implementations, but it makes no sense for me to do that.

I can definitely do that work if we should.

kkent030315 commented 3 weeks ago

Self review is done. Looks perfect for me right now.

m4b commented 3 weeks ago

@kkent030315 apologies I wanted to merge the parse_without_dos patch since it was before this, and was waiting a while; so you'll have to rebase, and add a rich_header parameter to the _impl function and pass a None for parse_without_dos is my guess; in meantime I'll do a final review but this looks ready to go just cursorily glancing at it

I appreciate your patience!

m4b commented 3 weeks ago

So I just tried doing this rebase, and it's very annoying/tedious due to all the individual commits; i suggest squashing your patch down to a single commit first, and then rebasing on master to make it easiest.

kkent030315 commented 3 weeks ago

How about now? I resolved all conflicts wtih the upstream and should be able to merge with squash without problems.

kkent030315 commented 3 weeks ago

apologies I wanted to merge the parse_without_dos patch since it was before this, and was waiting a while; so you'll have to rebase, and add a rich_header parameter to the _impl function and pass a None for parse_without_dos is my guess; in meantime I'll do a final review but this looks ready to go just cursorily glancing at it

As of current implementation of rich header parser is always individual regardless of DOS stub, since it takes hardcoded offset of end of DOS stub, so rich header parser still works even if DOS stub is default (no rich header) by Header::parse_without_dos, as I added test for this situation:

    #[test]
    fn parse_without_dos() {
        let header = Header::parse_without_dos(&BORLAND_PE32_VALID_NO_RICH_HEADER).unwrap();
        assert_eq!(header.dos_stub, DosStub::default());
        assert_eq!(header.rich_header.is_none(), true);

        // DOS stub is default but rich parser still works
        let header = Header::parse_without_dos(&CORRECT_RICH_HEADER).unwrap();
        assert_eq!(header.dos_stub, DosStub::default());
        assert_eq!(header.rich_header.is_some(), true);
    }

I don't know if we should not parse rich header when Header::parse_without_dos or we should. Do you have any idea?

m4b commented 3 weeks ago

@ideeockus do you have any opinion on the rich header question?

m4b commented 3 weeks ago

this was an epic PR @kkent030315 thanks for your patience!