arandomdev / DyldExtractor

Extract Binaries from Apple's Dyld Shared Cache
MIT License

[dyldex] Add --lookup to quickly find the image an address lives in #43

Closed danzimm closed 2 years ago

danzimm commented 2 years ago

Following on #42 I wanted to optimize the "find the image this address lives in" workflow further. This PR introduces a new flag --lookup which allows the user to supply a hex address and the tool will print out the image it should live in (assuming the images are contiguous in memory, which I think is the case). Example usage:

> dyldex -b dyld_shared_cache_arm64e --look 0x18008e9f8
libdispatch.dylib
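
The lookup described above can be sketched as a binary search over image start addresses. This is only an illustration under the stated contiguity assumption: the image list, the start addresses, and the `lookup_image` helper are all made up here and are not taken from the PR's code.

```python
from bisect import bisect_right

# Hypothetical (start_address, name) pairs, sorted by start address.
# Real values would come from the image list of the shared cache.
IMAGES = [
    (0x180000000, "libexample_a.dylib"),
    (0x180040000, "libexample_b.dylib"),
    (0x180080000, "libdispatch.dylib"),
]


def lookup_image(addr, images=IMAGES):
    """Return the name of the last image that starts at or below addr.

    Only meaningful if every image occupies one contiguous address range.
    """
    starts = [start for start, _ in images]
    index = bisect_right(starts, addr) - 1
    if index < 0:
        return None  # the address lies below the first image
    return images[index][1]


print(lookup_image(0x18008e9f8))
```

Because the search only compares start addresses, it never checks where an image ends, so an address past the last image still resolves to that image.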

arandomdev commented 2 years ago

Unfortunately, the images are not in contiguous memory. While this should work for addresses in the text section, it'll fail for addresses in other sections. The best solution would be to use the containsAddr method of MachOContext. This would slow things down, but it shouldn't be too noticeable. I'll see if I can do something later.
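
To make the failure mode concrete, here is a minimal sketch of a per-segment containment check. The `Segment` and `Image` classes and their `contains_addr` method are hypothetical stand-ins, not DyldExtractor's actual MachOContext API, and the addresses are invented so that the data segments of two images sit apart from their text segments.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    vmaddr: int
    vmsize: int


@dataclass
class Image:
    name: str
    segments: list

    def contains_addr(self, addr):
        # True if addr falls inside any segment of this image.
        return any(s.vmaddr <= addr < s.vmaddr + s.vmsize for s in self.segments)


def find_image(images, addr):
    """Linear scan: slower than a start-address search, but section-aware."""
    for image in images:
        if image.contains_addr(addr):
            return image.name
    return None


# Invented layout: text segments are adjacent, data segments live elsewhere.
images = [
    Image("a.dylib", [Segment("__TEXT", 0x1000, 0x1000), Segment("__DATA", 0x8000, 0x1000)]),
    Image("b.dylib", [Segment("__TEXT", 0x2000, 0x1000), Segment("__DATA", 0x9000, 0x1000)]),
]

print(find_image(images, 0x1800))  # inside a.dylib's text
print(find_image(images, 0x8800))  # inside a.dylib's data
```

In this layout, a search keyed only on image start addresses would attribute 0x8800 to b.dylib, since b.dylib has the highest start address below it.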

arandomdev commented 2 years ago

Okay, I looked at this in more detail, and I think it would be best to just use your solution. Doing what I suggested would require a bit of refactoring. It might also be good to add type=lambda x: int(x, 0) to the lookup argument; this lets the user write the address in decimal, hex, octal, or binary by including the usual prefix (0x, 0o, 0b).

parser.add_argument(
    "--lookup", type=lambda x: int(x, 0),
    help="Find the library that an address lives in. Prefix the base if needed, for example 0x12345. Only reliable for addresses in the text section."
)
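
As a quick check of what int(x, 0) accepts, the following self-contained sketch parses the same value written in four bases. The parser here is a throwaway one built for the demonstration (the program name is made up), not the tool's real argument parser.

```python
import argparse

# Throwaway parser for demonstration only; "lookup-demo" is a made-up name.
parser = argparse.ArgumentParser(prog="lookup-demo")
parser.add_argument("--lookup", type=lambda x: int(x, 0))

# The same value written as hex, decimal, binary, and octal.
for text in ["0x10", "16", "0b10000", "0o20"]:
    args = parser.parse_args(["--lookup", text])
    print(text, "->", args.lookup)
```

With base 0 the prefix decides the base, so a hex address typed without 0x (for example 18008e9f8) is rejected rather than silently read as decimal.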

I tried to push to your branch but it seems I don't have write access.

danzimm commented 2 years ago

> Unfortunately, the images are not in contiguous memory

Sad day, but I guess this makes sense. This way they can probably deduplicate more data.

> Okay, I looked at this in more detail, and I think it would be best to just use your solution. Doing what I suggested would require a bit of refactoring. It might also be good to add type=lambda x: int(x, 0) to the lookup argument; this lets the user write the address in decimal, hex, octal, or binary by including the usual prefix (0x, 0o, 0b).

I'm down to have this as a first version, but I'd be interested in working on the refactor if you can expand on what you mean.

Eventually I'd also love to be able to ask the tool to look up all the binaries where a certain string lives, too. Would the refactor above make that possible?

> I tried to push to your branch but it seems I don't have write access.

Doh! Is there a way to give you write access automatically? I just pushed my attempt at this change, let me know if it was what you had in mind.

arandomdev commented 2 years ago

FileContext contains both a file object and a mmap object. Right now, when it creates a read-only MachOContext, it opens a new mmap object instead of reusing the same mmap from DyldContext. I haven't actually tested it with the current setup, but I assume opening that many mmaps can't be too good. I think the best course of action would be to provide a method on DyldContext that creates a MachOContext reusing that mmap.
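
One way to picture that refactor is a cache context that owns the only mapping and hands it to each image view it creates. This is a rough sketch under assumptions: `CacheContext`, `MachOView`, and `make_macho_view` are hypothetical names standing in for DyldContext, MachOContext, and the proposed factory method, and a small temporary file stands in for the shared cache.

```python
import mmap
import tempfile


class MachOView:
    """Stand-in for a read-only MachOContext that borrows an existing mapping."""

    def __init__(self, shared_map, offset):
        self.map = shared_map
        self.offset = offset

    def read(self, size):
        # Slice the shared mapping; no new mmap is opened.
        return self.map[self.offset:self.offset + size]


class CacheContext:
    """Stand-in for DyldContext: owns the file object and the single mmap."""

    def __init__(self, file_object):
        self.file = file_object
        self.map = mmap.mmap(file_object.fileno(), 0, access=mmap.ACCESS_READ)

    def make_macho_view(self, offset):
        # Hypothetical factory method: every view reuses self.map.
        return MachOView(self.map, offset)


with tempfile.TemporaryFile() as f:
    f.write(b"HEADERimageAimageB")  # fake cache contents
    f.flush()
    cache = CacheContext(f)
    first = cache.make_macho_view(6)
    second = cache.make_macho_view(12)
    shared = first.map is second.map
    data = (first.read(6), second.read(6))
    cache.map.close()

print(shared, data)
```

However many views are created, only one mmap exists, which is the property the suggested method is meant to guarantee.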

arandomdev commented 2 years ago

Sorry, I really messed up with git. Your changes are merged.