pkgw / elfx86exts

Decode binaries and print out which instruction set extensions they use. This program's name is a lie: it supports not just x86/ELF but also ARM64, MachO, and possibly more.
MIT License

Add Arm64 support #167

Closed by jasonmccampbell 11 months ago

jasonmccampbell commented 1 year ago

Hi,

This utility has turned out to be quite useful for quickly figuring out why a given binary doesn't run on a particular processor. However, we've also run into the same issue with Arm processors. Since the tool is based on Capstone, nothing in particular limits it to x86, so I opened PR #166. Is this of interest? If so, I'll clean up the PR for a true submission.

Jason

pkgw commented 1 year ago

Thanks for reaching out! This program is basically in low-level maintenance mode for me, but if it would be helpful to you for it to include this functionality, I'm happy to merge something.

Looking at the draft PR, the top-line thing that I notice is that in my view it would be nice if the program could parse both x64 and arm64 binaries regardless of what target platform it's being built for — my impression based on a quick scan was that you have the different modes gated based on the build target platform. But the ability to disassemble and analyze different kinds of binaries shouldn't (have to) depend on what platform you're actually running on, and it seems like the flexibility could come in handy. But if that proves to be tricky to implement, or if it would make the resulting binary a lot heavier, I'm fine with ditching that idea.

Thanks again for your interest!

Peter

jasonmccampbell commented 1 year ago

Hey Peter,

Thanks for the quick response. Yes, if you are open to it, I'll clean this PR up and get it closer to being ready to merge.

I agree handling multiple architectures in a single build is a nicer experience. I'll add a cmdline switch to select between them. I also need to see if I can reconcile the instruction set features vs. the ones reported in std_detect. I haven't looked at whether this is possible or not.

Jason

pkgw commented 1 year ago

Is a command-line switch even necessary? In my ideal world, the program would just do its best with whatever file you give it — choose how to report based on whether it's x86-64 or arm64, and error out if it's anything else.
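For illustration, here is a minimal sketch of that "do its best with whatever file you give it" behaviour using only the standard library. It is not code from the PR: the names `Arch` and `detect_arch` are made up for this example, and it only looks at plain ELF files and thin 64-bit little-endian Mach-O files (no fat/universal binaries, no PE). The numeric constants are the standard `e_machine` and `cputype` values.

```rust
/// Architectures this sketch knows how to report on (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Arch {
    X86_64,
    Arm64,
}

// ELF e_machine values.
const EM_X86_64: u16 = 62;
const EM_AARCH64: u16 = 183;

// Mach-O cputype values (base CPU type with the 64-bit ABI flag set).
const CPU_TYPE_X86_64: u32 = 0x0100_0007;
const CPU_TYPE_ARM64: u32 = 0x0100_000C;

/// Guess the architecture from the leading bytes of a file,
/// returning an error for anything other than x86-64 or arm64.
pub fn detect_arch(data: &[u8]) -> Result<Arch, String> {
    if data.len() >= 20 && data.starts_with(&[0x7f, b'E', b'L', b'F']) {
        // e_machine is a u16 at offset 18; EI_DATA (offset 5) gives its byte order.
        let raw = [data[18], data[19]];
        let machine = match data[5] {
            1 => u16::from_le_bytes(raw),
            2 => u16::from_be_bytes(raw),
            other => return Err(format!("unknown ELF byte order {other}")),
        };
        return match machine {
            EM_X86_64 => Ok(Arch::X86_64),
            EM_AARCH64 => Ok(Arch::Arm64),
            other => Err(format!("unsupported ELF machine {other}")),
        };
    }
    if data.len() >= 8 && data.starts_with(&[0xcf, 0xfa, 0xed, 0xfe]) {
        // Thin 64-bit little-endian Mach-O: cputype is a u32 at offset 4.
        let cputype = u32::from_le_bytes([data[4], data[5], data[6], data[7]]);
        return match cputype {
            CPU_TYPE_X86_64 => Ok(Arch::X86_64),
            CPU_TYPE_ARM64 => Ok(Arch::Arm64),
            other => Err(format!("unsupported Mach-O cputype {other:#x}")),
        };
    }
    Err("not a recognized ELF or 64-bit Mach-O file".to_string())
}
```

The result of `detect_arch` could then be used to pick the disassembler mode before any instructions are decoded. A real implementation would more likely lean on an existing object-file parsing library than on hand-rolled offsets.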

jasonmccampbell commented 1 year ago

Yes and no? I didn't see any way to do it using Capstone as new_raw always requires the instruction set to be specified up-front. On a quick look, I didn't see another option. read-elf could, though it might break Mach-O.

I updated the PR with a command-line switch so all of the platform-specific bits are out except for the default architecture.
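Roughly, the shape of that change can be sketched as follows. This is an illustration rather than the PR's code: the `Arch` type, the accepted spellings, and the helper names `default_arch` and `select_arch` are all assumptions. The point is that the build target only influences the default, while both architectures stay selectable at run time.

```rust
use std::str::FromStr;

/// Architectures selectable at run time (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Arch {
    X86_64,
    Arm64,
}

impl FromStr for Arch {
    type Err = String;

    /// Parse the value of a command-line switch; the spellings are assumptions.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "x86_64" | "x86-64" | "x64" | "amd64" => Ok(Arch::X86_64),
            "arm64" | "aarch64" => Ok(Arch::Arm64),
            other => Err(format!("unknown architecture '{other}'")),
        }
    }
}

/// The only platform-specific bit: the default follows the build target.
pub fn default_arch() -> Arch {
    if cfg!(target_arch = "aarch64") {
        Arch::Arm64
    } else {
        Arch::X86_64
    }
}

/// Use the switch value if one was given, otherwise fall back to the default.
pub fn select_arch(flag: Option<&str>) -> Result<Arch, String> {
    match flag {
        Some(value) => value.parse(),
        None => Ok(default_arch()),
    }
}
```

Using `cfg!` instead of `#[cfg(...)]` keeps both branches compiled on every platform, so an x86-64 build can still analyze arm64 binaries and vice versa.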

pkgw commented 1 year ago

Hmm, OK. Well, I'll take my follow-ups to the PR since I think that's a better place for this kind of implementation discussion.

jasonmccampbell commented 1 year ago

The Arm detection works, but less well than I was hoping. It detects the instruction set groups from Capstone just fine, but the Capstone groups are instruction set extensions and don't cover the various versions of the instruction set. For example, the CASA (compare-and-swap) instruction is part of Armv8.1 and isn't available on older processors. CASA is not tagged with any group, so the utility can't tell whether a binary will run on an older processor. I need to see if there happens to be data around to build a more complete mapping, but I'm not that hopeful that it will be in scope.
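To make the gap concrete, here is a sketch of what a supplementary mnemonic-to-version mapping could look like. It is hypothetical and deliberately tiny: the table is hand-written, covers only a few of the atomic instructions added in Armv8.1, and the names `ArmVersion`, `min_version_for`, and `required_version` are invented for this example. Nothing like this is provided by the Capstone groups themselves.

```rust
/// Baseline architecture versions, ordered oldest to newest (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ArmVersion {
    V8_0,
    V8_1,
}

/// Look up the oldest architecture version that provides an instruction.
/// Anything not in the (incomplete) table is assumed to be Armv8.0.
pub fn min_version_for(mnemonic: &str) -> ArmVersion {
    // A few of the atomic instructions introduced in Armv8.1.
    const V8_1_MNEMONICS: &[&str] = &["cas", "casa", "casal", "casl", "swp", "ldadd"];
    let lowered = mnemonic.to_ascii_lowercase();
    if V8_1_MNEMONICS.contains(&lowered.as_str()) {
        ArmVersion::V8_1
    } else {
        ArmVersion::V8_0
    }
}

/// The version a binary needs is the newest version needed by any instruction.
pub fn required_version<'a>(mnemonics: impl IntoIterator<Item = &'a str>) -> ArmVersion {
    mnemonics
        .into_iter()
        .map(min_version_for)
        .max()
        .unwrap_or(ArmVersion::V8_0)
}
```

With a table like this, a binary containing a single `casa` would be reported as needing Armv8.1 even though no Capstone group flags it. The hard part is the data: a useful table would have to cover every instruction added in every revision.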

jasonmccampbell commented 1 year ago

That's awesome! I was reading through Capstone and am familiar with ReadElf, but hadn't seen object before. Yes, please go ahead and update the PR, that's a great solution. The code is settled for now, as I haven't yet found a good way to determine which instruction set version is required, so I need to dig through Capstone a bit more to see if there is something useful.

pkgw commented 11 months ago

I guess this one can be closed!