Open XVilka opened 1 year ago
Hi. I would like to work on this issue. I think I have got an idea on how to resolve this.
Quite often compilers add a special section .ARM.attributes that has that information (note the Tag_CPU_arch_profile and Tag_CPU_arch attributes)
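To make the idea concrete, here is a rough sketch of how the raw bytes of .ARM.attributes could be walked to pull out those two tags. It is not Rizin code: the ArmAttrs struct and the parse_arm_attributes() name are made up for illustration, the layout and tag numbers follow my reading of ARM's ABI addenda, and the 32-bit length fields are assumed to be little-endian (a big-endian ELF would need them read the other way round).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tag numbers as I understand them from the ARM ABI addenda. */
enum { TAG_FILE = 1, TAG_CPU_RAW_NAME = 4, TAG_CPU_NAME = 5,
	TAG_CPU_ARCH = 6, TAG_CPU_ARCH_PROFILE = 7, TAG_COMPATIBILITY = 32 };

/* Hypothetical result type, not part of Rizin. */
typedef struct {
	int arch;    /* Tag_CPU_arch value, -1 if absent */
	int profile; /* Tag_CPU_arch_profile ('A', 'R', 'M', 'S' or 0), -1 if absent */
} ArmAttrs;

/* Decode one ULEB128 value; returns -1 when the buffer ends mid-value. */
static int read_uleb128(const uint8_t *buf, size_t end, size_t *pos, uint64_t *out) {
	uint64_t val = 0;
	unsigned shift = 0;
	while (*pos < end) {
		uint8_t b = buf[(*pos)++];
		if (shift < 64) {
			val |= (uint64_t)(b & 0x7f) << shift;
		}
		shift += 7;
		if (!(b & 0x80)) {
			*out = val;
			return 0;
		}
	}
	return -1;
}

/* Skip a NUL-terminated string; returns -1 if no terminator is found. */
static int skip_ntbs(const uint8_t *buf, size_t end, size_t *pos) {
	const uint8_t *nul = memchr(buf + *pos, 0, end - *pos);
	if (!nul) {
		return -1;
	}
	*pos = (size_t)(nul - buf) + 1;
	return 0;
}

/* Assumption: little-endian length fields. */
static uint32_t read_le32(const uint8_t *p) {
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Walk the section: 'A', then vendor subsections, then tagged sub-subsections. */
int parse_arm_attributes(const uint8_t *buf, size_t len, ArmAttrs *out) {
	out->arch = -1;
	out->profile = -1;
	if (len < 1 || buf[0] != 'A') {
		return -1; /* unknown format version */
	}
	size_t pos = 1;
	while (len - pos >= 4) {
		uint32_t sub_len = read_le32(buf + pos); /* includes the length field itself */
		if (sub_len < 4 || sub_len > len - pos) {
			return -1;
		}
		size_t sub_end = pos + sub_len, p = pos + 4;
		pos = sub_end;
		const char *vendor = (const char *)buf + p;
		if (skip_ntbs(buf, sub_end, &p) < 0) {
			return -1;
		}
		if (strcmp(vendor, "aeabi") != 0) {
			continue; /* only the public "aeabi" subsection is of interest */
		}
		while (sub_end - p >= 5) {
			uint8_t scope = buf[p];
			uint32_t ss_len = read_le32(buf + p + 1); /* includes tag byte and size */
			if (ss_len < 5 || ss_len > sub_end - p) {
				return -1;
			}
			size_t ss_end = p + ss_len, q = p + 5;
			p = ss_end;
			if (scope != TAG_FILE) {
				continue; /* ignore per-section and per-symbol attributes */
			}
			while (q < ss_end) {
				uint64_t tag, val;
				if (read_uleb128(buf, ss_end, &q, &tag) < 0) {
					return -1;
				}
				/* String-valued tags: 4, 5, and odd tags above 32. */
				if (tag == TAG_CPU_RAW_NAME || tag == TAG_CPU_NAME || (tag > 32 && (tag & 1))) {
					if (skip_ntbs(buf, ss_end, &q) < 0) {
						return -1;
					}
					continue;
				}
				if (read_uleb128(buf, ss_end, &q, &val) < 0) {
					return -1;
				}
				if (tag == TAG_COMPATIBILITY && skip_ntbs(buf, ss_end, &q) < 0) {
					return -1; /* Tag_compatibility carries a number and a string */
				}
				if (tag == TAG_CPU_ARCH) {
					out->arch = (int)val;
				} else if (tag == TAG_CPU_ARCH_PROFILE) {
					out->profile = (int)val;
				}
			}
		}
	}
	return 0;
}
```

For a typical Cortex-M4 build I would expect this to report profile 'M'. A real implementation would take the section bytes and the endianness from the ELF parser instead of assuming them.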
Hi. Just to be clear, is our intention simply to recognize the CPU profile (e.g. A, M, R) or the specific processor family (e.g. Cortex, Neoverse) that the ELF is expected to run on?
Based on what I have understood after reading through ARM's addenda to their ABI and this Wikipedia page on the list of ARM processors, the "M" profile implies the Cortex-M processor family or a similar family (like SecurCore) which shares the same features.
However, the "A" cpu profile could imply the cortex-a family or the neoverse family.
I noticed the following struct in librz/asm/p/asm_arm_cs.c:
RzAsmPlugin rz_asm_plugin_arm_cs = {
	.name = "arm",
	.desc = "Capstone ARM disassembler",
	.cpus = "v8,cortexm,arm1176,cortexA72,cortexA8",
	.platforms = "bcm2835,omap3430",
	.features = "v8",
	.license = "BSD",
	.arch = "arm",
	.bits = 16 | 32 | 64,
	.endian = RZ_SYS_ENDIAN_LITTLE | RZ_SYS_ENDIAN_BIG,
	.disassemble = &disassemble,
	...
}
The cpus field is hard-coded to specific processors (e.g. cortexA8) or families (e.g. cortexm). How do I go about dealing with other families such as Neoverse?
@valdaarhun for now detecting profile is enough, but since Rizin ARM decoding is based on Capstone, only those make sense for autodetection (https://github.com/capstone-engine/capstone/blob/next/include/capstone/arm.h#L1638):
// Architecture-specific groups
// generated content <ARMGenCSFeatureEnum.inc> begin
// clang-format off
ARM_FEATURE_IsARM = 128,
ARM_FEATURE_HasV5T,
ARM_FEATURE_HasV4T,
ARM_FEATURE_HasVFP2,
ARM_FEATURE_HasV5TE,
ARM_FEATURE_HasV6T2,
ARM_FEATURE_HasMVEInt,
ARM_FEATURE_HasNEON,
ARM_FEATURE_HasFPRegs64,
ARM_FEATURE_HasFPRegs,
ARM_FEATURE_IsThumb2,
ARM_FEATURE_HasV8_1MMainline,
ARM_FEATURE_HasLOB,
ARM_FEATURE_IsThumb,
ARM_FEATURE_HasV8MBaseline,
ARM_FEATURE_Has8MSecExt,
ARM_FEATURE_HasV8,
ARM_FEATURE_HasAES,
ARM_FEATURE_HasBF16,
ARM_FEATURE_HasCDE,
ARM_FEATURE_PreV8,
ARM_FEATURE_HasV6K,
ARM_FEATURE_HasCRC,
ARM_FEATURE_HasV7,
ARM_FEATURE_HasDB,
ARM_FEATURE_HasVirtualization,
ARM_FEATURE_HasVFP3,
ARM_FEATURE_HasDPVFP,
ARM_FEATURE_HasFullFP16,
ARM_FEATURE_HasV6,
ARM_FEATURE_HasAcquireRelease,
ARM_FEATURE_HasV7Clrex,
ARM_FEATURE_HasMVEFloat,
ARM_FEATURE_HasFPRegsV8_1M,
ARM_FEATURE_HasMP,
ARM_FEATURE_HasSB,
ARM_FEATURE_HasDivideInARM,
ARM_FEATURE_HasV8_1a,
ARM_FEATURE_HasSHA2,
ARM_FEATURE_HasTrustZone,
ARM_FEATURE_UseNaClTrap,
ARM_FEATURE_HasV8_4a,
ARM_FEATURE_HasV8_3a,
ARM_FEATURE_HasFPARMv8,
ARM_FEATURE_HasFP16,
ARM_FEATURE_HasVFP4,
ARM_FEATURE_HasFP16FML,
ARM_FEATURE_HasFPRegs16,
ARM_FEATURE_HasV8MMainline,
ARM_FEATURE_HasDotProd,
ARM_FEATURE_HasMatMulInt8,
ARM_FEATURE_IsMClass,
ARM_FEATURE_HasPACBTI,
ARM_FEATURE_IsNotMClass,
ARM_FEATURE_HasDSP,
ARM_FEATURE_HasDivideInThumb,
ARM_FEATURE_HasV6M,
As rizin doesn't have a way to select particular features, only CPUs with sets of particular features are possible for now.
cc @Rot127
@valdaarhun if you check the disassemble() function in the librz/asm/p/asm_arm_cs. you will see that only CS_MODE_MCLASS and CS_MODE_V8 are used. Thus, it's fine to detect just those for now.
I see. In that case, I'll just focus on these two classes.
Hi. The functions get_cpu_mips or get_cpu_arm in librz/bin/format/elf/elf_info.c simply print the cpu name. How do I get rizin to actually make sense of it before disassembly?
In librz/arch/p/asm_arm_cs:disassemble(), it checks the value of a->cpu. I am guessing it needs to figure out a way to set a->cpu to "cortexm" or "v8". But where is this actually set?
When rizin is run with -e asm.cpu=cortexm, it calls rz_config_eval(). I think this sets the value in r->config. Should I use the same/similar approach in get_cpu_arm()?
Hmm, I thought this value was used somewhere, my bad. OK, you need to pass it to the config somehow, yes. It should probably be done somewhere in librz/core/cbin.c
Thank you for your response. I'll take a look at cbin.c.
@valdaarhun Sorry, I missed the mention above from @XVilka. It's fine if for now it can only check for armv8 or the M-profile. Please ensure it is easily extensible, though, so that when we add toggles for all the other CPU features (e.g. see list above), it takes only minimal effort.
In the best case, implement your solution only for armv8 and add the cortex-m toggle afterwards. That way you can check whether it is actually easy to add a feature.
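One way to keep this extensible might be a small lookup table, so that supporting another profile or architecture level means adding one row. The sketch below is only an illustration: CpuRule and arm_attrs_to_cpu() are hypothetical names, the "v8" and "cortexm" strings are the asm.cpu values mentioned earlier in this thread, and the Tag_CPU_arch value 14 for ARMv8-A comes from my reading of the ABI addenda and should be double-checked against it.

```c
#include <stddef.h>

/* Hypothetical rule: maps build attribute values to an asm.cpu string. */
typedef struct {
	int profile;     /* Tag_CPU_arch_profile value to match ('A', 'M', ...) */
	int min_arch;    /* lowest Tag_CPU_arch value to match, -1 for any */
	const char *cpu; /* value to use for asm.cpu */
} CpuRule;

/* Started with armv8 only; the cortexm row was then a one-line addition. */
static const CpuRule cpu_rules[] = {
	{ 'A', 14, "v8" },      /* assumption: 14 is ARMv8-A in Tag_CPU_arch */
	{ 'M', -1, "cortexm" }, /* any M-profile architecture level */
};

/* Returns the asm.cpu string for the given attributes, or NULL if no rule matches. */
const char *arm_attrs_to_cpu(int profile, int arch) {
	for (size_t i = 0; i < sizeof(cpu_rules) / sizeof(cpu_rules[0]); i++) {
		const CpuRule *rule = &cpu_rules[i];
		if (rule->profile != profile) {
			continue;
		}
		if (rule->min_arch >= 0 && arch < rule->min_arch) {
			continue;
		}
		return rule->cpu;
	}
	return NULL;
}
```

A NULL result would mean "leave asm.cpu alone", which keeps the current behaviour for anything the table does not cover yet.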
It is common to have an ELF built for the ARM Cortex-M profile, but this is not shown in the ELF header:
But the CPU profile can affect analysis drastically in the case of ARM Cortex-M, for example because of the additional instructions, and because the code is Thumb, it has some effect on the sequence of disassembly.
We should figure out a way to detect Cortex-M ELFs whenever possible. Currently you have to specify it from the command line:
It would be nice to autodetect cortexm/cortexa profiles whenever possible.
Quite often compilers add a special section .ARM.attributes that has that information (note the Tag_CPU_arch_profile and Tag_CPU_arch attributes):
See https://stackoverflow.com/questions/70071681/how-can-i-know-if-an-elf-file-is-for-cortex-a-or-cortex-m for more information
It should probably be changed somewhere in librz/bin/format/elf/. See the file librz/bin/format/elf/elf_info.c and the get_cpu_mips() function as an example.