NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

AVR8: Header files processing yields incorrect data type lengths for enums (and other data types) #5077

Open ghost opened 1 year ago

ghost commented 1 year ago

The script provided for parsing AVR8 headers yields enum data types that are 4 bytes long, following the common assumption that enums are dword (32-bit) integers on most architectures.

This is obviously wrong for AVR8, and these data types need to be reassigned and their length fixed.

A less-than-cautious user could assign something like a CMD enum to one of the IO addresses and inadvertently clobber the IO "registers" that follow it, up to three at a time.

                             NVM_CMD                                         XREF[5]:    ...
                                                                                          ....
        mem:01ca                 NVM_CMD_   ??
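
To make the overlap concrete, here is a small sketch in plain Python (not Ghidra API code). The helper `covered_addresses` is hypothetical; it only does the address arithmetic for a data type applied at the `mem:01ca` address from the listing.

```python
from typing import List


def covered_addresses(start: int, size: int) -> List[int]:
    """Return every address occupied by a data type of `size` bytes applied at `start`."""
    return [start + i for i in range(size)]


# NVM_CMD sits at mem:01ca in the listing.
four_byte = covered_addresses(0x01CA, 4)  # enum sized as a 32-bit integer
one_byte = covered_addresses(0x01CA, 1)   # enum sized to fit a single IO register

# Addresses that the 4-byte enum swallows beyond the register it was meant for.
clobbered = [hex(a) for a in four_byte if a not in one_byte]
print(clobbered)  # ['0x1cb', '0x1cc', '0x1cd']
```

With a 1-byte enum, nothing past `01ca` is touched.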

Fixed: image image

Original: image

emteere commented 1 year ago

We have been discussing the sizing of Enums and re-working them so that they re-size correctly.

Before that change is made, the CParser could be changed to take the data organization into account and apply a simple sizing algorithm to each Enum. We will look into it since we're in that area. Unfortunately, it may not be that simple: it could make enum parsing incorrect for other targets, say on Windows, so it needs to be done carefully.
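
As a rough illustration of what a "simple sizing algorithm" could look like, here is a sketch in plain Python. It is not the CParser's actual logic and does not use any Ghidra API; `minimal_enum_size` and its `max_size` parameter are hypothetical, with `max_size` standing in for whatever limit the data organization would supply. It assumes the compiler packs enums into the smallest integer that fits, which is exactly the assumption that may not hold for other toolchains.

```python
def minimal_enum_size(values, max_size=4):
    """Smallest power-of-two byte width (up to max_size) that holds every value.

    Assumption: an empty enum is given 1 byte; a real parser may choose otherwise.
    """
    if not values:
        return 1
    lo, hi = min(values), max(values)
    size = 1
    while size <= max_size:
        bits = 8 * size
        if lo < 0:
            # Negative values present: the whole range must fit a signed integer.
            fits = -(1 << (bits - 1)) <= lo and hi <= (1 << (bits - 1)) - 1
        else:
            # All values non-negative: an unsigned integer is enough.
            fits = hi <= (1 << bits) - 1
        if fits:
            return size
        size *= 2
    raise ValueError("enum values do not fit in max_size bytes")


print(minimal_enum_size([0x00, 0x20, 0x2B]))  # 1
print(minimal_enum_size([0, 0x100]))          # 2
```

A toolchain that always stores enums as `int` would need a fixed size instead, so any real change would have to select the strategy per compiler and data organization.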

emteere commented 1 year ago

You can also apply enums as labels from the Data Type Manager, using each enum value as the address at which to place the label. On the AVR8, with its multiple address spaces, that could be cumbersome to use.

Unfortunately, it will apply the label in the default memory space, which is code. We need to change that. There have been thoughts of making an extended "enum", perhaps an Address type in the Data Type Manager, that would carry all the parts (offset, AddressSpace, label).
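
For discussion purposes only, the three parts mentioned above could be modeled like this in plain Python. `AddressEntry` and its field names are hypothetical and do not correspond to any existing Ghidra class; the sample values are taken from the `mem:01ca` listing in the original report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressEntry:
    """Hypothetical 'extended enum' member: a label tied to a space and an offset."""
    label: str
    space: str   # name of the address space, e.g. "mem" rather than the default code space
    offset: int

    def __str__(self) -> str:
        return f"{self.space}:{self.offset:04x}  {self.label}"


entry = AddressEntry(label="NVM_CMD", space="mem", offset=0x01CA)
print(entry)  # mem:01ca  NVM_CMD
```

A plain enum member carries only the label and the numeric value, which is why a label applied from it has no way to say which address space it belongs to.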