NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0
51.9k stars 5.89k forks source link

Pointer size is affecting array indexing inhibiting proper C output #6805

Open Wall-AF opened 3 months ago

Wall-AF commented 3 months ago

Describe the bug Using sized pointers, say ScriptHandler *32 in a 16-bit app to establish a far pointer, simple arithmetic used to index the array of those objects fails to use the base object size and consequently doesn't recognise the calculation as the index. Something like

   1028:0cb9 8b c2        0c8           MOV        AX,DX
   1028:0cbb 69 c0 02 04  0c8           IMUL       AX,AX,0x402
   1028:0cbf 26 29 47 16  0c8           SUB        word ptr ES:[BX + 0x16],AX
   1028:0cc3 8b 46 06     0c8           MOV        AX,word ptr SS:[BP + this]

turns into *(int *)&pThis->lpScrptHldr_0x16 = *(int *)&pThis->lpScrptHldr_0x16 + iVar4 * -0x402; instead of (if the pointer is sized 16 - the native pointer size) pThis->lpScrptHldr_0x16 = pThis->lpScrptHldr_0x16 + -iVar4; admittedly not perfect, but better!

Expected behavior Array index calculations should only use the pointer size when pointer-to-pointer... (and the index level is NOT at the lowest dereferencing point) is being sought otherwise use its base type size.

Screenshots See bug description.

Attachments None.

Environment (please complete the following information):

Additional context None.

mumbel commented 3 months ago

I recently upgraded a lot of images and I think it's also generally broken right now for a lot of this kind of access. I've noticed a lot of arrays cast as int and then indeed, losing type information (ppc,mips,sparc)

Hopefully fixed by either #6718 or #6722

Wall-AF commented 3 months ago

I recently upgraded a lot of images and I think it's also generally broken right now for a lot of this kind of access. I've noticed a lot of arrays cast as int and then indeed, losing type information (ppc,mips,sparc)

Hopefully fixed by either #6718 or #6722

I haven't updated to include those yet!

mumbel commented 3 months ago

They're not merged, I haven't tried either yet in a local build to see if it fixes the issues I've seen, but looks promising

Wall-AF commented 1 month ago

I recently upgraded a lot of images and I think it's also generally broken right now for a lot of this kind of access. I've noticed a lot of arrays cast as int and then indeed, losing type information (ppc,mips,sparc) Hopefully fixed by either #6718 or #6722

I haven't updated to include those yet!

No dice! Tried both fixes together and independently to no effect.

I believe the 2 sizes of pointers is what really needs fixing for my issues. E.g. if a structure contains a pointer and that pointer stores the segment and offset that turns it into a 4-byte pointer which then interferes with Ghidra's internals as they cannot cope with 2 sizes! Is there any hope for that to be investigated?

Wall-AF commented 1 month ago

Maybe the 2 screenshots below illustrate the situation. In the first, pFld0x2->field4_0xa is defined as a pointer of size 32-bits and the second uses the default size (as per cspec (16-bit)). image image