Open Wall-AF opened 1 year ago
Here's yet another (simpler) example where the local variable pData
is incremented to get to the next element (instead of using the loop counter as the array index).
FYI, in the initial attachment, there's another example where the parameter ((&((DgnPhoneme_0x4_t *)DgnPhoneme_0x4_t_ARRAY_1210_9b34)->b0 + iVar4)
) in the function call below
pDVar5 = getPicPhonemeData((DgnPhoneme_0x4_t *)
CONCAT22((char *)"User_Iterate" + 10,
(DgnPhoneme_0x4_t *)
(&((DgnPhoneme_0x4_t *)
DgnPhoneme_0x4_t_ARRAY_1210_9b34)->b0 +
iVar4)));
is displaying the same/similar problem!
With the segment issues fixed, it should be decompiled as (something akin to)
pDVar5 = getPicPhonemeData(DgnPhoneme_0x4_t_ARRAY_1210_9b34[nItem]);
This might help with your diagnosis as it has arrays that display well (where a value is extracted) and others that don't (I think due to needing their address and it being the result of a calculation) like
local_12 = pThis->field10_0x308 + local_8;
it shoud be (I believe)
local_12 = &pThis->field10_0x308[local_8];
I forgot to say that in this case as local_8
is incremented by 1
- and that is the array element size - the compiler didn't require a second variable to add to the base object as is the norm for arrays of elements of size >1.
Here is a very short function that displays this issue in it's return argument!
Just now I created an all-encompassing structure from a series of memory locations in my data segment and prior to this Ghidra was able to output a reference to an array using the syntax:
((byte *)aBytes)[i] = 0;
Now I get:
*(undefined *)((int)g_Obj0x47a_DS_9e86.aBytes + i) = 0;
It seems that Ghidra's lost it's knowledge of the byte array
just because it is now an element of the structure! How can this be fixed?
The highlighted section in the image below shows what should be a simple assignment of a newly realloc
'd pointer to an object of size nMaxWs * 2
bytes (which is actually an array of size nMaxWs
of type (u)int
), but is split into two (overcomplicated assignments) just because the return of functions use two 16-bit registers instead of one 32-bit register.
I've used a comment to show the better syntax that Ghidra needs to be capable of producing and an illegal variable name apUnodes[i]
to try to help describe what's going on!
This kind of missing simplification by Ghidra gets even worse when in another function the address of a particular element of the realloc
d array is required!
Can this kind of issue be overcome?
I believe the highlighted section of the image in the previous comment is incorrect and it should be
That mistake just illustrates the difficulties trying to correctly interpret this kind of output!
Included here is a follow-up function that uses the allocted memory stored in those pointers. By using illegal variable names I've annotated most of the indexing where possible and in comments otherwise.
An interesting compiler optimisation within this example function is that a couple of local variables are created based upon the address of the first element (beginning of) the this
structure. These are then used to point to differently sized (int
vs long
) array objects embedded within the this
structure and are incremented by their appropriate size. (The two examples I've named pThis2Offset
&pThis4Offset
.) Can this kind of usage be understood and used to produce better C
output?
Describe the bug In a loop looking at elements of an array, Ghidra isn't realising that a
local
variable added to an address is actually equivalent to looking at the same field in the next(/previous) array element. (It doesn't seem to be checking thelocal
variables increment (or decrement) against the size of the array's element!) This is producing expressions like(*(char *)((int)&DgnPhoneme_0x4_t_ARRAY_1210_9b34[0].b2 + local_4) == '-')
instead of(DgnPhoneme_0x4_t_ARRAY_1210_9b34[nItem].b2 == '-')
wherenItem
is the loop index.With nested arrays, it gets worse!
To Reproduce Steps to reproduce the behaviour:
Decompile:
PanelsDebug Function Decompilation
menu)Expected behaviour Use of correct(/better) array syntax in decompiled C output.
Screenshots N/A
Attachments dragon_FUN_1160_0049.zip
Environment (please complete the following information):
Additional context This is an extract from a Windows NE DLL (circa 1990s)