Open Wall-AF opened 1 year ago
In order to get the output above, I had to make the following 2 changes to the decompiler code:
Replace the line https://github.com/NationalSecurityAgency/ghidra/blob/a9baf9f6d873fcde40b0692af6fb3afbaf5e91c0/Ghidra/Features/Decompiler/src/decompile/cpp/ruleaction.cc#L8412 with
PcodeOp* op2 = vn2->getDef();
if (op2->code() != CPUI_SUBPIECE) {
data.opInsertInput(op, op2->getIn(1), 1);
data.opSetOpcode(op, op2->code());
}
else {
data.opSetOpcode(op,CPUI_COPY);
}
if (op2->code() != CPUI_SUBPIECE) {
if (op2->code() != CPUI_INT_ADD) return false;
if ((Varnode*)0 == op2->getIn(0)) return false;
op2 = op2->getIn(0)->getDef();
if (op2->code() != CPUI_SUBPIECE) return false;
}
Interestingly, if I change the assembly:
10e8:0181 26 c7 47 00c MOV word ptr ES:[BX + 0x4],0x0
04 00 00
to
10e8:0181 26 c7 47 00c MOV word ptr ES:[BX + 0x0],0x0
00 00 00
the code becomes
this->m_pData[nPos].m_nPos_0x0 = 0;
which now picks the zeroth element (1st structure member) of the object in the array. Therefore @caheckman how can I add to my modification to ensure the added constant (0x4
) is interpreted correctly???
NB: Edited because I'd made some incorrect changes to datatypes used here which, once reversed, did exibit the above scenario.
@ryanmkurtz I've just built commit 6b3db0a09 and some of my previously working functions have sprouted the error:
/* WARNING: Stack frame is not setup normally: Input value of stackpointer is not used */
and are now nonsensical.
I've tried dropping the functions entirely and recreating them but I still get the same result. I've even changed the calling convention but no joy there either!
Any suggestions?
Help!!!
I left a message for @caheckman.
I think I've found the issue! in commit af40b28, several PUSH simm8_xx
were changed and their addrsize
spec omitted.
I've changed
https://github.com/NationalSecurityAgency/ghidra/blame/0d71657d052743d37ccbf5eab0e9d4253529a3e7/Ghidra/Processors/x86/data/languages/ia.sinc#L3380
and
https://github.com/NationalSecurityAgency/ghidra/blame/0d71657d052743d37ccbf5eab0e9d4253529a3e7/Ghidra/Processors/x86/data/languages/ia.sinc#L3381
to
:PUSH simm8_16 is $(LONGMODE_OFF) & vexMode=0 & addrsize=0 & opsize=0 & byte=0x6a; simm8_16 { tmp:2=simm8_16; push22(tmp); }
:PUSH simm8_16 is $(LONGMODE_OFF) & vexMode=0 & addrsize=1 & opsize=0 & byte=0x6a; simm8_16 { tmp:2=simm8_16; push42(tmp); }
:PUSH simm8_32 is $(LONGMODE_OFF) & vexMode=0 & addrsize=0 & opsize=1 & byte=0x6a; simm8_32 { tmp:4=simm8_32; push24(tmp); }
:PUSH simm8_32 is $(LONGMODE_OFF) & vexMode=0 & addrsize=1 & opsize=1 & byte=0x6a; simm8_32 { tmp:4=simm8_32; push44(tmp); }
and I'm back!
@ryanmkurtz @GhidorahRex Please advise if this is the correct solution.
Even with the above corrected, @caheckman's change seems to help with simple function parameters (as it creates another pointer of a smaller size) however when that parameter is a pointer to a structure with embeded structures or arrays, we're still struggling.
E.g. the decompiled C
is
iVar1 = 0;
for (uVar2 = 0; uVar3 = (undefined2)((ulong)pParDetLst >> 0x10),
pThis = (DgnParDetailsArray_0xa_t *)pParDetLst,
uVar2 <= pThis->m_nNextFree && pThis->m_nNextFree != uVar2; uVar2 += 1) {
nNthVal = *(int **)((int)((DgnParDetails_0xd2_t *)pThis->m_paParamDetails)->
m_a101Ints_0x6 + nIdx * 2 + iVar1);
SetValidParToValue(*(UINT *)((int)((DgnParDetails_0xd2_t *)pThis->m_paParamDetails)->
m_a101Ints_0x6 + iVar1 + -2),
(DgnDataValue_t *)CONCAT22(unaff_SS,(DgnDataValue_t *)&nNthVal),2);
iVar1 += 0xd2;
}
when looking at the assembly, I believe it should be more like
pThis = (DgnParDetailsArray_0xa_t *)pParDetLst,
for (i = 0; i < pThis->m_nNextFree; i++) {
nNthVal = pThis->m_paParamDetails[i].m_a101Ints_0x6[nIdx];
SetValidParToValue(pThis->m_paParamDetails[i].m_idParm, &nNthVal),2);
}
Where the types involved are in this header file.
And below is the 16-bit Windows NE byte sequence for this function
45558bec83ec02565733ff33f6eb38c45e0626c45f0603de8b460ad1e003d8268b47068946fe6a02168d46fe50c45e0626c45f0603de26ff77049acf14901083c40881c6d20047c45e0626397f0477bf5f5ec94dcb
whose signature is void SetParsToNthValue (DgnParDetailsArray_0xa_t *32 pParDetLst, uint nIdx)
.
Does this help??? And can anything be done to fix this and the like?
I think I've found the issue! in commit af40b28, several
PUSH simm8_xx
were changed and theiraddrsize
spec omitted. I've changed https://github.com/NationalSecurityAgency/ghidra/blame/0d71657d052743d37ccbf5eab0e9d4253529a3e7/Ghidra/Processors/x86/data/languages/ia.sinc#L3380 and https://github.com/NationalSecurityAgency/ghidra/blame/0d71657d052743d37ccbf5eab0e9d4253529a3e7/Ghidra/Processors/x86/data/languages/ia.sinc#L3381 to:PUSH simm8_16 is $(LONGMODE_OFF) & vexMode=0 & addrsize=0 & opsize=0 & byte=0x6a; simm8_16 { tmp:2=simm8_16; push22(tmp); } :PUSH simm8_16 is $(LONGMODE_OFF) & vexMode=0 & addrsize=1 & opsize=0 & byte=0x6a; simm8_16 { tmp:2=simm8_16; push42(tmp); } :PUSH simm8_32 is $(LONGMODE_OFF) & vexMode=0 & addrsize=0 & opsize=1 & byte=0x6a; simm8_32 { tmp:4=simm8_32; push24(tmp); } :PUSH simm8_32 is $(LONGMODE_OFF) & vexMode=0 & addrsize=1 & opsize=1 & byte=0x6a; simm8_32 { tmp:4=simm8_32; push44(tmp); }
and I'm back!
@ryanmkurtz @GhidorahRex Please advise if this is the correct solution.
I think "correct" in this case is "as best as can be done right now". The push immediate instructions rely on the stackAddressSize, not addrsize, to determine how wide of a push. It's fixed based on processor mode, so 16-bit is always 16, 32-bit is always 32. However, our sleigh module doesn't have a parameter for the StackAddressSize. We don't model that, and have no way at present of determining at runtime the base processor stack operand size.
A full fix here is most likely to model the stackoperandsize, or at least determine if the current mode is 16/32/64 bit architecture. Your fix should work most of the time. The 67 addrsize override prefix is valid for the push immediate opcodes, but doesn't do anything, so most push instructions shouldn't have them, but if they did then this would end up producing incorrect results for those instructions, because we also have no way of determining if the 67 prefix was used to modify addrsize.
@GhidorahRex thanks for the explanation, it really helps. Is this something that'll ever get looked into/fixed?
Yes, it's been discussed. In the meantime, I'll try to get this fix in.
@caheckman this change seems to have kindof solved *(undefined2 *)((int)this->m_pData + nPos * 6 + 4) = 0;
but has broken the earlier statement this->m_pData[nPos].m_nPos?_0x0 = this->m_nPos_0xe;
. Now tbf, I've moved on in my analysis and maybe what I now have complicates the situation but I've gone back a little and still the nice/correct array syntax is no more!
Here's my latest incarnation: dgnsrvr_10e8_0133_v2.zip
Please let me know if these examples help or otherwise! Thanks.
@caheckman FYI, seems to be okay with single array defined in the data segment (DS
) and embedded arrays of structues e.g.:
SigProcTD __cdecl16far GetTokSigProc(IDX_ONE_BASED nTok)
{
return ((DgnToken_0x1a_t *)g_tokenMgr_9a5e.m_aDgnTokens_0x0)[nTok + -1].nSigProc_0x4;
}
But we're still seeing additional type casting (above, m_aDgnTokens_0x0
is of type DgnToken_0x1a_t*32
).
@ryanmkurtz Please reopen this as the solution isn't quite correct yet.
@ryanmkurtz @caheckman Please reopen this as the solution as is isn't correct at all.
This dragon_SetParsToNthValue2.zip is another good example with some code in a comment that shows what's really expected.
In the included function dgnsrvr_10e8_0133.zip which decompiles to:
this->m_pData
(declared as a*32
) in the line*(undefined2 *)((int)this->m_pData + nPos * 6 + 4) = 0;
making it almost illegible whilst the line above uses a copy (declared as a*
) and is fine?Ideas please, I'm baffled!!!