x64dbg / snowman

Snowman Decompiler for x64dbg (LOOKING FOR MAINTAINER)
http://derevenets.com/

XMM Decompiling #6

Open JohnnyErnest opened 7 years ago

JohnnyErnest commented 7 years ago

I'm not really sure how to categorize this issue.

I have a fairly long mathematical conversion function in an x86 program with PDB symbols that I want to disassemble and analyze to verify its accuracy. It makes heavy use of the xmm registers, with lots of mulsd/addsd/divsd/subsd/cvtps2pd/cvtpd2ps instructions, does some intermittent storing and retrieving from ds:[eax] and the stack, and makes a few calls at the end to asin/atan2 functions that use fld/fstp/fmul. About 95% of this function uses xmm registers, and it's a very long one, maybe close to a thousand lines.

If I attempt to decompile the selection using the Snowman plugin, it basically only outputs the asm instructions and doesn't do anything with the xmm registers. It also doesn't go into any calls, so I can't use it here. Is this expected behavior?

So I went the route of manually commenting each line. As an example, my first input float goes into xmm0, then it gets converted to a double in xmm2 and back to a single in xmm1, xmm2 gets multiplied by xmm3, and xmm1 may not get used until much later in the function. I spent a good hour or so manually analyzing and commenting, then went to make a graph, and the graph was big enough that x32dbg crashed when I saved the image. When I came back, all of my comments were gone, so the database must not have been persisted.

I'm wondering whether it's possible to more easily follow where data from a register has been copied. I'm guessing a Watch wouldn't work for me in this case, and it looks like I can't get it to use floats or doubles anyway. Let's say a value started in ds:[eax], got copied to xmm0, was converted to a double in xmm1, and so on, until maybe it ends up in xmm7 and doesn't get used for a hundred lines or so. I basically want to follow the path of ds:[eax] and see some kind of listing of everywhere it's been used. Is such a thing possible?

(Just to clarify I'm not talking about watching ds:[eax], but where all it gets copied, any operation that uses it and recursively any operations that are the result of operations performed on it or copies of it. That way when reviewing my function I say here are any lines that have used the X variable that came from ds:[eax], Y variable from ds:[eax+8], etc.)

mrexodia commented 7 years ago

Snowman is a separate project that has been integrated into x64dbg. It has no support for XMM (although you could go about adding it here: https://github.com/yegord/snowman).

Do you have a minidump for that crash? Generally x64dbg only saves the database on exit or if you use the dbsave command manually. Probably you could write a small plugin to automatically comment the xmm stuff based on the opcode and the operands.
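To give a rough idea of the "comment based on the opcode and the operands" suggestion, here is a minimal sketch in Python. It works on plain disassembly text (for example, lines copied out of the debugger) and does not use the x64dbg plugin SDK. The `TEMPLATES` table and the `annotate` function are hypothetical names made up for this illustration, and the table covers only a handful of SSE/SSE2 mnemonics.

```python
# Hypothetical mnemonic -> comment template table (illustrative, not exhaustive).
TEMPLATES = {
    "movss": "{dst} = {src}",
    "movsd": "{dst} = {src}",
    "addsd": "{dst} += {src}",
    "subsd": "{dst} -= {src}",
    "mulsd": "{dst} *= {src}",
    "divsd": "{dst} /= {src}",
    "cvtss2sd": "{dst} = (double){src}",
    "cvtsd2ss": "{dst} = (float){src}",
    "cvtps2pd": "{dst} = 2 x (double){src}",
    "cvtpd2ps": "{dst} = 2 x (float){src}",
}


def annotate(line):
    """Return a pseudo-C comment for one Intel-syntax line, or None if unknown."""
    parts = line.strip().split(None, 1)  # mnemonic, then the operand text
    if len(parts) != 2:
        return None
    mnemonic, operand_text = parts[0].lower(), parts[1]
    template = TEMPLATES.get(mnemonic)
    operands = [op.strip() for op in operand_text.split(",")]
    if template is None or len(operands) != 2:
        return None
    return template.format(dst=operands[0], src=operands[1])


if __name__ == "__main__":
    for text in ["cvtss2sd xmm2, xmm0", "mulsd xmm2, xmm3", "ret"]:
        print(text, ";", annotate(text))
```

A real plugin would read the mnemonic and operands from the debugger instead of parsing strings, but the lookup-and-format step would look much the same.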

Basically what you want is taint analysis, which is not implemented in x64dbg. You can use the highlighting mode (press H) and click an xmm register to help you view things, but there is no automated tool that does this.
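For illustration, forward taint propagation over a straight-line listing could be sketched like this in Python. This is a simplified sketch under several assumptions: instructions are already parsed into hypothetical `(mnemonic, dst, src)` tuples, operands are compared as plain strings (so memory aliasing is ignored), whole registers are tracked rather than individual lanes, and branches are not followed. The names `OVERWRITE`, `COMBINE`, and `trace_taint` are made up for this example.

```python
# Instructions whose destination is replaced by the source (copy/convert).
OVERWRITE = {"movss", "movsd", "movaps", "cvtss2sd", "cvtsd2ss", "cvtps2pd", "cvtpd2ps"}
# Instructions whose destination is combined with the source (dst = dst op src).
COMBINE = {"addsd", "subsd", "mulsd", "divsd"}


def trace_taint(instructions, sources):
    """Return indices of instructions that read or produce tainted data.

    instructions: list of (mnemonic, dst, src) tuples, in execution order.
    sources: operand strings that start out tainted, e.g. {"ds:[eax]"}.
    """
    tainted = set(sources)
    hits = []
    for index, (mnemonic, dst, src) in enumerate(instructions):
        if mnemonic in OVERWRITE:
            if src in tainted:
                tainted.add(dst)      # the copy carries the taint along
                hits.append(index)
            else:
                tainted.discard(dst)  # overwritten with untainted data
        elif mnemonic in COMBINE:
            if src in tainted or dst in tainted:
                tainted.add(dst)      # result depends on a tainted input
                hits.append(index)
        # Unknown mnemonics are skipped in this sketch.
    return hits


if __name__ == "__main__":
    listing = [
        ("movss", "xmm0", "ds:[eax]"),
        ("cvtss2sd", "xmm2", "xmm0"),
        ("movsd", "xmm3", "ds:[ebx]"),
        ("mulsd", "xmm2", "xmm3"),
    ]
    for i in trace_taint(listing, {"ds:[eax]"}):
        print(i, listing[i])
```

Running it once per input (ds:[eax], ds:[eax+8], and so on) would give one list of affected lines per variable, which is roughly the listing asked for above.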

mrexodia commented 7 years ago

@ShiningKnightLight can you send the text of the function when copied with Ctrl+C? I'll see if I can help you understand it with some of the projects I'm working on.

JohnnyErnest commented 7 years ago

It's a bit of a big function, and it splits in a lot of places. It's the Quaternion.eulerAngles function from Unity3D (it converts from Quaternions to Euler angles). I wanted to check it to see what the underlying math does, because a lot of people are confused about how it works as is. Most of the time devs just accept it and use Unity's functions, and that's fine until you start interacting with hardware that sends or receives rotational coordinates, like an IMU, and it's in a different coordinate system, like Nautical where Z is up, whereas Unity and most OpenGL-based software set Y to up in the default viewport. In many cases it's simply a matter of rotating on the X axis by 90 degrees, but maybe I have situations where I want to just work in Quaternions and not mess with Euler angles at all, so it would be great to know all the underlying math.

Basically, I just wanted to understand the math and know why it's a bit different than other more widely known Euler to Quaternion and vice versa conversion functions.

Luckily I can build an EXE with PDB symbols directly from Unity and debug it. Mathematically, it looks sound unless your rotation pitch angle is 90 or 270 degrees. Then they have some weird logic going on, and some of it, I found out yesterday, is happening outside of the function. In this case, when X = 90 or 270 and you convert to Quaternions, it looks like it multiplies the X/Y/Z/W by -1, but that doesn't happen until sometime after the Euler -> Quaternion function and before the Quaternion -> Euler function.

What I did was go back through, comment again, and export the DB often in case of another crash. Unfortunately I don't think I have the minidump. If I can repro it again I'll try to find it.

Here's the Euler to Quaternion function, this one's pretty straightforward: https://pastebin.com/ikW8c4AT

Here's the Quaternion to Euler function, this one I've commented a lot, it's the trouble function: https://pastebin.com/AGVa3Pxc

I kind of figured that was the case on Snowman, but wanted to check. I'll start looking into taint analysis, since I figured a plugin may need to be developed. In the comments you'll see I tried to work out what it's doing mathematically and keep tabs on the formulas. The majority of it is checking out so far, except for whenever X = 90 or 270 and you have other positive Y/Z angles. Then I realized that if your Euler to Quaternion gives you something like (0.7, 0, 0, 0.7), whenever X is 90 or 270 it would come into the function as (-0.7, -0, -0, -0.7), so I'm thinking the problem is upstream. But that doesn't make sense: if you have a standard mathematical function that you're reusing in several places in code, why would you alter the variables prior to entering the function? So surely I must have made a mistake somewhere in my comments, but I haven't been able to track it down yet.
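One general fact that may be relevant to the sign flip: a unit quaternion and its negation describe the same rotation, because every term of the equivalent rotation matrix is a product of two quaternion components. The Python sketch below checks this for the (0.7, 0, 0, 0.7) style of input. The function name `rotation_matrix` is made up for this example, the formula is the standard unit-quaternion-to-matrix conversion, and nothing here is taken from Unity's code.

```python
import math


def rotation_matrix(x, y, z, w):
    """Standard 3x3 rotation matrix for a unit quaternion (x, y, z, w)."""
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


if __name__ == "__main__":
    s = math.sqrt(0.5)                    # about 0.7071
    q = (s, 0.0, 0.0, s)                  # 90 degrees about the X axis
    negated = tuple(-c for c in q)        # every component multiplied by -1
    print(rotation_matrix(*q) == rotation_matrix(*negated))
```

So multiplying X/Y/Z/W by -1 before the Quaternion -> Euler step would not, by itself, change the rotation being converted. Whether it changes the result of a particular Euler extraction depends on how that code uses the individual components.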

mrexodia commented 7 years ago

I didn't actually forget about this. I'll share some of my code later...