Closed jerc33 closed 2 years ago
The suggested examples of how operand printing could look are, in my opinion, too close to Intel x86 syntax. This has some problems
Some alternative approaches:
- Expose src and dst fields. I don't think they are currently available using rzpipe and commands. See the declarations of RzAnalysisOp and RzAnalysisValue. In my opinion this would be the closest to the approach suggested above, but without the details of a specific architecture's disassembly syntax. And rizin already has this information.
- RzAnnotatedCode or something similar, which attaches structured meaning to ranges of the printed disassembly text. This approach would be more suitable for interactive tools building a UI on top of rizin, such as visual mode or Cutter: a less hacky way of replacing names within disassembly (mov qword [rsp + 0x38], rax -> mov qword [var_38h]) and doing other pretty printing.

If more stuff gets added to pdj, it might be worth considering adding an option for selecting what information the caller needs, or having multiple commands that print different subsets of information.
@karliss
Expose src and dst fields.
I forgot to mention this. This proposed change was under the assumption that rizin already had this information internally and that it wouldn't take much effort to make it available to the user. That approach you mention sounds rather interesting.
But I don't know how the src and dst fields are formatted, specifically in cases where the instruction has 3 or 4 operands. Here's an example:
{
  "offset": 140025470670393,
  "esil": "",
  "refptr": false,
  "fcn_addr": 0,
  "fcn_last": 0,
  "size": 8,
  "opcode": "vpsrld xmm15, xmm9, xmmword [rax + 0x1000000]",
  "disasm": "vpsrld xmm15, xmm9, xmmword [rax + 0x1000000]",
  "bytes": "c531d2b800000001",
  "family": "cpu",
  "type": "null",
  "reloc": false,
  "type_num": 0,
  "type2_num": 0
},
Of course this is in the SIMD (AVX) range of instructions, so it is not really useful for Function Analysis. But for my use case, which is very much Data-Flow Analysis at the machine language level, it is an important piece of information.
Also, notice that there's no ESIL output, making it unreliable for this use case. (Side question, does ESIL support floating point instructions?)
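A minimal sketch of how a script could flag such instructions before relying on ESIL, assuming the pdj output has already been parsed into a list of dicts with the "offset", "esil" and "opcode" keys seen in the example above. The helper name `missing_esil` is hypothetical, and the second sample entry is illustrative, adapted from an `xor ebp, ebp` instruction rather than taken from real pdj output.

```python
def missing_esil(instructions):
    """Return the entries whose "esil" field is absent or empty."""
    return [ins for ins in instructions if not ins.get("esil")]


# First entry: trimmed copy of the pdj element shown above.
# Second entry: illustrative only, not real pdj output.
SAMPLE = [
    {
        "offset": 140025470670393,
        "esil": "",
        "opcode": "vpsrld xmm15, xmm9, xmmword [rax + 0x1000000]",
    },
    {
        "offset": 27492,
        "esil": "ebp,rbp,^,0xffffffff,&,rbp,=",
        "opcode": "xor ebp, ebp",
    },
]

for ins in missing_esil(SAMPLE):
    # Report instructions a data-flow pass would have to handle separately.
    print(hex(ins["offset"]), ins["opcode"])
```

With live data, the list would presumably come from something like `rzpipe.open(path).cmdj("pdj 16")` instead of `SAMPLE`.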
As for RzAnnotatedCode, is there a way to see the information it contains? This sounds interesting as well, especially if it helps Visual Mode and Cutter, both of which I use and would gladly see grow.
And finally, thanks for your reply karliss, it's very much appreciated.
This issue has been automatically marked as stale because it has not had recent activity. Considering a lot has probably changed since its creation, we kindly ask you to check again if the issue you reported is still relevant in the current version of rizin. If it is, update this issue with a comment, otherwise it will be automatically closed if no further activity occurs. Thank you for your contributions.
Hi! I think right now there is no work happening in this regard, however there is something called opex which may be useful in your case.
[0x00006b64]> aoj~{}
[
  {
    "opcode": "xor ebp, ebp",
    "disasm": "xor ebp, ebp",
    "pseudo": "ebp = 0",
    "description": "logical exclusive or",
    "mnemonic": "xor",
    "mask": "ffff",
    "esil": "ebp,rbp,^,0xffffffff,&,rbp,=,$z,zf,:=,$p,pf,:=,31,$s,sf,:=,0,cf,:=,0,of,:=",
    "sign": false,
    "prefix": 0,
    "id": 334,
    "opex": {
      "operands": [
        {
          "size": 4,
          "rw": 3,
          "type": "reg",
          "value": "ebp"
        },
        {
          "size": 4,
          "rw": 1,
          "type": "reg",
          "value": "ebp"
        }
      ],
      "modrm": true
    },
    "addr": 27492,
    "bytes": "31ed",
    "size": 2,
    "type": "xor",
    "esilcost": 0,
    "scale": 0,
    "refptr": 0,
    "cycles": 1,
    "failcycles": 0,
    "delay": 0,
    "stackptr": 0,
    "family": "cpu"
  }
]
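For anyone scripting against this, here is a minimal Python sketch of pulling the operands out of the opex block. It works on a trimmed copy of the output above so it runs without rizin. The helper name `operand_access` is hypothetical, and reading "rw" as a bitmask (1 = read, 2 = write, following Capstone's access flags) is an assumption, not something confirmed in this thread.

```python
import json

# Assumption: "rw" mirrors Capstone's access flags (read = 1, write = 2).
READ, WRITE = 1, 2


def operand_access(op_info):
    """Split one aoj entry's operands into (reads, writes) value lists."""
    reads, writes = [], []
    for operand in op_info.get("opex", {}).get("operands", []):
        # Operands of other types may not carry a "value" key; None is kept then.
        value = operand.get("value")
        rw = operand.get("rw", 0)
        if rw & READ:
            reads.append(value)
        if rw & WRITE:
            writes.append(value)
    return reads, writes


# Trimmed copy of the aoj output shown above.
SAMPLE = json.loads("""
[
  {
    "opcode": "xor ebp, ebp",
    "mnemonic": "xor",
    "opex": {
      "operands": [
        {"size": 4, "rw": 3, "type": "reg", "value": "ebp"},
        {"size": 4, "rw": 1, "type": "reg", "value": "ebp"}
      ],
      "modrm": true
    }
  }
]
""")

reads, writes = operand_access(SAMPLE[0])
print(SAMPLE[0]["mnemonic"], "reads:", reads, "writes:", writes)
```

With a live session, `SAMPLE` would be replaced by the result of `rzpipe.open(path).cmdj("aoj")`.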
Hi, @ret2libc , that looks very much like what I need. I'm terribly sorry for letting this issue become stale.
I've been playing around with aoj and opex, and as far as I can tell it is a perfect fit for what I need, and more. For my part, this issue/feature request can be marked as closed.
Thank you very much for all your help.
No problem! Happy to help
Is your feature request related to a problem? Please describe.
When using the pdj command, the "disasm" and "opcode" elements are almost always the same, like so:
...there are only a few exceptions, like this one:
Also, parsing the "opcode" field itself gets troublesome when, as in my use case, a Python script using rz-pipe needs to know each and every operand of an instruction, not just the opcode or the complete instruction.
Describe the solution you'd like
What is proposed in this issue is a change in the pdj output with the addition of a new JSON element. The "opcode" field would give only the instruction opcode, like mov, and the operands would be separated and given in the new element "operands" as a list of strings, like so:
Note that the qword formatting element is included in the operands list. While not necessarily an operand, it is a very important element of an instruction, and the special words describing the length of the data are few and easily filtered out by the user if they don't need them.
Describe alternatives you've considered
A neat alternative is to separate the operands into two lists, the ones before the instruction's comma and the ones after it, displaying them as a list of lists, or an array of arrays in JSON parlance, like so:
This is not necessary for my specific use-case right now but might be a valuable feature in the future for me or someone else.
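As a rough illustration of the two output shapes discussed above (a flat operand list and a list of lists split at each operand separator), here is a minimal client-side sketch in Python. It is not rizin code: `split_instruction`, `flatten` and the `SIZE_WORDS` set are hypothetical names, and the parsing assumes Intel-style text with no instruction prefixes (something like `rep movsb` would be split incorrectly).

```python
# Size keywords treated as separate "formatting" elements (assumed list).
SIZE_WORDS = {"byte", "word", "dword", "qword", "tbyte",
              "xmmword", "ymmword", "zmmword"}


def split_instruction(disasm):
    """Return (mnemonic, groups), one group of strings per operand."""
    mnemonic, _, rest = disasm.strip().partition(" ")
    groups = []
    for part in rest.split(","):
        part = part.strip()
        if not part:
            continue  # instruction without operands, e.g. "ret"
        head, _, tail = part.partition(" ")
        if head in SIZE_WORDS and tail:
            # Keep the size keyword as its own element of the operand.
            groups.append([head, tail.strip()])
        else:
            groups.append([part])
    return mnemonic, groups


def flatten(groups):
    """Collapse the list of lists into the flat "operands" list."""
    return [item for group in groups for item in group]


mnemonic, groups = split_instruction("mov qword [rsp + 0x38], rax")
print(mnemonic)         # mov
print(flatten(groups))  # ['qword', '[rsp + 0x38]', 'rax']
print(groups)           # [['qword', '[rsp + 0x38]'], ['rax']]
```

Having to maintain string handling like this in every script is exactly the burden that a structured field in the JSON output would remove.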
As for using the output of "opcode" or "disasm", I believe that the latter should not be changed because it is the direct output of pd, as you can see here:
So, imho, pdj should have a field with the exact same output as pd.