zyantific / zydis

Fast and lightweight x86/x86-64 disassembler and code generation library
https://zydis.re
MIT License
3.4k stars 434 forks source link

Add size and offset to immediate and memory operand structs #489

Closed NaC-L closed 6 months ago

NaC-L commented 7 months ago

Access to size of displacement value from operand

athre0z commented 7 months ago

This information is already available in the public interface via instruction->raw.disp.size. Putting it where you are suggesting would admittedly be a bit nicer for discovery, but at the cost of 8 byte (after padding) times max number of operands of stack alloc. Our structs are quite large already and I'd prefer not to add to that unless we really have to.

flobernd commented 7 months ago

Thanks for your contribution!

I agree with @athre0z. Besides the already mentioned reason, ZydisDecodedOperand is meant to primarily hold the semantic/logical information about operands where as the "raw" struct shall be used to access details about the physical encoding.

NaC-L commented 7 months ago

Hey, thanks for the response.

What about something like this?

 struct ZydisDecodedOperandMemDisp_
    {
        /**
         * The displacement value
         */
        ZyanI64 value;

        /**
        * The size of the displacement value
        */
        ZyanU8 size;
    } disp;

If the size is 0, then it means no displacement value and we get rid of the ZyanBool has_displacement; so we can use the same amount of bytes. I understand the purpose of ZydisDecodedOperand and raw struct. However with this new way, replacing has_displacement with size we can use the same size and hold more info. The bytes are used anyways, why not make more use of them?

flobernd commented 7 months ago

I would be ok with merging this. What do you think @athre0z?

We have a bunch of interface breaking changes anyways.

flobernd commented 7 months ago

When we are on it, we should as well change the comment slightly to make it clear that this field contains the physical displacement size.

athre0z commented 7 months ago

Hmm yeah, that seems like a decent idea.

If we move the size into the operand, we should also move the offset field. I don't know how you'd reliably use the size without also having the offset. We should further apply the same change for immediate operands for consistency. ZydisDecodedOperandImm is used in a union with ZydisDecodedOperandMem, and the former is much smaller than the latter, so adding the extra fields there doesn't cost us anything.

Looking at pahole output:

struct ZydisDecodedOperandMemDisp_ {
        ZyanBool                   has_displacement;     /*     0     1 */

        /* XXX 7 bytes hole, try to pack */

        ZyanI64                    value;                /*     8     8 */

        /* size: 16, cachelines: 1, members: 2 */
        /* sum members: 9, holes: 1, sum holes: 7 */
        /* last cacheline: 16 bytes */
};

my previous claim that adding a new field would increase struct size was in fact inaccurate: there are 7 padding bytes waiting to be put to use. In that sense moving the offset over as well is actually free. We could also keep has_displacement, but honestly there isn't really any good reason: using size for both seems more elegant.

flobernd commented 6 months ago

LGTM from my side! Thanks 🙂

flobernd commented 6 months ago

@athre0z @mappzor Are you fine with merging this as well?

mappzor commented 6 months ago

LGTM!