Vector35 / binaryninja-api

Public API, examples, documentation and issues for Binary Ninja
https://binary.ninja/
MIT License

Option to manually split/unsplit variable stores #4168

Open op2786 opened 1 year ago

op2786 commented 1 year ago

Neither the validFieldsOnly nor the allOffsets setting for load/store splitting affects the example code below. I believe there will be cases like this one where a manual split/unsplit option would be useful.

void init_proxy_ctx()

00410d50  init_wsa_ctx(&proxy_ctx.local_wsa)
00410d5a  init_wsa_ctx(&proxy_ctx.remote_wsa)
00410d67  proxy_ctx.local_thread.o = zx.o(0)
00410d6e  InitializeCriticalSection(&proxy_ctx.lock)

proxy_ctx structure:

struct proxy_ctx
{
    int32_t waiting;
    struct wsa_ctx local_wsa;
    int32_t field_1e0;
    struct wsa_ctx remote_wsa;
    int32_t field_3c0;
    HANDLE local_thread;
    int32_t field_3c8;
    HANDLE remote_thread;
    int32_t field_3d0;
    struct cc_handler* cc_handler;
    struct RTL_CRITICAL_SECTION lock;
};

rssor commented 1 year ago

So there's a known gap: unless a write straddles multiple actual members, it won't split. I meant to relax that before it shipped and forgot to.

That doesn't look like what's happening in your example, so I'm actually kind of mystified by that behavior. Is the field_ec8 member actually present, or is that just padding that got added in the C representation of the type?
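A rough way to check the "straddles multiple actual members" condition against the example is to lay the struct out and see which members a 16-byte store starting at local_thread overlaps. This is only a sketch under assumptions: the offsets are inferred from the auto-generated field_<hex> names, HANDLE and pointers are taken to be 4 bytes (32-bit target), the .o suffix is taken to mean a 16-byte access, and the size of lock is assumed to be 0x18. None of this uses the Binary Ninja API.

```python
# Assumed layout of struct proxy_ctx as (name, offset, size).
# Offsets are inferred from the field_<hex> member names, not confirmed.
PROXY_CTX_LAYOUT = [
    ("waiting", 0x000, 4),
    ("local_wsa", 0x004, 0x1DC),
    ("field_1e0", 0x1E0, 4),
    ("remote_wsa", 0x1E4, 0x1DC),
    ("field_3c0", 0x3C0, 4),
    ("local_thread", 0x3C4, 4),
    ("field_3c8", 0x3C8, 4),
    ("remote_thread", 0x3CC, 4),
    ("field_3d0", 0x3D0, 4),
    ("cc_handler", 0x3D4, 4),
    ("lock", 0x3D8, 0x18),  # assumed 32-bit RTL_CRITICAL_SECTION size
]


def members_covered(layout, offset, width):
    """Return the names of members overlapped by a store of `width` bytes at `offset`."""
    end = offset + width
    return [
        name
        for name, start, size in layout
        if start < end and offset < start + size
    ]


# The store at 00410d67: a 16-byte write of zero starting at local_thread.
covered = members_covered(PROXY_CTX_LAYOUT, 0x3C4, 16)
print(covered)
print("straddles multiple members:", len(covered) > 1)
```

Under these assumptions the store overlaps four members, so it would be expected to qualify for splitting, which is consistent with the behavior being surprising.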

We do definitely need the ability to suppress/unsplit operations though, and we'll need it in general for a variety of other things (outlining, impending adjusted enum/comparison work). Manual splitting gets a little trickier: the only situation where it won't split is when it fails to figure out how to split, and an API expressive enough to describe how to split memory accesses manually is probably not viable. Splitting is performed during MLILTranslator by consuming type info and pattern matching, and there's no good way to 'point to' the instructions you want split in a robust/stable way, especially since a single (arch, addr) tuple can have an unlimited number of instructions with large expression trees.

If splitting is doing the wrong thing, on the other hand, using (arch, addr) to disable the transformation at specific sites is basically always going to do the right thing.
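The idea of disabling the transformation per site can be pictured with a small model. This is a hypothetical sketch, not Binary Ninja's implementation or API: the names suppressed_sites, suppress_splitting and should_split, the "x86" architecture string, and the instruction descriptions are all made up for illustration. It shows why a coarse (arch, addr) key works for suppression even when several instructions share one address.

```python
# Hypothetical model: these names are illustrative, not Binary Ninja APIs.
suppressed_sites = set()


def suppress_splitting(arch, addr):
    """Record that store splitting should be skipped at this site."""
    suppressed_sites.add((arch, addr))


def should_split(arch, addr):
    """Splitting stays enabled unless the site was explicitly suppressed."""
    return (arch, addr) not in suppressed_sites


# Several lifted instructions may map to the same (arch, addr) site.
instructions = [
    ("x86", 0x410D5A, "call init_wsa_ctx"),
    ("x86", 0x410D67, "store to local_thread (part 1)"),
    ("x86", 0x410D67, "store to local_thread (part 2)"),
    ("x86", 0x410D6E, "call InitializeCriticalSection"),
]

suppress_splitting("x86", 0x410D67)

# Every instruction at the suppressed site is affected; no per-instruction
# selector is needed, which is what makes suppression easier than manual splitting.
decisions = [(hex(addr), text, should_split(arch, addr)) for arch, addr, text in instructions]
for addr, text, split in decisions:
    print(addr, text, "split" if split else "suppressed")
```

Manual splitting would need to single out one instruction, and one subexpression within it, at a given site, which this kind of key cannot express.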

xusheng6 commented 1 year ago

Database shared privately with us. Anyone from Vector35 can find it by searching for "16260762028509746446".