eellak / build-recorder

GNU Lesser General Public License v2.1
23 stars 8 forks source link

`PROCESS_INFO` structure issues, missing free on PROCESS_INFO::entry_info in case of syscall failure #181

Open fvalasiad opened 1 year ago

fvalasiad commented 1 year ago

Well, this one sucks.

So far syscall handlers would use said field for any purpose, namely to store data directly or to store a pointer to data. Pointers to data are dynamically allocated and thus, they need to be freed. If the syscall returns with a status of 0, all is good, resource is freed from the exit handler.

What about the other scenario though? What if the syscall return with a non-zero number indicating an error? handle_syscall_exit returns directly, leading to a memory leak.

How do we go about solving this? Do we just assign NULL every time we "consume" the information held in entry_info and call free if the value is not-null after handle_syscall_exit has returned? We cannot since as previously mentioned, the field is used for both pointers and other data.

Well we could add a flag for that, but this is already getting too messy. The moment you consider plugging last minute decision booleans all around, your code is a mess.

You might think, only a subset of system call handlers use this mechanism, and even less the count of those system calls that fail without leading to program termination. We can afford to leak SOME memory. But this brings me to question the code structure itself.

What does PROCESS_INFO contain? These:

typedef struct {
    char outname[16];
    char *cmdline;
    int *fds;
    FILE_INFO *finfo;
    int numfinfo;
    int finfo_size;
    struct ptrace_syscall_info state;
    void *entry_info;
    char ignore_one_sigstop;
} PROCESS_INFO;

The key to understanding why this pack is wrong, is thinking about how to make the whole structure with all operations that apply to it(in an object oriented fashion) platform-agnostic. A.K.A which parts are not portable?

struct ptrace_syscall_info state;
void *entry_info;
char ignore_one_sigstop;

Not only do these three make the code not-reusable for say, porting build-recorder to FreeBSD, they are also wrongly packed with the rest in terms of optimal cache usage.

char ignore_one_sigstop;

is, essentially, a last minute decision boolean. After we've reached the first syscall, it's just an extra byte wasted, contributing to the pollution of our cache lines.

struct ptrace_syscall_info state;

is a heavy object, also contributing massively to cache pollution. Two interesting facts about it:

  1. It contains unnecessary info. We are only interested in the state::entry field inside the union, the rest is garbage.
  2. It's the field that is read THE MOST. Sure, the rest of the structure contains most of build-recorder's context, but that field is read and written to upon every system call, unlike the rest of the structure which is only used for a very small minority of system calls that we care about tracing.

void *entry_info;

Pretty much all of the sins mentioned above, badly designed mechanism.

What to do

  1. Pull state and entry_info out of PROCESS_INFO into separate maps(again, pids to fields respectively).
  2. Remove ignore_one_sigstop completely and instead have a set of pids that contains all of the processes which haven't yet found their first sigstop.

What do we gain

  1. Performance?, Irrelevant, the count of processes is low, a standard profiling shows that pinfo is fine.
  2. Better code structure overall.
  3. PROCESS_INFO is now portable, which means we can pull it into a header file with all its methods and re-use it in our future ports.