github / codeql

CodeQL: the libraries and queries that power security researchers around the world, as well as code scanning in GitHub Advanced Security
https://codeql.github.com
MIT License
7.66k stars 1.53k forks source link

UseAfterFree.ql miss case 01 #13897

Open 18Fl opened 1 year ago

18Fl commented 1 year ago

Hey, When I try to learn codeql dataflow analysis from UseAfterFree.ql, I found it miss handle some case like I mentioned in github slack. I just paste it at here:

This code can't handle by UseAfterFree.ql , I don't know why. I previouly think it should be caused by "flow after or before mis match"(Which I mentioned in github slack), But seems it's not the root cause . Because the UseAfterFree.ql always use as.Expr() which still handle some case. So I don't know how to inverstigate this one:

#include <stdlib.h>
#include <stdio.h>

struct MyStruct {
  char* buf;
};

// Use-after-free of `buf` field.
static void test0100() {
  struct MyStruct* s = (struct MyStruct*)malloc(sizeof(struct MyStruct));
  s->buf = (char *)malloc(0x1000);
  sprintf(s->buf, "kevwozere: %d\n", 100);
  free(s->buf);
  s->buf[0] = 0x41;
  free(s);
}

int main() {
  test0100();
  return 0;
}
MathiasVP commented 1 year ago

Hi @18Fl,

Thanks for bringing this to our attention. The problem is that the query uses asExpr (as you currently noted) which gets the underlying expression of the dataflow node. However, for some cases it's important that we grab the value of the expression after a call that may have updated what the pointer points to. This is important in a case like your free(s->buf) where we (for technical reasons) need to use asDefiningArgument to correctly propagate this "updated write" on buf to s so that we correctly track the flow all the way to the use of s->buf[0] on the next line.

I tried to replace all of our uses of asExpr in the query with asDefiningArgument (and asIndirectExpr in the corresponding sinks), but it had a few regressions in a few places that we need to investigate.

I'll add your example to our tracking issue for this query and be sure to mention on this issue whenever this is fixed.

18Fl commented 1 year ago

Thanks for bringing this to our attention. The problem is that the query uses asExpr (as you currently noted) which gets the underlying expression of the dataflow node. However, for some cases it's important that we grab the value of the expression after a call that may have updated what the pointer points to. This is important in a case like your free(s->buf) where we (for technical reasons) need to use asDefiningArgument to correctly propagate this "updated write" on buf to s so that we correctly track the flow all the way to the use of s->buf[0] on the next line.

I tried to replace all of our uses of asExpr in the query with asDefiningArgument (and asIndirectExpr in the corresponding sinks), but it had a few regressions in a few places that we need to investigate.

I'll add your example to our tracking issue for this query and be sure to mention on this issue whenever this is fixed.

Thanks for your reply. I will wait for this. Because I have tried change asExpr -> |asDefiningArgument| before , But I failed. It still can't handle this code. I am so exiting to look the right answer , and I could learn more thing.

Another problem is when I saw the code in the UseAfterFree.ql, It always use asExpr, we don't get anycode |asDefiningArgument| in it. But it still work for some code which won't work If we use old Dataflow. I believe this caused by def-use to use->use. But I failed understand the blog post because It is not in deatils(Maybe it is good for another people which is good at Codeql or dataflow analysis area, But the one is not me :( ).

This is what I observed, hope it will help u a little when u work on fix the issue.

And another code sample will maybe help u fix the bug:

#include <stdlib.h>
#include <stdio.h>

struct MyStruct {
  char* buf;
};

// Use-after-free of `buf` field.
static void test0100() {
  struct MyStruct* s = (struct MyStruct*)malloc(sizeof(struct MyStruct));
  s->buf = (char *)malloc(0x1000);
  char * s_buf = s->buf; // [+] I write the code just at here, so maybe it will caused grammer error, I hope it not
  sprintf(s->buf, "kevwozere: %d\n", 100);
  free(s_buf);
  s_buf[0] = 0x41;
  free(s);
}

int main() {
  test0100();
  return 0;
}

Thanks!