Closed PriyankaPanigrahi closed 4 years ago
Say you have this program, x.c:
#include "taintgrind.h"
int main(int argc, char **argv) {
    int x=10, y;
    TNT_TAINT(&x, sizeof(x));
    y=x;
    return y;
}
Compile it with:
taintgrind$ gcc -g -O0 -o tests/x -I. -I../include tests/x.c
Run taintgrind with:
taintgrind$ ../build/bin/valgrind --tool=taintgrind tests/x 10
==4507== Taintgrind, the taint analysis tool
...
==4507== Command: tests/x 10
0x10895A: main (x.c:8) | mov eax, dword ptr [rbp - 0x50] | Load | 0xa | t20_8260 <- x:1ffefffd10
...
0x10895D: main (x.c:8) | mov dword ptr [rbp - 0x4c], eax | Store | 0xa | y:1ffefffd14 <- t23_9256
You should get these two lines in the output, and the following taintgraph:
Thank you so much for your reply.
I have already done this much. But suppose the source code has a large number of variables, say 100, of which 20 are tainted. It is difficult to check the taint flow manually in the taint graph.
Is it possible to print the taint status (tainted/untainted) of each variable at the end of the program?
Any advice will be helpful. Thank you for your time. Have a great day.
I've just added a new client request: TNT_IS_TAINTED. This allows you to read the taint bits of a variable that you specify at run-time. Have a look at the test case tests/checktaint.c to see if this is useful for you.
If you run this test case, you should get:
a is_tainted: ffffffff
b is_tainted: 00000000
c[0] is_tainted: 00000000
c[1] is_tainted: 00000000
c[2] is_tainted: 00000000
c[3] is_tainted: 00000000
c[4] is_tainted: 00000000
c[5] is_tainted: ffffffff
c[6] is_tainted: 00000000
c[7] is_tainted: ffffffff
c[8] is_tainted: 00000000
c[9] is_tainted: 00000000
How does it let us read the taint bits of a variable that we specify at run-time?
If you look at tests/checktaint.c, the lines that print the taint bits are:
TNT_IS_TAINTED(t, &a, sizeof(a));
printf("a is_tainted: %08x\n", t);
TNT_IS_TAINTED() takes 3 arguments: the output variable (unsigned int), an address, and the number of bytes to read. The taint bits will be written to t, which you can then print out.
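To check many variables at the end of a program, the call and the printf can be wrapped in a small helper. The following is only a sketch: report_taint and REPORT_TAINT are hypothetical names that do not exist in Taintgrind, and the fallback stub is an assumption added so the code compiles when taintgrind.h is not on the include path. It also assumes that TNT_IS_TAINTED accepts a pointer variable and a length variable as its second and third arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Use the real header when it is on the include path. */
#ifdef __has_include
#  if __has_include("taintgrind.h")
#    include "taintgrind.h"
#  endif
#endif

/* Assumption: without the header, fall back to a stub that always reports
 * "untainted", so that this sketch still compiles and runs. */
#ifndef TNT_IS_TAINTED
#  define TNT_IS_TAINTED(out, addr, len) ((void)(addr), (void)(len), (out) = 0u)
#endif

/* Hypothetical helper: print and return the taint bits of one variable. */
static unsigned int report_taint(const char *name, const void *addr, size_t len) {
    unsigned int t = 0;
    /* The thread states that only 1, 2, 4 and 8 bytes are accepted. */
    assert(len == 1 || len == 2 || len == 4 || len == 8);
    TNT_IS_TAINTED(t, addr, len);
    printf("%s is_tainted: %08x\n", name, t);
    return t;
}

/* Convenience wrapper that uses the variable's own name as the label. */
#define REPORT_TAINT(var) report_taint(#var, &(var), sizeof(var))
```

Calling REPORT_TAINT(a); REPORT_TAINT(b); just before main returns would then print one line per variable, in the same format as tests/checktaint.c.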
From what you're saying, the taintgrind output, which is written to stderr, is getting in the way.
One way is to redirect stderr to /dev/null (for example, by appending 2>/dev/null to the command); another option is to save stdout to a file.
Hope that helps.
On Thu, Sep 19, 2019 at 10:14 PM PriyankaPanigrahi notifications@github.com wrote:
I made all the changes as you suggested.
I am getting output for various functions, such as _itoa_word, vfprintf, _IO_file_xsputn@@GLIBC_2.2.5, __memcpy_avx_unaligned_erms and many more.
But I am not getting the desired output, and I cannot work out what I am doing wrong.
Please suggest.
Thank you so much for your help.
I am getting the same output as you mentioned. Does ffffffff mean tainted and 00000000 mean untainted? What is the significance of these output values?
Not just that. ffffffff means all 32 bits are tainted.
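Since the value is a per-bit mask rather than a yes/no flag, it can be decoded further. The sketch below assumes a 32-bit unsigned int and assumes that each set bit in the mask marks one tainted bit of the variable; count_tainted_bits and mask_byte_is_tainted are hypothetical helper names, not part of Taintgrind.

```c
#include <assert.h>
#include <stddef.h>

/* Number of tainted bits in a mask returned by TNT_IS_TAINTED. */
static int count_tainted_bits(unsigned int taint_bits) {
    int n = 0;
    while (taint_bits != 0u) {
        taint_bits &= taint_bits - 1u;  /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Nonzero if any bit in byte `index` of the mask is set.
 * Index 0 is the least significant byte of the mask. */
static int mask_byte_is_tainted(unsigned int taint_bits, size_t index) {
    assert(index < sizeof(taint_bits));
    return ((taint_bits >> (8u * index)) & 0xffu) != 0u;
}
```

Under these assumptions, ffffffff gives a count of 32 and 00000000 gives 0. Any value in between would mean that only part of the variable is tainted.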
Thank you very much for your reply.
Just to add that the number of bytes that TNT_IS_TAINTED can accept is 1, 2, 4 or 8.
So the last argument of TNT_IS_TAINTED() can be 1, 2, 4 or 8.
For source code:
#include <stdio.h>
#include "taintgrind.h"

int main(int argc, char **argv) {
    int a = 1000, b;
    unsigned int t;
    // Defines int a as tainted
    TNT_TAINT(&a, sizeof(a));
    if (a == 1000)
        b = 1;
    else
        b = 0;
    TNT_IS_TAINTED(t, &a, sizeof(a));
    printf("a is_tainted: %x\n", t);
    TNT_IS_TAINTED(t, &b, sizeof(b));
    printf("b is_tainted: %x\n", t);
    return 0;
}
Should variable "b" be reported as tainted here, given that its value depends on the value of the tainted variable "a"?
Let's take a similar example (http://bitblaze.cs.berkeley.edu/papers/dta%2B%2B-ndss11.pdf Fig. 3):
char output[256];
long input = user_input();
long len = 0;
if (input > 100) {
    strcpy(output, "large");
    len = 5;
} else {
    strcpy(output, "small");
    len = 5;
}
print_output(output, len);
In this case, is len dependent on input?
Whatever the value of input, len will always be 5. So it is not dependent.
There are at least three types of taint dependency. Let x be tainted.
Taintgrind, which follows Valgrind's memcheck, only implements 1, not 2 or 3. This means it will under-taint, i.e. it will miss some dependencies. On the other hand, it is tricky to handle 2 and 3, as doing so may lead to over-tainting, i.e. reporting dependencies where there are none. For more info on taint analysis, check out https://users.ece.cmu.edu/~aavgerin/papers/Oakland10.pdf.
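The effect of under-tainting can be illustrated with a toy model. This is not how Taintgrind is implemented. It is a sketch that assumes the implemented dependency is direct data flow (such as y = x) and that assigning a constant inside a branch on a tainted value is one of the dependencies that is not tracked. The shadow_int type and all function names are hypothetical.

```c
#include <assert.h>

/* Toy shadow state: a value paired with a per-bit taint mask. */
typedef struct {
    int value;
    unsigned int taint;
} shadow_int;

/* A constant carries no taint. */
static shadow_int shadow_const(int v) {
    shadow_int r = { v, 0u };
    return r;
}

/* y = x: the taint mask travels with the data. */
static shadow_int shadow_copy(shadow_int src) {
    return src;
}

/* b = (a == expected) ? 1 : 0, tracking data flow only.
 * The branch reads a tainted value, but the constants assigned to b
 * are untainted, so the result is untainted. */
static shadow_int shadow_branch(shadow_int cond_src, int expected) {
    if (cond_src.value == expected)
        return shadow_const(1);
    return shadow_const(0);
}

/* Usage: replay the example from this thread under the toy model. */
static void demo_under_tainting(void) {
    shadow_int a = { 1000, 0xffffffffu };   /* as if TNT_TAINT(&a, sizeof(a)) */
    shadow_int y = shadow_copy(a);          /* taint follows the copy */
    shadow_int b = shadow_branch(a, 1000);  /* taint is lost at the branch */
    assert(y.taint == 0xffffffffu);
    assert(b.value == 1 && b.taint == 0u);
}
```

A tracker that also tainted everything assigned under a tainted branch would mark b as tainted, but in the len example it would mark len as tainted too, even though len is always 5. That is the over-tainting trade-off described above.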
Thank you for your reply.
Since taintgrind under-taints, is there any other tool that handles all of 1, 2 and 3, or does no such tool exist because of the over-tainting problem?
Let me add that although under-tainting will miss some dependencies, it can still be useful.
Some other dynamic taint analysis tools I'm aware of, but have not tried (and the info may not be up-to-date):
If you want to experiment with and implement different taint rules, I hear that bap will let you do that (but again, I have not tried it).
Thank you very much for your valuable reply. It's a really great help.
Suppose a variable 'x' in a program is tainted, and there is an assignment 'y=x' where y is untainted. How can I check the taint flow in the output or the data flow graph?
Any suggestions?
Thank you. Have a great day.