wmkhoo / taintgrind

A taint-tracking plugin for the Valgrind memory checking tool
GNU General Public License v2.0

[Question] Tainting the file #30

Closed marekzmyslowski closed 4 years ago

marekzmyslowski commented 5 years ago

I'm using the file tainting option. Is there a way to find out which bytes of the file taint a particular instruction?

wmkhoo commented 5 years ago

That seems doable. Parse the taint log, recursively retain the data flows that reach the target instruction, and eliminate all other flows. Is it enough to know the byte values, or is the byte location important as well?

marekzmyslowski commented 5 years ago

The location is, at some point, more important than the value itself.

marekzmyslowski commented 5 years ago

My main issue at the moment is this:

0x4DBAB48: __memcpy_avx_unaligned_erms (memmove-vec-unaligned-erms.S:293) | vmovdqu xmmword ptr [rdi], xmm0 | Store | na | 1ffefffb50_unknownobj <- t0_10665

and this:

0x10893C: test_avBranch (in /work/taint/fuzzer-test-crashes/avBranch.test) | movzx eax, byte ptr [rax + 1] | Load | 0x56 | t12_25279 <- 1ffefffb51_unknownobj

There is no simple way to tell from the log that 1ffefffb51_unknownobj was tainted with t0_10665 by the first instruction, because the store is logged only against its base address, 1ffefffb50. Adding log lines like the following could solve that issue:

0x4DBAB48: __memcpy_avx_unaligned_erms (memmove-vec-unaligned-erms.S:293) | vmovdqu xmmword ptr [rdi], xmm0 | Store | na | 1ffefffb50_unknownobj <- t0_10665
0x4DBAB48: __memcpy_avx_unaligned_erms (memmove-vec-unaligned-erms.S:293) | 1ffefffb51_unknownobj <- t0_10665
0x4DBAB48: __memcpy_avx_unaligned_erms (memmove-vec-unaligned-erms.S:293) | 1ffefffb52_unknownobj <- t0_10665
etc.

Is this feasible with the current design? Alternatively, could the log at least include the size of each Store and Load instruction?
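
To illustrate the request: if the size of a Store were known, the per-byte lines could be derived in post-processing. Below is a minimal Python sketch of that idea. It assumes the destination name has the form `<hex address>_<label>` as in the lines above, that the size is supplied by hand (16 bytes for an `xmmword` store, since the current log does not print it), and that every byte keeps the same label, which may not hold across object boundaries. `expand_store` is a hypothetical helper, not part of taintgrind.

```python
def expand_store(dst, src, size):
    """Expand one 'dst <- src' store into one flow per byte written.

    dst  : destination name as printed in the log, e.g. '1ffefffb50_unknownobj'
    src  : source variable name, e.g. 't0_10665'
    size : number of bytes written (assumption: supplied by the caller)
    """
    # Split '<hex address>_<label>' at the first underscore.
    addr_hex, _, label = dst.partition("_")
    base = int(addr_hex, 16)
    # Assumption: every byte of the store carries the same label.
    return [f"{base + i:x}_{label} <- {src}" for i in range(size)]


# The vmovdqu store quoted above writes an xmmword, i.e. 16 bytes.
flows = expand_store("1ffefffb50_unknownobj", "t0_10665", 16)
for flow in flows[:3]:
    print(flow)
```

With the size in the log, a script like this would link the later Load from 1ffefffb51_unknownobj back to t0_10665 without any change to the per-instruction output.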

wmkhoo commented 5 years ago

Could you craft a simple test case (.c) that exhibits the above behaviour? Thanks

wmkhoo commented 5 years ago

The above patch will print the byte location when reading from a file. When the taint graph is generated, all tainted bytes will appear in a function called 'taint_byte'. From there, you can prune/slice the graph starting from the instruction you're interested in.
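
The pruning step can also be sketched directly on the text log, as a rough alternative to slicing the generated graph. The Python sketch below assumes that each log line ends with a `dst <- src` field after the last `|`, as in the lines quoted earlier in this thread. It also assumes that multiple sources, if present, are separated by commas or spaces. `parse_flows` and `backward_slice` are hypothetical names, not part of taintgrind, and the slice ignores line order, so a reused variable name would be over-approximated.

```python
import re
from collections import defaultdict

# Assumed shape of the last '|'-separated field: '<dst> <- <src>[, <src>...]'
FLOW = re.compile(r"(\S+)\s*<-\s*(.*)$")


def parse_flows(lines):
    """Map each destination variable to the set of variables it derives from."""
    preds = defaultdict(set)
    for line in lines:
        last_field = line.split("|")[-1].strip()
        match = FLOW.match(last_field)
        if not match:
            continue  # not a flow line
        dst, srcs = match.group(1), match.group(2)
        preds[dst].update(s for s in re.split(r"[,\s]+", srcs) if s)
    return preds


def backward_slice(preds, target):
    """Return every variable that can flow into target, including target."""
    seen, stack = set(), [target]
    while stack:
        var = stack.pop()
        if var in seen:
            continue
        seen.add(var)
        stack.extend(preds.get(var, ()))
    return seen


# Lines taken from this thread, including one of the proposed per-byte lines.
LOG = [
    "0x4DBAB48: __memcpy_avx_unaligned_erms (memmove-vec-unaligned-erms.S:293) | vmovdqu xmmword ptr [rdi], xmm0 | Store | na | 1ffefffb50_unknownobj <- t0_10665",
    "0x4DBAB48: __memcpy_avx_unaligned_erms (memmove-vec-unaligned-erms.S:293) | 1ffefffb51_unknownobj <- t0_10665",
    "0x10893C: test_avBranch (in /work/taint/fuzzer-test-crashes/avBranch.test) | movzx eax, byte ptr [rax + 1] | Load | 0x56 | t12_25279 <- 1ffefffb51_unknownobj",
]

slice_vars = backward_slice(parse_flows(LOG), "t12_25279")
print(sorted(slice_vars))
```

Without the per-byte line, the slice from t12_25279 stops at 1ffefffb51_unknownobj and never reaches t0_10665, which is the gap described above.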