Closed RabbitDong-on closed 6 months ago
We use FlowDroid to track only local primitives for two reasons: 1) we don't want to do any taint analysis online. Taint analysis is expensive, and we can't afford it in a production environment. 2) Static data-flow analysis is fairly accurate for local primitives.
In your example, yes, FlowDroid/Soot can perform really badly if the program contains complex data dependencies between heap objects and local primitives. ExChain does not track anything dynamically. It only adds exception information to heap objects.
Thanks. Why does ExChain not track anything dynamically? ExChain first infers the affected state variable and the responsible state variable. For heap objects, ExChain uses Phosphor to propagate taint from the affected state variable to the responsible state variable. This procedure is not online. So you add exception information to heap objects and record it, then analyze that information to get the dependency, right? For example:
P1:
    try { responsible variable } catch (Exception e1) { write heap_obj1 }
P2:
    heap_obj2 = func(heap_obj1)
    heap_obj3 = func(heap_obj2)
P3:
    try { if (check(heap_obj3)) { throw e2; } } catch (Exception e2) { handleE2(); }
For e1, the affected state variable is heap_obj1. For e2, the responsible state variable is heap_obj3. The taint path is heap_obj1 -> heap_obj2 -> heap_obj3. ExChain does not analyze this taint path online, right? It just adds exception information to heap_obj1, heap_obj2, and heap_obj3 for offline dependency analysis, right? If so, which heap objects should ExChain add exception information to?
Because the overhead introduced by taint analysis is too high.
In your example, if func copies heap_obj1 to heap_obj2, ExChain cannot track this propagation, since doing so requires tracking propagation among heap objects.
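To make this limitation concrete, here is a minimal sketch. It is not ExChain's implementation: the class HeapTagDemo, the identity-keyed side table TAGS, and the helper funcWithPropagation are all hypothetical names invented for illustration. The sketch assumes exception information is attached to a heap object by identity. A copy made by func produces a new object with no entry in the table, so the label is lost unless something propagates it explicitly, which is the work a full dynamic taint tracker would do.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the ExChain code base.
final class HeapTagDemo {
    // Side table mapping a heap object (by identity) to the exception that affected it.
    static final Map<Object, String> TAGS = new IdentityHashMap<>();

    static void tag(Object obj, String exceptionLabel) {
        TAGS.put(obj, exceptionLabel);
    }

    static String tagOf(Object obj) {
        return TAGS.get(obj); // null when the object carries no exception label
    }

    // Stand-in for func in the example: copies the state into a new heap object.
    static StringBuilder func(StringBuilder src) {
        return new StringBuilder(src);
    }

    // The same copy, plus explicit heap-to-heap propagation of the label.
    static StringBuilder funcWithPropagation(StringBuilder src) {
        StringBuilder dst = func(src);
        String label = tagOf(src);
        if (label != null) {
            tag(dst, label);
        }
        return dst;
    }

    public static void main(String[] args) {
        StringBuilder heapObj1 = new StringBuilder("state");
        tag(heapObj1, "e1"); // written inside the catch block of e1

        StringBuilder lost = func(heapObj1);
        StringBuilder kept = funcWithPropagation(heapObj1);

        System.out.println(tagOf(lost)); // prints "null": the copy has no label
        System.out.println(tagOf(kept)); // prints "e1": the label was propagated
    }
}
```

The propagation step has to run on every heap-to-heap copy, which is why it amounts to dynamic taint tracking rather than a one-time annotation.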
Thanks. I am curious how exception information is added to heap objects for heap-to-heap analysis. I did not find it in the paper. If I missed something, I would be thankful if you could point it out.
Yes, the current implementation supports that! It uses Phosphor to perform full dynamic taint analysis (which will introduce 10X overhead).
You can do it by running:
python3 runner.py wicket_6908 build
python3 runner.py wicket_6908 instrument
python3 runner.py wicket_6908 run --type dynamic
The main difference is how to update and propagate taint information: https://github.com/aoli-al/exchain/blob/main/runtime/src/main/kotlin/al/aoli/exchain/runtime/analyzers/AffectedVarDriver.kt#L167 and https://github.com/aoli-al/exchain/blob/main/native/src/cpp/affected_result_processor.cc#L147
Thanks! It helps me a lot. I will close this issue.
public class Flow {
    public static int data1 = -1;

    public static int flow1(int count) {
        data1++;
        if (count < 1) {
            count++;
        }
        return count;
    }

    public static int flow2(int count) {
        // Reads the shared static field written by flow1 and entryMethod.
        count = count + data1;
        if (count < 2) {
            count++;
        }
        return count;
    }

    public static void entryMethod() {
        int flowdata1 = 0;
        data1 = flow1(flowdata1);
        flow2(flowdata1);
    }
}
Hi aoli, I see that your work uses FlowDroid for local primitive taint analysis. The reason is that FlowDroid cannot find the flow path between flow1 and flow2, right? FlowDroid can propagate taint through assignments, call parameters, and call return values, but not through the shared memory data1, right? So you use Phosphor to trace the taint path between flow1 and flow2 via data1. Is my understanding correct? If I have misunderstood your work, please correct me. Thanks.
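The flow through data1 in the Flow example above can be made visible by instrumenting it by hand. The sketch below is only an illustration of the idea and uses neither Phosphor's nor FlowDroid's API: the class FlowShadowDemo and the helper type TInt (an int paired with a taint bit) are hypothetical. It tracks explicit data flow only and ignores control-flow dependencies. One deliberate deviation from the original: flow2 receives an untainted argument, so any taint on its result can only have arrived through the shared field data1.

```java
// Hypothetical hand-instrumented version of the Flow example; not a real taint-tracking API.
final class FlowShadowDemo {
    // An int value paired with a manually maintained taint bit.
    static final class TInt {
        final int value;
        final boolean tainted;

        TInt(int value, boolean tainted) {
            this.value = value;
            this.tainted = tainted;
        }
    }

    // Shared static state, as in Flow.data1, now carrying its own taint bit.
    static TInt data1 = new TInt(-1, false);

    static TInt flow1(TInt count) {
        data1 = new TInt(data1.value + 1, data1.tainted);
        int c = count.value;
        if (c < 1) {
            c++;
        }
        return new TInt(c, count.tainted); // the return value inherits the parameter's taint
    }

    static TInt flow2(TInt count) {
        int c = count.value + data1.value;
        boolean t = count.tainted || data1.tainted; // taint joins at the read of data1
        if (c < 2) {
            c++;
        }
        return new TInt(c, t);
    }

    static TInt entryMethod(TInt source) {
        data1 = new TInt(-1, false);       // reset shared state for repeatable runs
        data1 = flow1(source);             // taint enters the shared field here
        return flow2(new TInt(0, false));  // untainted argument isolates the data1 path
    }

    public static void main(String[] args) {
        TInt result = entryMethod(new TInt(0, true));
        System.out.println(result.value + " tainted=" + result.tainted); // prints "2 tainted=true"
    }
}
```

With an untainted source, the same call returns the value 2 with the taint bit cleared, which confirms that the only route for taint into the result of flow2 is the field data1.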