secure-software-engineering / FlowDroid

FlowDroid Static Data Flow Tracker
GNU Lesser General Public License v2.1

Testing FlowDroid #147

Sebastiaan-Alvarez-Rodriguez closed this issue 5 years ago

Sebastiaan-Alvarez-Rodriguez commented 5 years ago

Hello!

A little background

I am writing a framework to run static Android security analysis tools, and I would like to include FlowDroid in it to show what it can do!

Issue

I have a small APK that I use to check whether my framework works. I found that FlowDroid does not work with my test APK:

[main] ERROR soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - No sources found, aborting analysis

My APK is malware; it has sources and sinks, as confirmed by multiple analysers. For example, I have used IccTa (with IC3), which also uses Soot. There, I got a welcoming

[main] INFO soot.jimple.infoflow.data.pathBuilders.ContextSensitivePathBuilder - Obtainted 3 connections between sources and sinks

I thought it might be because of the SourcesSinks definition text file (maybe IccTa uses more fitting source/sink rules for this APK). However, after launching FlowDroid with IccTa's SourcesSinks file, the error persisted.

Question

Do you know, by any chance, why FlowDroid's Soot cannot find sources while IccTa's Soot does find them, with the same source and sink definition file?

With kind regards, Sebastiaan

StevenArzt commented 5 years ago

Since ICCTA relies on FlowDroid for the data flow tracking (and has become a part of FlowDroid about a year ago), there should not be a difference in the number of sources found by FlowDroid and ICCTA. If you use the most recent version of the tool, ICCTA is simply a command-line option of FlowDroid. If you tell FlowDroid to use an ICC model, it will internally apply the ICCTA instrumenter to connect the respective Android components in the callgraph. My best guess is that, since you seem to be referring to the "old" ICCTA standalone tool, there is simply a different version of FlowDroid involved. Of course, it would be bad if the current version no longer detected sources that were found by an older version.

Can you please provide the APK and the sources/sinks file that works with ICCTA, but not with FlowDroid?

Sebastiaan-Alvarez-Rodriguez commented 5 years ago

Hello! Thanks for your quick reply. I knew IccTa relies on FlowDroid for the dataflow tracking, but did not have any idea IccTa became a part of FlowDroid.

I did not tell FlowDroid to use an ICC model. I use FlowDroid like:

java -jar soot-infoflow-cmd/target/soot-infoflow-cmd-jar-with-dependencies.jar \
    -a <APK File> \
    -p <Android JAR folder> \
    -s <SourcesSinks file> \
    -o <Output file>

I found an option to add an ICC model, -im,--iccmodel, but I am afraid I do not know how to obtain such a model. How can I get an ICC model to give to FlowDroid?

I am a bit hesitant to share the APK, as it is part of the AndroZoo dataset, which I have access to and whose terms state that I should not redistribute it without their consent.

The sources/sinks file that works for IccTa is easy: it is IccTa's default file. I tested the latest FlowDroid release with its default sources/sinks file (did not find anything) and with IccTa's default sources/sinks file (did not find anything).

It seems I am not using your tool correctly, as I need to specify an ICC model (but somehow I do not have to with IccTa). How do I generate an ICC model that I can give to FlowDroid?

By the way: Thanks for your time!

StevenArzt commented 5 years ago

Both FlowDroid and the old ICCTA command-line tool rely on an external model for the ICC links. In the original work, we used IC3 to generate such a model. Harvester is also an option, but it's not open source. In the end, it's just a mapping from the statement that triggers the ICC communication to the target component. For the test cases included with FlowDroid, there are also models included under "soot-infoflow-android/iccta_testdata_ic3_results". If you have your own apps, you need to create your own models.
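To make the "mapping from statement to target component" idea concrete, here is a minimal Python sketch. It is only an illustration of the concept: the class, field names, and example values are hypothetical, and it does not reproduce the file format that IC3 or FlowDroid actually use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IccLink:
    """One ICC link (hypothetical structure, not the real IC3 format)."""
    source_method: str     # method that contains the ICC-triggering statement
    statement: str         # the statement itself, e.g. a startActivity call
    target_component: str  # component that receives the intent


def targets_by_statement(links):
    """Group target components under the statement that triggers them."""
    model = {}
    for link in links:
        key = (link.source_method, link.statement)
        model.setdefault(key, set()).add(link.target_component)
    return model


# Made-up example links for an imaginary app.
links = [
    IccLink("com.example.MainActivity.onClick", "startActivity(intent)",
            "com.example.DetailActivity"),
    IccLink("com.example.MainActivity.onClick", "startActivity(intent)",
            "com.example.HelpActivity"),
    IccLink("com.example.SyncReceiver.onReceive", "startService(intent)",
            "com.example.UploadService"),
]
model = targets_by_statement(links)
```

With such a mapping, an analysis can add callgraph edges from each ICC statement to the lifecycle methods of its possible target components.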

In general, I'm slightly confused here, because detecting sources should not require ICC links. So even if you are running without any ICC support, FlowDroid should still find the sources.

Sebastiaan-Alvarez-Rodriguez commented 5 years ago

I applied IC3 for IccTa. I will try to use it to generate a model. I am also slightly confused. Here is the full execution log:

[main] INFO soot.jimple.infoflow.cmd.MainClass - Analyzing app /home/s1810979/testset/androzoo/apk/adsvr.soporteweb.es.apk (1 of 1)...
[main] INFO soot.jimple.infoflow.android.SetupApplication - Initializing Soot...
[main] INFO soot.jimple.infoflow.android.SetupApplication - Loading dex files...
[main] INFO soot.jimple.infoflow.android.SetupApplication - ARSC file parsing took 0.020501316 seconds
[main] INFO soot.jimple.infoflow.android.entryPointCreators.AndroidEntryPointCreator - Creating Android entry point for 1 components...
[main] INFO soot.jimple.infoflow.android.SetupApplication - Constructing the callgraph...
[main] INFO soot.jimple.infoflow.android.callbacks.DefaultCallbackAnalyzer - Collecting callbacks in DEFAULT mode...
[main] INFO soot.jimple.infoflow.android.callbacks.DefaultCallbackAnalyzer - Callback analysis done.
[main] WARN soot.jimple.infoflow.android.resources.LayoutFileParser - Could not find layout class rotate
[main] INFO soot.jimple.infoflow.android.entryPointCreators.AndroidEntryPointCreator - Creating Android entry point for 1 components...
[main] INFO soot.jimple.infoflow.android.SetupApplication - Constructing the callgraph...
[main] INFO soot.jimple.infoflow.android.callbacks.DefaultCallbackAnalyzer - Running incremental callback analysis for 1 components...
[main] INFO soot.jimple.infoflow.android.callbacks.DefaultCallbackAnalyzer - Incremental callback analysis done.
[main] INFO soot.jimple.infoflow.android.entryPointCreators.AndroidEntryPointCreator - Creating Android entry point for 1 components...
[main] INFO soot.jimple.infoflow.android.SetupApplication - Constructing the callgraph...
[main] INFO soot.jimple.infoflow.android.callbacks.DefaultCallbackAnalyzer - Running incremental callback analysis for 0 components...
[main] INFO soot.jimple.infoflow.android.callbacks.DefaultCallbackAnalyzer - Incremental callback analysis done.
[main] INFO soot.jimple.infoflow.memory.MemoryWarningSystem - Shutting down the memory warning system...
[main] INFO soot.jimple.infoflow.android.SetupApplication - Callback analysis terminated normally
[main] INFO soot.jimple.infoflow.android.SetupApplication - Entry point calculation done.
[main] INFO soot.jimple.infoflow.android.source.AccessPathBasedSourceSinkManager - Created a SourceSinkManager with 39 sources, 126 sinks, and 13 callback methods.
[main] INFO soot.jimple.infoflow.android.SetupApplication - Collecting callbacks and building a callgraph took 2 seconds
[main] INFO soot.jimple.infoflow.android.SetupApplication - Running data flow analysis on /home/s1810979/testset/androzoo/apk/adsvr.soporteweb.es.apk with 39 sources and 126 sinks...
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Implicit flow tracking is NOT enabled
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Exceptional flow tracking is enabled
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Running with a maximum access path length of 5
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Using path-agnostic result collection
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Recursive access path shortening is enabled
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Taint analysis enabled: true
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Using alias algorithm FlowSensitive
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Callgraph construction took 0 seconds
[main] INFO soot.jimple.infoflow.codeOptimization.InterproceduralConstantValuePropagator - Removing side-effect free methods is disabled
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Dead code elimination took 0.095517504 seconds
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Callgraph has 191 edges
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Starting Taint Analysis
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Using context- and flow-sensitive solver
[main] WARN soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Running with limited join point abstractions can break context-sensitive path builders
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Looking for sources and sinks...
[main] ERROR soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - No sources found, aborting analysis
[main] INFO soot.jimple.infoflow.memory.MemoryWarningSystem - Shutting down the memory warning system...
[main] WARN soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - No results found.
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Data flow solver took 0 seconds. Maximum memory consumption: 236 MB
[main] INFO soot.jimple.infoflow.android.SetupApplication - Found 0 leaks

It's quite sad, because

[main] INFO soot.jimple.infoflow.android.source.AccessPathBasedSourceSinkManager - Created a SourceSinkManager with 39 sources, 126 sinks, and 13 callback methods.

seemed so promising. Is my maximum access path length of 5 too small maybe?
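Going by the wording of the log messages, the "Created a SourceSinkManager with 39 sources" line appears to count the definitions loaded from the sources/sinks file, while "No sources found" refers to statements in the app that match those definitions. A small Python sketch for pulling both numbers out of a log, assuming only the message formats shown in the logs of this thread (the function and key names are my own):

```python
import re

# Patterns copied from the log messages in this thread.
DEFINED = re.compile(r"Created a SourceSinkManager with (\d+) sources, (\d+) sinks")
FOUND = re.compile(r"Source lookup done, found (\d+) sources and (\d+) sinks")
LEAKS = re.compile(r"Found (\d+) leaks")
ABORT = "No sources found, aborting analysis"


def summarize_log(text):
    """Extract configured vs. found source/sink counts from a FlowDroid log."""
    summary = {"defined": None, "found": None, "aborted": False, "leaks": None}
    for line in text.splitlines():
        defined = DEFINED.search(line)
        found = FOUND.search(line)
        leaks = LEAKS.search(line)
        if defined:
            summary["defined"] = (int(defined.group(1)), int(defined.group(2)))
        elif found:
            summary["found"] = (int(found.group(1)), int(found.group(2)))
        elif ABORT in line:
            summary["aborted"] = True
        elif leaks:
            summary["leaks"] = int(leaks.group(1))
    return summary


sample = """\
[main] INFO soot.jimple.infoflow.android.source.AccessPathBasedSourceSinkManager - Created a SourceSinkManager with 39 sources, 126 sinks, and 13 callback methods.
[main] ERROR soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - No sources found, aborting analysis
[main] INFO soot.jimple.infoflow.android.SetupApplication - Found 0 leaks
"""
summary = summarize_log(sample)
print(summary)
```

A run where "defined" is set but "found" stays empty and "aborted" is true matches the failing case above.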

Sebastiaan-Alvarez-Rodriguez commented 5 years ago

Hello again, and sorry for the late reply. I just realized I had not given you the information about the APK. It is in the AndroZoo malware dataset (which you might also have access to, considering this framework you are/were building). Here are the details:

sha256: 20BD4735D2E3F1FBDFAE196FECB00A80E7258C7A84785ED92FEC2C019B0AF76F
sha1: C0F7A50701E06D94BCF2309221E2AD2A4B938147
md5: F77D96EA77AD481630A3C3C2717BF83D
dex_date: 2018-06-06 23:37:14
apk_size: 182484
pkg_name: "adsvr.soporteweb.es"
vercode: 2
vt_detection: 15
vt_scan_date: 2018-11-19 06:38:00
dex_size: 21704
markets: play.google.com

With this info you can identify uniquely which exact apk I used for an initial test. Perhaps you can look at the apk if you plan to improve FlowDroid. It should help you, I think.

For now, I will test the latest and greatest FlowDroid 2.7.1 cmd-jar-with-dependencies found here on a bunch of other APKs.

If I can get FlowDroid to work on other apks, I can continue building my framework! (of course I'll come back here to close the issue)

If I can't get FlowDroid to work on other APKs either, I am probably doing something really wrong in specifying command-line arguments to FlowDroid (or I am using the wrong tool and should use summaries-jar-with-dependencies).

Sebastiaan-Alvarez-Rodriguez commented 5 years ago

Back again! I have great news: your tool works now, sort of. It works on 7 of my 10 test APKs. Below is a log of a successful run:

[main] INFO soot.jimple.infoflow.cmd.MainClass - Analyzing app /home/radon/Uni/brp/testset/safe/test.apk (1 of 1)...
[main] INFO soot.jimple.infoflow.android.SetupApplication - Initializing Soot...
[main] INFO soot.jimple.infoflow.android.SetupApplication - Loading dex files...
[main] INFO soot.jimple.infoflow.android.SetupApplication - ARSC file parsing took 0.356256556 seconds
[main] INFO soot.jimple.infoflow.android.entryPointCreators.AndroidEntryPointCreator - Creating Android entry point for 51 components...
[main] INFO soot.jimple.infoflow.android.SetupApplication - Constructing the callgraph...
[main] INFO soot.jimple.infoflow.android.callbacks.DefaultCallbackAnalyzer - Collecting callbacks in DEFAULT mode...
[main] INFO soot.jimple.infoflow.android.callbacks.DefaultCallbackAnalyzer - Callback analysis done.
[main] WARN soot.jimple.infoflow.android.resources.LayoutFileParser - Could not find layout class DateTimeView
[main] INFO soot.jimple.infoflow.android.entryPointCreators.AndroidEntryPointCreator - Creating Android entry point for 56 components...
...
[main] WARN soot.jimple.infoflow.android.entryPointCreators.components.ActivityEntryPointCreator - Cannot generate constructor for phantom class android.app.Fragment
[main] INFO soot.jimple.infoflow.android.SetupApplication - Constructing the callgraph...
[main] INFO soot.jimple.infoflow.android.callbacks.DefaultCallbackAnalyzer - Running incremental callback analysis for 0 components...
[main] INFO soot.jimple.infoflow.android.callbacks.DefaultCallbackAnalyzer - Incremental callback analysis done.
[main] INFO soot.jimple.infoflow.memory.MemoryWarningSystem - Shutting down the memory warning system...
[main] INFO soot.jimple.infoflow.android.SetupApplication - Callback analysis terminated normally
[main] INFO soot.jimple.infoflow.android.SetupApplication - Entry point calculation done.
[main] INFO soot.jimple.infoflow.android.source.AccessPathBasedSourceSinkManager - Created a SourceSinkManager with 51 sources, 138 sinks, and 516 callback methods.
[main] INFO soot.jimple.infoflow.android.SetupApplication - Collecting callbacks and building a callgraph took 91 seconds
[main] INFO soot.jimple.infoflow.android.SetupApplication - Running data flow analysis on /home/radon/Uni/brp/testset/safe/test.apk with 51 sources and 138 sinks...
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Implicit flow tracking is enabled
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Exceptional flow tracking is enabled
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Running with a maximum access path length of 500
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Using path-agnostic result collection
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Recursive access path shortening is enabled
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Taint analysis enabled: true
[main] INFO soot.jimple.infoflow.InfoflowConfiguration - Using alias algorithm FlowSensitive
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Callgraph construction took 0 seconds
[main] INFO soot.jimple.infoflow.codeOptimization.InterproceduralConstantValuePropagator - Removing side-effect free methods is disabled
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Dead code elimination took 5.869811027 seconds
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Callgraph has 126786 edges
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Starting Taint Analysis
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Using context- and flow-sensitive solver
[main] WARN soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Running with limited join point abstractions can break context-sensitive path builders
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Looking for sources and sinks...
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Source lookup done, found 163 sources and 628 sinks.
[Service Thread] INFO soot.jimple.infoflow.memory.MemoryWarningSystem - Triggering memory warning at 2712 MB (2712 MB in tenured gen)...
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - IFDS problem with 1682656 forward and 6192048 backward edges solved, processing 182 results...
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Current memory consumption: 2723 MB
[Service Thread] WARN soot.jimple.infoflow.memory.FlowDroidMemoryWatcher - Running out of memory, solvers terminated
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Memory consumption after cleanup: 810 MB
...
[main] INFO soot.jimple.infoflow.data.pathBuilders.BatchPathBuilder - Running path reconstruction batch 37 with 2 elements
[main] INFO soot.jimple.infoflow.data.pathBuilders.ContextSensitivePathBuilder - Obtainted 2 connections between sources and sinks
[main] INFO soot.jimple.infoflow.data.pathBuilders.ContextSensitivePathBuilder - Building path 1...
[main] INFO soot.jimple.infoflow.data.pathBuilders.ContextSensitivePathBuilder - Building path 2...
[main] INFO soot.jimple.infoflow.memory.MemoryWarningSystem - Shutting down the memory warning system...
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Memory consumption after path building: 810 MB
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Path reconstruction took 1 seconds

[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - The sink virtualinvoke $r0.<net.hockeyapp.android.FeedbackActivity: void startActivityForResult(android.content.Intent,int)>($r1, 2) in method <net.hockeyapp.android.FeedbackActivity: boolean addAttachment(int)> was called with values from the following sources:
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - - $i0 := @parameter0: int in method <net.hockeyapp.android.FeedbackActivity: boolean onKeyDown(int,android.view.KeyEvent)>
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - - $r1 := @parameter0: android.app.Activity in method <org.telegram.ui.Components.ForegroundDetector: void onActivityStarted(android.app.Activity)>
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - - $r1 := @parameter0: android.app.Activity in method <org.telegram.ui.Components.ForegroundDetector: void onActivityStopped(android.app.Activity)>
...
[main] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Data flow solver took 395 seconds. Maximum memory consumption: 799 MB
[main] INFO soot.jimple.infoflow.android.SetupApplication - Found 77 leaks

How did I get it to work? By toying with the settings! It worked after I experimented with all of them. The full command I have now is

java -Xmx<ramamount>g -jar soot-infoflow-cmd-jar-with-dependencies.jar \
-a <apkpath> \
-aa FLOWSENSITIVE \
-al 500 \
-cg AUTO \
-cs ALL \
-ds CONTEXTFLOWSENSITIVE \
-i ALL \
-mc 500 \
-md 500 \
-t <IccTa EasyTaintWrapperSource.txt> \
-tw MULTI \
-sf CONTEXTFLOWSENSITIVE \
-r \
-pa CONTEXTSENSITIVE \
-s SourcesSinks_FlowDroid.txt \
-p <path to Android-SDK/platforms/> \
-o <resultpath>

When using FlowDroid's SourcesSinks.txt file, I managed to get 5/10 working. With IccTa's SourcesSinks.txt appended to FlowDroid's, I got 7/10.

It would seem that -cs ALL and -i ALL together, which produced the output above, are very important for getting any results at all.
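To compare runs with and without these two options, a small Python helper can assemble the command line. This is a sketch that uses only flags quoted in this thread (-a, -p, -s, -o, -cs, -i); the function name, parameter names, and the example file names are hypothetical.

```python
import shlex


def build_flowdroid_command(apk, platforms, sources_sinks, output,
                            jar="soot-infoflow-cmd-jar-with-dependencies.jar",
                            callback_source_mode=None, implicit_mode=None):
    """Return the FlowDroid command as an argument list.

    Only flags that appear in this thread are used. Optional settings are
    added only when a value is given, so the baseline stays minimal.
    """
    cmd = ["java", "-jar", jar,
           "-a", apk, "-p", platforms, "-s", sources_sinks, "-o", output]
    if callback_source_mode is not None:
        cmd += ["-cs", callback_source_mode]
    if implicit_mode is not None:
        cmd += ["-i", implicit_mode]
    return cmd


# Placeholder paths; replace with real ones.
baseline = build_flowdroid_command("app.apk", "platforms", "SourcesSinks.txt", "out.xml")
tuned = build_flowdroid_command("app.apk", "platforms", "SourcesSinks.txt", "out.xml",
                                callback_source_mode="ALL", implicit_mode="ALL")
print(shlex.join(baseline))
print(shlex.join(tuned))
```

The returned list can be passed directly to subprocess.run, which makes it easy to run the same APK under both configurations and compare the results.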

If you, @StevenArzt, have any other ideas about how to get the percentage of working APK analyses up, please let me know. Also, an update of your readme may be in order (the Running The Command-Line Tool part). Using the default options specified there with the default FlowDroid SourcesSinks file, 0/10 APKs got analysed. No sources could be found with those settings for any APK I fed FlowDroid.

Also, small question:

[main] WARN soot.jimple.infoflow.android.SetupApplication$InPlaceInfoflow - Running with limited join point abstractions can break context-sensitive path builders

I don't like warnings, as they usually mean things will not work as intended. This warning says I have specified limited join point abstractions (which I have not specified, as far as I know). Do you know how I can tell FlowDroid to run with unlimited join point abstractions? Maybe it will help analyze more APKs correctly.

Happy Easter and thanks for the help and replies so far!

StevenArzt commented 5 years ago

I will have to see whether I can get access to Androzoo. As far as I know, we don't have that collection at the institute right now.

The "optimal" settings greatly depend on the apps you need to analyze. The -i option, for example, enables implicit flows. There is a pretty good paper on implicit flows. It's called "Implicit Flows: Can't Live with 'Em, Can't Live without 'Em", and that's for a reason. When analyzing goodware, implicit flow tracking will usually give you a lot of flows that are technically correct, but not of any interest to the analyst. Every login procedure, for example, leaks one bit about the password, i.e., whether the entered one was correct or not. This is an implicit leak, albeit not an interesting one. For larger apps, you will end up with hundreds of these leaks, which also drives up your time and memory consumption. Therefore, implicit flows are disabled by default.
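The login example can be illustrated with a few lines of Python. This is a generic sketch of the difference between an explicit and an implicit flow, not FlowDroid code; the function names are made up.

```python
def explicit_leak(secret):
    # Explicit flow: the secret's value is copied into the output.
    return "password=" + secret


def check_login(entered, secret):
    # Implicit flow: no data is copied from `secret` into `result`.
    # Only the branch taken depends on the secret, so the returned
    # message reveals one bit: whether the entered password was correct.
    if entered == secret:
        result = "welcome"
    else:
        result = "denied"
    return result


print(explicit_leak("hunter2"))
print(check_login("guess", "hunter2"))
print(check_login("hunter2", "hunter2"))
```

An analysis that tracks only explicit flows reports the first function, while one that also tracks implicit flows reports both, which is why large apps yield many technically correct but uninteresting results.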

The callback source mode handles how FlowDroid treats data that is passed into the app through callback parameters. By default, FlowDroid only takes explicitly-denoted parameters. With your settings, FlowDroid takes all parameters in all callback methods, which is a great over-approximation. This may lead to more leaks, but also more false positives. For example, an app may pass data from an irrelevant parameter such as a flag to a sink, and this will be reported as well. Additionally, tracking more taints also increases time and memory consumption.
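As a toy model of that difference, the following Python sketch treats callback parameters as sources either only when they are explicitly listed or always. It is a simplification for illustration: the function, the data layout, and the example callbacks are assumptions, and none of it is FlowDroid's actual implementation.

```python
def tainted_parameters(callbacks, denoted, take_all):
    """Return the callback parameters that are treated as sources.

    callbacks: dict mapping a callback method name to its parameter names
    denoted:   set of (method, parameter) pairs listed explicitly as sources
    take_all:  if True, every parameter of every callback is a source
    """
    tainted = set()
    for method, params in callbacks.items():
        for param in params:
            if take_all or (method, param) in denoted:
                tainted.add((method, param))
    return tainted


# Made-up callbacks of an imaginary app.
callbacks = {
    "onLocationChanged": ["location"],
    "onKeyDown": ["keyCode", "event"],
}
denoted = {("onLocationChanged", "location")}

explicit_only = tainted_parameters(callbacks, denoted, take_all=False)
everything = tainted_parameters(callbacks, denoted, take_all=True)
print(sorted(explicit_only))
print(sorted(everything))
```

In the second case a harmless value such as keyCode is tracked as well, so any path from it to a sink is reported, which is the kind of over-approximation described above.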

I guess you are dealing with a special situation, because you are analyzing malware.

Sebastiaan-Alvarez-Rodriguez commented 5 years ago

Hi! Androzoo folks are quite friendly I think, and lend access to researchers!

I should have known better than to just play around with settings. Analysis, especially with FlowDroid, takes a lot of parameter fine-tuning and configuration. I hope I have the configuration right now. As for the settings: thanks for the warning about my implicit flow and callback source mode settings. I will continue to try out other combinations to see which runs best. Perhaps it is a good idea to include a few benign APKs too, so I won't overfit like I did just now.

Yes, I suppose analyzing malware is a somewhat special situation. On the other hand, tools supporting flow-sensitive taint analysis for Android show promise for detecting malware. IccTa with IC3, for example, performs very well. I believe FlowDroid has this ability as well (as it is a more up-to-date version of IccTa, right?). Yet I still need to find an acceptable configuration for it.

Do you have ideas to let FlowDroid perform like IccTa with IC3? Or should I just

Eshita66 commented 6 months ago

Hi, I am just a beginner with FlowDroid. I want to find security settings in an APK. Is it possible to find the security settings flow using FlowDroid? If so, how?