secure-software-engineering / FlowDroid

FlowDroid Static Data Flow Tracker
GNU Lesser General Public License v2.1

APK Instrumentation: Issues with app startup when excluding androidx.* #693

Open beerphilipp opened 5 months ago

beerphilipp commented 5 months ago

I am attempting to instrument an APK using the code snippet below. The analysis and instrumentation process is successful, and the APK runs on the device (API Level 34) without issues. However, when I add androidx.* to the excludeList, the instrumented app fails to start.

What might be the reason for this issue, and do you have any insights or tips on troubleshooting or resolving this problem?

Thank you in advance!

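// Imports needed by this snippet
import java.util.LinkedList;
import java.util.List;
import soot.PackManager;
import soot.jimple.infoflow.android.InfoflowAndroidConfiguration;
import soot.jimple.infoflow.android.SetupApplication;
import soot.options.Options;

InfoflowAndroidConfiguration configuration = new InfoflowAndroidConfiguration();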
configuration.getAnalysisFileConfig().setTargetAPKFile("apk_path");
configuration.getAnalysisFileConfig().setAndroidPlatformDir("android_jar");
configuration.setCallgraphAlgorithm(InfoflowAndroidConfiguration.CallgraphAlgorithm.SPARK);

SetupApplication app = new SetupApplication(configuration);
app.setSootConfig(((options, config) -> {
  options.set_output_dir("output_dir");
  options.set_output_format(Options.output_format_dex);

  List<String> excludeList = new LinkedList<String>();
  excludeList.add("java.*");
  excludeList.add("sun.misc.*");
  excludeList.add("android.*");
  excludeList.add("com.android.*");
  excludeList.add("dalvik.system.*");
  excludeList.add("org.apache.*");
  excludeList.add("soot.*");
  excludeList.add("javax.servlet.*");
  excludeList.add("dalvik.*");
  excludeList.add("kotlin.*");
  excludeList.add("kotlinx.*");
  options.set_exclude(excludeList);

  options.set_no_bodies_for_excluded(true);
  options.set_allow_phantom_refs(true);
}));

app.constructCallgraph();

// analyze and instrument the APK

app.removeSimulatedCodeElements();
PackManager.v().writeOutput();

timll commented 5 months ago

Hi,

This issue should probably be filed against Soot, as FlowDroid does not implement the instrumentation part. But I think I know what your problem is. If you look at the app's decompiled source code, you'll find that java.* etc. are not part of your app's code (they are located in the Android.jar on your device). On the other hand, androidx.*, android.support.*, kotlin.* and kotlinx.* are bundled in the APK itself. I guess that Soot honors the exclude list for the classes it writes to disk, so excluded classes are not written out. That's easy to check: did you decompile the instrumented APK and check whether androidx is still bundled in it?

timll commented 5 months ago

By the way, here are the relevant lines of code.

The PackManager only writes out the application classes: https://github.com/soot-oss/soot/blob/395b87897dbfe16768ba2432a8aae9b94cf1127d/src/main/java/soot/PackManager.java#L720-L722

But the Scene marks all excluded classes as library classes: https://github.com/soot-oss/soot/blob/395b87897dbfe16768ba2432a8aae9b94cf1127d/src/main/java/soot/Scene.java#L1983-L1985
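
The effect of these two pieces of code can be illustrated with a small model. This is only a sketch: the class `ExcludeModel` and its methods are hypothetical, and the prefix matching is a simplifying assumption about how patterns such as `androidx.*` are interpreted, not a copy of Soot's implementation. It shows why a bundled library disappears from the output once its namespace is on the exclude list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ExcludeModel {

    /** Simplified prefix match for patterns such as "androidx.*" (an assumption, not Soot's code). */
    static boolean isExcluded(String className, List<String> excludeList) {
        for (String pattern : excludeList) {
            if (className.equals(pattern)) {
                return true;
            }
            // "androidx.*" becomes the prefix "androidx."
            if (pattern.endsWith(".*")
                    && className.startsWith(pattern.substring(0, pattern.length() - 1))) {
                return true;
            }
        }
        return false;
    }

    /** Models "only application classes are written": excluded classes are dropped. */
    static List<String> writtenClasses(List<String> apkClasses, List<String> excludeList) {
        List<String> written = new ArrayList<>();
        for (String name : apkClasses) {
            if (!isExcluded(name, excludeList)) {
                written.add(name);
            }
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> apkClasses = Arrays.asList(
                "com.example.MainActivity",
                "androidx.appcompat.app.AppCompatActivity",
                "kotlin.Unit");
        List<String> excludes = Arrays.asList("java.*", "android.*", "androidx.*");
        // prints [com.example.MainActivity, kotlin.Unit]
        System.out.println(writtenClasses(apkClasses, excludes));
    }
}
```

In this model the androidx class is part of the input APK but missing from the output, while `android.*` does not match it because the derived prefix is `android.` (with the trailing dot).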

beerphilipp commented 5 months ago

Thank you for the very rapid response! It seems like that's the problem: androidx.* is not part of the instrumented app. So I guess if I want to exclude the classes from analysis but still want to write them to the APK, I could iterate over all androidx.* classes and set every class as an application class using Scene.v().getSootClass(className).setApplicationClass().

timll commented 5 months ago

You can use the Android.jar of Android Studio to find out which namespaces are available on the device. Every namespace that you exclude but that is not available in the Android.jar should remain part of the app (unless you know that it is not used).
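
One way to run this check is to list the packages inside the platform jar and compare them with the exclude list. The following is a minimal sketch using only the Java standard library. The class `PlatformNamespaces` and its method names are hypothetical, and the path to the jar is assumed to be passed as the first program argument, followed by the exclude patterns to check.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class PlatformNamespaces {

    /** Maps "android/app/Activity.class" to "android.app"; returns null for non-class entries. */
    static String packageOf(String entryName) {
        if (!entryName.endsWith(".class")) {
            return null;
        }
        int lastSlash = entryName.lastIndexOf('/');
        if (lastSlash < 0) {
            return null; // class in the default package
        }
        return entryName.substring(0, lastSlash).replace('/', '.');
    }

    /** Collects every package that contains at least one class in the given jar. */
    static Set<String> packagesIn(String jarPath) throws IOException {
        Set<String> packages = new TreeSet<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String pkg = packageOf(entries.nextElement().getName());
                if (pkg != null) {
                    packages.add(pkg);
                }
            }
        }
        return packages;
    }

    /** True if the jar has a package at or below the prefix of a pattern such as "androidx.*". */
    static boolean coveredByPlatform(Set<String> platformPackages, String excludePattern) {
        String prefix = excludePattern.endsWith(".*")
                ? excludePattern.substring(0, excludePattern.length() - 2)
                : excludePattern;
        for (String pkg : platformPackages) {
            if (pkg.equals(prefix) || pkg.startsWith(prefix + ".")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Set<String> platform = packagesIn(args[0]);
        for (int i = 1; i < args.length; i++) {
            String verdict = coveredByPlatform(platform, args[i])
                    ? "found in the platform jar"
                    : "not in the platform jar, likely bundled in the APK";
            System.out.println(args[i] + ": " + verdict);
        }
    }
}
```

A pattern reported as not found in the platform jar is a candidate for classes that have to be written back into the instrumented APK.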

beerphilipp commented 5 months ago

That sounds reasonable, thanks! I am, however, running into another problem. Before writing the APK, I execute the following lines to set the desired classes as application classes:

Scene.v().getClasses().forEach(sootClass -> {
    if (sootClass.getName().startsWith("androidx.")) {
        sootClass.setApplicationClass();
    }
});

The bodies of the classes are missing in the instrumented APK, since I called options.set_no_bodies_for_excluded(true) when setting up FlowDroid (see the first comment). Setting this option to false marks the classes as library classes and includes the method bodies, but it also results in FlowDroid including the classes in call graph construction. Is there a programmatic way to exclude them from the analysis?

Thanks and please let me know if I should move this discussion to the Soot repo.

timll commented 5 months ago

You should be able to generate the call graph with no_bodies_for_excluded = true, run FlowDroid, then set no_bodies_for_excluded = false and resolve the required bodies with SootResolver.v().reResolve(sootClass, SootClass.BODIES).
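
The second half of that sequence could look roughly like the sketch below. It uses only the calls named in this thread (`set_no_bodies_for_excluded`, `reResolve`, `setApplicationClass`). The class `BundledLibraryRestorer` and the method `restoreBundledClasses` are hypothetical names, the code requires Soot on the classpath, and it has not been verified against a real APK.

```java
import java.util.ArrayList;
import java.util.List;

import soot.Scene;
import soot.SootClass;
import soot.SootResolver;
import soot.options.Options;

class BundledLibraryRestorer {

    /**
     * Intended to run after the FlowDroid analysis has finished and before
     * the output is written. bundledPrefix is e.g. "androidx.".
     */
    static void restoreBundledClasses(String bundledPrefix) {
        // From here on, excluded classes may load their method bodies again.
        Options.v().set_no_bodies_for_excluded(false);

        // Iterate over a copy: resolving bodies may add classes to the Scene.
        List<SootClass> snapshot = new ArrayList<>(Scene.v().getClasses());
        for (SootClass sootClass : snapshot) {
            if (sootClass.getName().startsWith(bundledPrefix)) {
                // Raise the resolving level so that method bodies become available.
                SootResolver.v().reResolve(sootClass, SootClass.BODIES);
                // Only application classes are written by the PackManager.
                sootClass.setApplicationClass();
            }
        }
    }
}
```

With the snippet from the first comment, a call such as `BundledLibraryRestorer.restoreBundledClasses("androidx.")` would go between `app.removeSimulatedCodeElements()` and `PackManager.v().writeOutput()`.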

FYI, FlowDroid is actually independent of the call graph algorithm; Soot is responsible here too. Based on the manifest and the callbacks registered in code, FlowDroid generates dummy methods that represent the lifecycle behavior as Jimple code. Then, any off-the-shelf call graph algorithm can be used to generate a call graph for Android.