soot-oss / soot

Soot - A Java optimization framework
GNU Lesser General Public License v2.1

Is there any Interface for soot to analyse Android jar/dex file? #352

Open WangYeOtw opened 9 years ago

WangYeOtw commented 9 years ago

When we use Soot for static analysis of an Android app, we want its code coverage to be as close to 100% as possible. However, many apps contain additional jar/dex files besides the main dex file "classes.dex". So I wonder whether we need an interface to handle these files.

ericbodden commented 9 years ago

Sorry but what is your question?

StevenArzt commented 9 years ago

I guess your question is related to dynamically loaded code. If the APK file contains a secondary dex file which is loaded at runtime by the main dex file, we will overlook this second file.

While Soot does not support such constructs out-of-the-box, we have everything we need for at least a simplistic approach: Analyze the APK file with Soot as usual, then extract the secondary dex file(s) from the APK and run Soot on them as well. Soot can not only analyze APKs, but also single dex files.

The main problem is the interaction between the dex files. This necessarily happens via reflection, so we're back in the old game of dealing with reflection, which is a topic of its own.
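The "extract the secondary dex file(s) from the APK" step described above can be sketched with the JDK's zip API alone, since an APK is a zip archive. This is a minimal sketch, not part of Soot: the class `DexExtractor` and the method `extractDexFiles` are hypothetical names, and the sketch assumes secondary dex files can be recognized by a name ending in ".dex". Each extracted file could then be handed to Soot as a separate input.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; it is not part of Soot.
final class DexExtractor {

    /**
     * Copies every entry whose name ends with ".dex" from the APK into outDir
     * and returns the paths of the extracted files.
     */
    static List<Path> extractDexFiles(Path apk, Path outDir) throws IOException {
        List<Path> extracted = new ArrayList<>();
        Files.createDirectories(outDir);
        try (ZipFile zip = new ZipFile(apk.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory() || !entry.getName().endsWith(".dex")) {
                    continue;
                }
                // Flatten the entry path so that nested names cannot escape outDir.
                // Assumption: flattened names do not collide within one APK.
                String flatName = entry.getName().replace('/', '_').replace('\\', '_');
                Path target = outDir.resolve(flatName);
                try (InputStream in = zip.getInputStream(entry)) {
                    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                }
                extracted.add(target);
            }
        }
        return extracted;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: DexExtractor <app.apk> <output-dir>");
            return;
        }
        for (Path dex : extractDexFiles(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println(dex);
        }
    }
}
```

Analyzing the files one by one this way still leaves the cross-dex calls unresolved, which is the reflection problem mentioned above.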

ericbodden commented 9 years ago

A question: how are these other dex files loaded into the app? Is this done using a custom classloader and reflection? I am just asking because I am interested in a general solution.

Cheers, Eric

StevenArzt commented 9 years ago

Usually, that's how it works. Android has a class called "DexClassLoader" that you can use to do the actual loading. Then you can just fetch the classes via reflection.
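To illustrate why this pattern is hard for a static analyzer, here is a minimal sketch of the load-then-reflect sequence. On Android the loader would be a `dalvik.system.DexClassLoader` pointing at the secondary dex file; as an assumption for the sake of a runnable desktop example, the system class loader stands in for it here, and `ReflectiveLoadDemo` and `loadAndInvoke` are hypothetical names. The point is that the class and method are identified only by strings, so no ordinary call edge exists in the code.

```java
import java.lang.reflect.Method;

// Hypothetical demo class; the ClassLoader argument stands in for Android's DexClassLoader.
final class ReflectiveLoadDemo {

    /**
     * Loads className through the given loader, creates an instance with the
     * no-argument constructor, and invokes the named no-argument method on it.
     */
    static Object loadAndInvoke(ClassLoader loader, String className, String methodName)
            throws ReflectiveOperationException {
        // The target class is known only as a string at this point.
        Class<?> clazz = loader.loadClass(className);
        Object instance = clazz.getDeclaredConstructor().newInstance();
        // The target method is looked up by name as well.
        Method method = clazz.getMethod(methodName);
        return method.invoke(instance);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        Object result = loadAndInvoke(loader, "java.util.ArrayList", "size");
        System.out.println(result);
    }
}
```

If the two strings are constants, an analysis can in principle resolve the call; if they are computed or decrypted at runtime, it cannot do so without extra information.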

ericbodden commented 9 years ago

So then the actual problem we need to solve is "simply" one of interpreting the reflection calls correctly. As you know, Steven, Spark already has some limited support for this. We should figure out why that support does not work in this particular case and extend it as needed.

I wonder whether Karim's ICA group could look into this. They work on the handling of reflection as we speak.

Best wishes, Eric

StevenArzt commented 9 years ago

Sadly, there is more to it: Soot would need to find out how the second dex file is loaded (get its file name), load it, and merge its contents into the Scene. This is not there yet; we would need to implement it. Instead of trying to get the names of the other dex files from the code, we could also just merge all dex files in the APK into the Scene. But then we could also merge in ones that are never loaded, and we would miss the ones whose name does not end with ".dex". Malware applications often hide secondary dex files somewhere in images, etc.

I guess an option "merge all dex files in APK into Scene" would make sense as a first start.
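One way to reduce the dependence on the ".dex" file name is to look at content instead: a dex file begins with the bytes `dex\n` followed by a version string. The sketch below lists every APK entry that starts with that magic prefix, whatever its name. It is an illustration only, not an existing Soot option; `DexMagicScanner` and `findDexEntries` are hypothetical names. It would catch a dex file that was merely renamed, but not one embedded inside an image, encrypted, or downloaded at runtime.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; it is not part of Soot.
final class DexMagicScanner {

    // First four bytes of every dex file: "dex\n". The version digits that follow vary.
    private static final byte[] DEX_MAGIC_PREFIX = {0x64, 0x65, 0x78, 0x0A};

    /** Returns true if the stream starts with the dex magic prefix. */
    static boolean startsWithDexMagic(InputStream in) throws IOException {
        byte[] header = new byte[DEX_MAGIC_PREFIX.length];
        int read = 0;
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                return false; // entry is shorter than the magic prefix
            }
            read += n;
        }
        return Arrays.equals(header, DEX_MAGIC_PREFIX);
    }

    /** Returns the names of all APK entries whose content starts with the dex magic prefix. */
    static List<String> findDexEntries(Path apk) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(apk.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    if (startsWithDexMagic(in)) {
                        names.add(entry.getName());
                    }
                }
            }
        }
        return names;
    }
}
```

A content check like this also skips entries that carry a ".dex" name but are not actually dex files, so it complements rather than replaces a name-based filter.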

WangYeOtw commented 9 years ago

Big love to you, Eric and Steven. I'm so happy to see that you plan to merge all dex files in the APK into the Scene, which will increase code coverage during static analysis of Android apps.

zyrikby commented 9 years ago

To overcome the problem of dynamic code loading (DCL) and reflection, you may use an approach similar to what we did in our tool called StaDynA. In that work I used AndroGuard as the static analyzer; however, it would be nice to have such functionality available for Soot because it is more mature.