secure-software-engineering / FlowDroid

FlowDroid Static Data Flow Tracker
GNU Lesser General Public License v2.1

Is it possible to backwardly find out all the "sources" only given the sinks? #471

Closed RichardHoOoOo closed 2 years ago

RichardHoOoOo commented 2 years ago

Hi, I have read this thesis about a backward data flow analysis in FlowDroid. Given ONLY the sinks, is it possible to find all the "sources" (e.g., the units that assign a value to a tainted variable) using the backward data flow analysis in FlowDroid? In the following example, lines 1-3 should all be "sources".

a = ...
b = ...
c = a+b;
sink(c)

If possible, could you give me some instructions (e.g., which class I should take a look at or which method I should override)? Thanks in advance!
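To make the question concrete, here is a minimal sketch of what "lines 1-3 are the sources" means for the four-line example above. It is a toy model over straight-line assignments, not FlowDroid's analysis; the class `BackwardTraceSketch`, the `Stmt` type, and the method `contributingLines` are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: straight-line assignments, not FlowDroid's actual analysis. */
final class BackwardTraceSketch {

    /** One statement "target = f(operands)"; a sink call has a null target. */
    static final class Stmt {
        final int line;
        final String target;
        final List<String> operands;

        Stmt(int line, String target, List<String> operands) {
            this.line = line;
            this.target = target;
            this.operands = operands;
        }
    }

    /** Walks bottom-up from the sink and returns the lines whose assignments feed it. */
    static List<Integer> contributingLines(List<Stmt> program, int sinkIndex) {
        Set<String> tainted = new LinkedHashSet<>(program.get(sinkIndex).operands);
        List<Integer> lines = new ArrayList<>();
        for (int i = sinkIndex - 1; i >= 0; i--) {
            Stmt s = program.get(i);
            if (s.target != null && tainted.remove(s.target)) {
                // This assignment defines a tainted variable: record it
                // and continue tracking the values it was computed from.
                lines.add(0, s.line);
                tainted.addAll(s.operands);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Stmt> program = List.of(
                new Stmt(1, "a", List.of()),           // a = ...
                new Stmt(2, "b", List.of()),           // b = ...
                new Stmt(3, "c", List.of("a", "b")),   // c = a+b;
                new Stmt(4, null, List.of("c")));      // sink(c)
        System.out.println(contributingLines(program, 3)); // prints [1, 2, 3]
    }
}
```

A real app adds calls, fields, and aliasing, which is why the replies below point to FlowDroid's own backward solver rather than a hand-written walk like this one.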

StevenArzt commented 2 years ago

By default, FlowDroid will backtrack from the sinks and for each statement that it encounters, check whether it is a source. If there is no source, FlowDroid will never report a leak.

To change this, you would need to remove the check that aborts the analysis if there are no sources. Next, you need to register a taint propagation handler to be notified whenever a taint arrives at a statement. Have a look at AbstractInfoflowProblem.

RichardHoOoOo commented 2 years ago

Hi @StevenArzt I guess the check you mentioned that I should remove is https://github.com/secure-software-engineering/FlowDroid/blob/511a3ff79cddcdc50e43a5dc4ca7aabb8a270743/soot-infoflow/src/soot/jimple/infoflow/AbstractInfoflow.java#L770-L773

Am I right?

RichardHoOoOo commented 2 years ago

Hi @StevenArzt Sorry about one more question.

It seems I need to use the development branch rather than the 2.10 release, right? I found that the backward implementation was merged on May 6 2022, which was 2 days after the release date of 2.10.

RichardHoOoOo commented 2 years ago

It turns out the part I should comment out is https://github.com/secure-software-engineering/FlowDroid/blob/511a3ff79cddcdc50e43a5dc4ca7aabb8a270743/soot-infoflow/src/soot/jimple/infoflow/AbstractInfoflow.java#L774-L777 because in backwards mode, sources are treated as sinks and vice versa.

Olasergiolas commented 1 year ago

@RichardHoOoOo I'm also really interested in this functionality, is it already implemented in your fork? (https://github.com/RichardHoOoOo/FlowDroid). I managed to compile it but it just crashes while performing the APK analysis.

[image]

Have a nice day!

RichardHoOoOo commented 1 year ago

@Olasergiolas If you want this functionality, I suggest you look at https://github.com/secure-software-engineering/FlowDroid/issues/471#issuecomment-1148664924. All you need to do is comment out those lines and compile FlowDroid. The backward analysis has already been implemented in FlowDroid.

Olasergiolas commented 1 year ago

> @Olasergiolas If you want this functionality, I suggest you look at #471 (comment). All you need to do is comment out those lines and compile FlowDroid. The backward analysis has already been implemented in FlowDroid.

I applied your suggested changes and recompiled the project. [image]

Nevertheless, I don't get any results when analysing the APK found in soot-infoflow-android/testAPKs/SourceSinkDefinitions/SourceToSink3.apk. The sources and sinks file has been set as shown below: [image] And the code for the example used: [image]

The following command was used to start the analysis via CLI:

$ java -jar -Xmx2G soot-infoflow-cmd/target/soot-infoflow-cmd-jar-with-dependencies.jar -d -a ../FlowDroid/soot-infoflow-android/testAPKs/SourceSinkDefinitions/SourceToSink3.apk -p ~/Android/Sdk/platforms -s ~/Apps/FlowDroid/bin/test_mod.txt -o test.xml -cp -dir BACKWARDS

And lastly, this is the result I'm getting: [image]

Am I doing anything wrong? If I define both the source and sink everything works fine. Thanks in advance! ❤️

t1mlange commented 1 year ago

> Nevertheless, I don't get any results when analysing the APK found in soot-infoflow-android/testAPKs/SourceSinkDefinitions/SourceToSink3.apk. The sources and sinks file has been set as shown below: [image]

Your source and sink file only contains a sink...

> And lastly, this is the result I'm getting: [image]

...and thus, FlowDroid runs but doesn't know when to write out a result. You need to add some code (or configuration lines) to tell FlowDroid to write out a found flow.

> [...snip...]

> $ java -jar -Xmx2G soot-infoflow-cmd/target/soot-infoflow-cmd-jar-with-dependencies.jar -d -a ../FlowDroid/soot-infoflow-android/testAPKs/SourceSinkDefinitions/SourceToSink3.apk -p ~/Android/Sdk/platforms -s ~/Apps/FlowDroid/bin/test_mod.txt -o test.xml -cp -dir BACKWARDS

Fyi, with 2GB of memory you might not be able to fully analyze real-world apps.

RichardHoOoOo commented 1 year ago

@Olasergiolas It depends on what "result" you want. In my case, I register a TaintPropagationHandler to monitor how taint is propagated backwards. If you run FlowDroid via the CLI, of course you will not get any "result", since no source is set.

Olasergiolas commented 1 year ago

> > Nevertheless, I don't get any results when analysing the APK found in soot-infoflow-android/testAPKs/SourceSinkDefinitions/SourceToSink3.apk. The sources and sinks file has been set as shown below: [image]

> Your source and sink file only contains a sink...

> > And lastly, this is the result I'm getting: [image]

> ...and thus, FlowDroid runs but doesn't know when to write out a result. You need to add some code (or configuration lines) to tell FlowDroid to write out a found flow.

> * If you know the sources beforehand, you can specify them either in the simple text format or in the more expressive XML format.

> * If the XML format (method signatures + target variable) isn't expressive enough for you, you can override the `getSourceInfo()` and `getInverseSourceInfo()` methods of the used `SourceSinkManager` and implement your own logic.

> * If you do not know the sources beforehand, you can write your own `TaintPropagationHandler`. These get notified about each edge in the exploded supergraph.

> > [...snip...]

> > $ java -jar -Xmx2G soot-infoflow-cmd/target/soot-infoflow-cmd-jar-with-dependencies.jar -d -a ../FlowDroid/soot-infoflow-android/testAPKs/SourceSinkDefinitions/SourceToSink3.apk -p ~/Android/Sdk/platforms -s ~/Apps/FlowDroid/bin/test_mod.txt -o test.xml -cp -dir BACKWARDS

> Fyi, with 2GB of memory you might not be able to fully analyze real-world apps.

I am trying to accomplish the same goal described in the original issue, that is, finding all the sources given only the sinks. That's why I only provided a sink for the example. That being said, I did not realize that it was necessary to create my own TaintPropagationHandler, so I will start working on that now.

Thank you both for your help! @RichardHoOoOo @timll

jacobocasado commented 1 year ago

Hello, I am in the same situation as @Olasergiolas and @RichardHoOoOo. I want to obtain all of the "sources" (without declaring them) that arrive at a sink. Having read the proposed solutions, I am a little lost on how to register a taint propagation handler that is notified whenever a taint arrives at a statement.

@StevenArzt @timll Could you please provide some insights on how to perform this operation? Thank you very much!

t1mlange commented 1 year ago

> Could you please provide some insights on how to perform this operation? Thank you very much!

A good way to see how FlowDroid can be invoked is to look at the cmd subproject or at how we invoke our test cases. Here, for example, I'm using the TaintPropagationHandler to check a property of the propagated abstractions. https://github.com/secure-software-engineering/FlowDroid/blob/5157bedaa530cb461dc7d39da6a480cfeb9009b6/soot-infoflow/test/soot/jimple/infoflow/test/junit/HeapTests.java#L1302-L1319

jacobocasado commented 1 year ago

I am using the tool via the CLI, as described in the documentation (the JAR file used is /FlowDroid/soot-infoflow-cmd/target/soot-infoflow-cmd-jar-with-dependencies.jar). I understand that I have to set up a new notification that is true when the taint is equal to a statement, but I am a little lost as to which part of the source code should contain this handler override. According to @StevenArzt, would it have to be in the AbstractInfoflowProblem class?

t1mlange commented 1 year ago

> I am using the tool via the CLI, as described in the documentation (the JAR file used is /FlowDroid/soot-infoflow-cmd/target/soot-infoflow-cmd-jar-with-dependencies.jar). I understand that I have to set up a new notification that is true when the taint is equal to a statement, but I am a little lost as to which part of the source code should contain this handler override. According to StevenArzt, would it have to be in the AbstractInfoflowProblem class?

You have to use FlowDroid as a library, because the check mentioned in https://github.com/secure-software-engineering/FlowDroid/issues/471#issuecomment-1148664924 cannot be removed with a config flag. After removing it, you just register your own TaintPropagationHandler as in the snippet I referenced.

jacobocasado commented 1 year ago

I now understand the point of implementing my own TaintPropagationHandler. I have registered one, and I am now trying to check whether a taint "arrives" at a statement, as proposed by @StevenArzt. I am accessing the taint's statement with taint.getCurrentStmt() and comparing it to the statement received by the handler, but I am not sure whether that is the operation I should perform or whether I need to retrieve more information. I don't get any matches when I compare the statement returned by taint.getCurrentStmt() with the statement received by the handler.

t1mlange commented 1 year ago

> I now understand the point of implementing my own TaintPropagationHandler. I have registered one, and I am now trying to check whether a taint "arrives" at a statement, as proposed by @StevenArzt. I am accessing the taint's statement with taint.getCurrentStmt() and comparing it to the statement received by the handler, but I am not sure whether that is the operation I should perform or whether I need to retrieve more information. I don't get any matches when I compare the statement returned by taint.getCurrentStmt() with the statement received by the handler.

I'm uncertain whether comparing the statement of the taint and the statement in the TaintPropagationHandler makes sense at all.

The statements (currentStmt and correspondingCallSite) within a taint do not describe the current statement but the statement where the taint was derived. These are used to later, after the data flow analysis, reconstruct a context-sensitive path through the program. The actual current statement is the Unit stmt passed to the notifyFlowIn and notifyFlowOut methods and a taint arrives at a statement if your TaintPropagationHandler is called.
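The distinction described above can be sketched with stand-in types. Nothing here is FlowDroid code: `Taint`, `FlowListener`, `ArrivalRecorder`, and `walkBackwards` are hypothetical names, and the string statements are simplified versions of Jimple statements. The sketch only shows why the statement passed to the callback is the one to record, and why comparing it against the statement stored in the taint finds no matches.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in types for illustration; none of these are FlowDroid classes. */
final class ArrivalHandlerSketch {

    /** Hypothetical taint: stores the statement it was derived at, not its current position. */
    static final class Taint {
        final String accessPath;
        final String derivedAt;

        Taint(String accessPath, String derivedAt) {
            this.accessPath = accessPath;
            this.derivedAt = derivedAt;
        }
    }

    /** Hypothetical callback, loosely modelled on the notifyFlowIn method mentioned above. */
    interface FlowListener {
        void notifyFlowIn(String stmt, Taint taint);
    }

    /** Records each statement a taint arrives at; the callback's stmt is the current one. */
    static final class ArrivalRecorder implements FlowListener {
        final List<String> arrivals = new ArrayList<>();
        int matchesWithDerivedAt = 0;

        @Override
        public void notifyFlowIn(String stmt, Taint taint) {
            arrivals.add(stmt);
            if (stmt.equals(taint.derivedAt)) {
                matchesWithDerivedAt++; // only increments if the walk reaches the origin again
            }
        }
    }

    /** Toy driver: visits the statements in backward order and notifies the listener. */
    static void walkBackwards(List<String> stmtsInReverse, Taint taint, FlowListener listener) {
        for (String stmt : stmtsInReverse) {
            listener.notifyFlowIn(stmt, taint);
        }
    }

    public static void main(String[] args) {
        Taint taint = new Taint("r0.a", "virtualinvoke r0.sink()");
        ArrivalRecorder recorder = new ArrivalRecorder();
        walkBackwards(List.of("return", "r0.a = \"abc\"", "specialinvoke r0.<init>()"), taint, recorder);
        System.out.println(recorder.arrivals.size() + " arrivals, "
                + recorder.matchesWithDerivedAt + " equal to the taint's own statement");
    }
}
```

In this toy run the recorder sees three arrivals and zero matches, which mirrors the "no matches" observation reported earlier in the thread.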

jacobocasado commented 1 year ago

Alright, I have set the TaintPropagationHandler in my code, but I still don't know how to add the resulting edges obtained in this handler as "sources" so that they are processed by the solver in the calculations. I am relatively sure that this change would need to be in AbstractInfoflow, in the RunTaintAnalysis method, but I have not been able to add the results to the calculations.

jacobocasado commented 1 year ago

What I meant is that, once forwardSolver.solve() is called, my handler prints all the taints that arrive at a statement (example):

[FlowDroid] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceBackwardsInfoflow - Abstract Statement: specialinvoke $r0.<de.testApp.sourcetosink4.MainActivity: void <init>()>()
[FlowDroid] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceBackwardsInfoflow - Abstract Taint: $r0(de.testApp.sourcetosink4.MainActivity) <de.testApp.sourcetosink4.MainActivity: java.lang.String a> * <+length> | virtualinvoke r0.<de.testApp.sourcetosink4.MainActivity: void sink()>()>>
[FlowDroid] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceBackwardsInfoflow - Abstract Method: <dummyMainClass: de.testApp.sourcetosink4.MainActivity dummyMainMethod_de_testApp_sourcetosink4_MainActivity(android.content.Intent)>
[FlowDroid] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceBackwardsInfoflow - Abstract Statement: return
[FlowDroid] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceBackwardsInfoflow - Abstract Taint: r0(de.testApp.sourcetosink4.MainActivity) <de.testApp.sourcetosink4.MainActivity: java.lang.String a> * <+length> | virtualinvoke r0.<de.testApp.sourcetosink4.MainActivity: void sink()>()>>
[FlowDroid] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceBackwardsInfoflow - Abstract Method: <de.testApp.sourcetosink4.MainActivity: void <init>()>
[FlowDroid] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceBackwardsInfoflow - Abstract Statement: r0.<de.testApp.sourcetosink4.MainActivity: java.lang.String a> = "abc"
[FlowDroid] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceBackwardsInfoflow - Abstract Taint: r0(de.testApp.sourcetosink4.MainActivity) <de.testApp.sourcetosink4.MainActivity: java.lang.String a> * <+length> | virtualinvoke r0.<de.testApp.sourcetosink4.MainActivity: void sink()>()>>
[FlowDroid] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceBackwardsInfoflow - Abstract Method: <de.testApp.sourcetosink4.MainActivity: void <init>()>
[FlowDroid] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceBackwardsInfoflow - Abstract Statement: specialinvoke $r0.<de.testApp.sourcetosink4.MainActivity: void <init>()>()
[FlowDroid] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceBackwardsInfoflow - Abstract Taint: $r0(de.testApp.sourcetosink4.MainActivity) <de.testApp.sourcetosink4.MainActivity: java.lang.String a> * <+length> | virtualinvoke r0.<de.testApp.sourcetosink4.MainActivity: void sink()>()>>
[FlowDroid] INFO soot.jimple.infoflow.android.SetupApplication$InPlaceBackwardsInfoflow - Abstract Method: <dummyMainClass: de.testApp.sourcetosink4.MainActivity dummyMainMethod_de_testApp_sourcetosink4_MainActivity(android.content.Intent)>

The problem is that this only gets printed after the solve method has run, and I am not able to add these taints in such a way that the flows are detected. My idea would be to add these taints so they are "used" in the calculations to detect the flows.

t1mlange commented 1 year ago

> Alright, I have set the TaintPropagationHandler in my code, but I still don't know how to add the resulting edges obtained in this handler as "sources" so that they are processed by the solver in the calculations. I am relatively sure that this change would need to be in AbstractInfoflow, in the RunTaintAnalysis method, but I have not been able to add the results to the calculations.

You can add abstractions to the results here: https://github.com/secure-software-engineering/FlowDroid/blob/cddc85dd01eaf822e80ada62b6b69d6dd70f2b4a/soot-infoflow/src/soot/jimple/infoflow/problems/TaintPropagationResults.java#L52-L58

These results then get fed into the path builder, which tries to transform the abstraction graph into a concrete path in the program. This path is then printed to stdout by default. However, in any non-toy app you will probably get so many results that the path builder will take hours to complete.

jacobocasado commented 1 year ago

I understand I should use this TaintPropagationResults.addResult method to add the taints that are reported by the TaintPropagationHandler, but I can't access the TaintPropagationResults object while a flow is being reported in the handler. Is that what you are suggesting I do? Or should I use this addResult method elsewhere, with no connection to the handler notifications?

t1mlange commented 1 year ago

> I understand I should use this TaintPropagationResults.addResult method to add the taints that are reported by the TaintPropagationHandler, but I can't access the TaintPropagationResults object while a flow is being reported in the handler. Is that what you are suggesting I do? Or should I use this addResult method elsewhere, with no connection to the handler notifications?

You can access the TaintPropagationResults via manager.getMainSolver().getTabulationProblem().getResults(). But as I stated above, I don't think it is feasible to build all paths. Can you tell me what you want to achieve (e.g. test for a specific vulnerability)? Currently, I'm uncertain whether FlowDroid is the right tool for your use case.

jacobocasado commented 1 year ago

First of all, thank you for all of your help! My idea is to use FlowDroid so that, given an APK and a list of sinks but no sources, it reports all the sources used in the APK that end in the declared sinks. (The regular output of FlowDroid is OK, as it shows the source when a path is found, so I don't even want to change the output of the tool.) I just want to not specify a source, as the idea is to obtain them at runtime. I think the change is small, but I don't know where to apply it. I am pretty lost, and if you are willing to keep helping but would rather not continue this conversation here, we can get in touch via e-mail or another channel. Thank you again!

jacobocasado commented 1 year ago

I have tried your approach (although you said it is not feasible) by adding a new AbstractionAtSink in the addResults method when the handler is called, but I cannot instantiate the first parameter needed to create the AbstractionAtSink object, which is an ISourceSinkDefinition. While debugging the application, I saw that this object contains information about the source that is obtained by parsing the XML or TXT file, but I do not have this information to instantiate it. Could I instantiate this type of object with the information from taint and stmt?

t1mlange commented 1 year ago

> I just want to not specify a source, as the idea is to obtain them at runtime.

I don't understand your definition of "source". If you want to know all statements that contributed to the values flowing into a sink, you might be better off with a backwards slicer. If you have sources that can't be modelled with the simple method signature and return/base/parameter specification (or that depend on some other property you can't determine until you have loaded the app into Soot), then you might want to override... https://github.com/secure-software-engineering/FlowDroid/blob/cddc85dd01eaf822e80ada62b6b69d6dd70f2b4a/soot-infoflow/src/soot/jimple/infoflow/sourcesSinks/manager/IReversibleSourceSinkManager.java#L14-L28 ...and return non-null if your complex definition of sources is fulfilled.

> I have tried your approach (although you said it is not feasible) by adding a new AbstractionAtSink, but I do not manage to build the first parameter needed to create this object, which is an ISourceSinkDefinition

You can extend AbstractSourceSinkDefinition and provide your own definition.

jacobocasado commented 1 year ago

An example of what I want to achieve: given a sink declared like this:

<method signature="&lt;javax.crypto.Cipher: void init(int,java.security.Key)&gt;">
    <param index="1" type="java.security.Key">
        <accessPath isSource="false" isSink="true" />
    </param>
</method>

I want to obtain all the statements that contributed to the value flowing into this java.security.Key parameter (in this case, for example, to find hardcoded keys). The only modification I would have to make is to delete the checks of the form "is this statement declared as a source in the declarations file?" What approach would I have to take?

t1mlange commented 1 year ago

> An example of what I want to achieve: given a sink declared like this:

> <method signature="&lt;javax.crypto.Cipher: void init(int,java.security.Key)&gt;">
>     <param index="1" type="java.security.Key">
>         <accessPath isSource="false" isSink="true" />
>     </param>
> </method>

> I want to obtain all the statements that contributed to the value flowing into this java.security.Key parameter (in this case, for example, to find hardcoded keys). The only modification I would have to make is to delete the checks of the form "is this statement declared as a source in the declarations file?" What approach would I have to take?

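SinkInfo getInverseSourceInfo(Stmt stmt, InfoflowManager manager, AccessPath accessPath) {
   // Pseudocode: report a constant assigned to the tracked access path as a "source"
   if (stmt instanceof AssignStmt && rhs instanceof Constant && lhs matches accessPath)
      return new SinkInfo(...);
   // Anything else is not a source
   return null;
}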

You might want to apply some optimization pass that transforms byte array initialization into a String.getBytes(), otherwise you'll find a source for each byte.
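The shape of that check, and the byte-array caveat, can be sketched with stand-in types. This is not Soot or FlowDroid code: `Assign`, `matches`, and `constantSources` are hypothetical names, and the string-based matching is a simplifying assumption standing in for a real access path comparison.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in types for illustration; Soot's AssignStmt/Constant and FlowDroid's AccessPath are not used. */
final class ConstantSourceSketch {

    /** Hypothetical simplified assignment "lhs = rhs". */
    static final class Assign {
        final String lhs;
        final String rhs;
        final boolean rhsIsConstant; // stands in for "rhs instanceof Constant"

        Assign(String lhs, String rhs, boolean rhsIsConstant) {
            this.lhs = lhs;
            this.rhs = rhs;
            this.rhsIsConstant = rhsIsConstant;
        }
    }

    /** True if lhs is the tracked variable itself or one of its array elements. */
    static boolean matches(String lhs, String trackedPath) {
        return lhs.equals(trackedPath) || lhs.startsWith(trackedPath + "[");
    }

    /** Collects the constant assignments that write to the tracked path. */
    static List<Assign> constantSources(List<Assign> stmts, String trackedPath) {
        List<Assign> hits = new ArrayList<>();
        for (Assign s : stmts) {
            if (s.rhsIsConstant && matches(s.lhs, trackedPath)) {
                hits.add(s);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Key material written as a single string constant: one hit.
        List<Assign> viaString = List.of(new Assign("key", "\"secret\"", true));

        // Key material written as a byte array: one hit per element store.
        List<Assign> viaBytes = List.of(
                new Assign("key", "newarray (byte)[3]", false),
                new Assign("key[0]", "115", true),
                new Assign("key[1]", "101", true),
                new Assign("key[2]", "99", true));

        System.out.println(constantSources(viaString, "key").size()); // prints 1
        System.out.println(constantSources(viaBytes, "key").size());  // prints 3
    }
}
```

The second case is why a pass that folds the element stores into a single constant is suggested above: without it, a 16-byte key would be reported as 16 separate sources.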