secure-software-engineering / FlowDroid

FlowDroid Static Data Flow Tracker
GNU Lesser General Public License v2.1
1.03k stars 293 forks source link

The unstable propagation of the faster IFDS solver. #461

Closed DongjieHe closed 2 years ago

DongjieHe commented 2 years ago

Dear FlowDroid developers,

I am recently fixing the bug that causes Flowdroid to produce unstable results. I find that the forward solver may propagate both the active and inactive version of a path edge. For example, let <s, d1> -> <n, d2> be an active path edge and <s, d1> -> <n, -d2> be its inactive version.

The process order of the two edges may result in different sizes of the jumpFunctions and different propagationCount. Suppose the solver processes the active edge first, then when handling the inactive version, the existing implementation will not propagate the inactive edge anymore due to its lines 652-655.

However, if the solver processes the inactive edge first, then when handling the active path, the solver will still propagate the edge.

To avoid such redundant propagation, I suggest always saving the active version of a path edge into the jumpFunctions.

Our patching code is given below. It replaces the existing implementation of the propagate(D sourceVal, N target, D targetVal, N relatedCallSite, boolean isUnbalancedReturn) with the following code:

    protected void propagate(D sourceVal, N target, D targetVal,
            /* deliberately exposed to clients */ N relatedCallSite,
            /* deliberately exposed to clients */ boolean isUnbalancedReturn) {
        // Let the memory manager run
        if (memoryManager != null) {
            sourceVal = memoryManager.handleMemoryObject(sourceVal);
            targetVal = memoryManager.handleMemoryObject(targetVal);
            if (targetVal == null)
                return;
        }

        // Check the path length
        if (maxAbstractionPathLength >= 0 && targetVal.getPathLength() > maxAbstractionPathLength)
            return;

        D activeVal = targetVal.getActiveCopy();
        PathEdge<N, D> activeEdge = new PathEdge<>(sourceVal, target, activeVal);
        final PathEdge<N, D> edge = new PathEdge<>(sourceVal, target, targetVal);
        final D existingVal = addFunction(activeEdge);
        if (existingVal != null) {
            if (existingVal != targetVal) {
                // Check whether we need to retain this abstraction
                boolean isEssential;
                if (memoryManager == null)
                    isEssential = relatedCallSite != null && icfg.isCallStmt(relatedCallSite);
                else
                    isEssential = memoryManager.isEssentialJoinPoint(targetVal, relatedCallSite);

                if (maxJoinPointAbstractions < 0 || existingVal.getNeighborCount() < maxJoinPointAbstractions
                        || isEssential) {
                    existingVal.addNeighbor(targetVal);
                }
            }
        } else {
            scheduleEdgeProcessing(edge);
        }
    }

This patch works for two reasons: (1) all path edges handled by the backward solver are inactive. Recording their active versions does not affect the results. (2) the forward solver is only required to propagate a path edge or its inactive/active version exclusively once.

Please help to check the correctness of the above code. If it is correct, please help to merge it into the develop branch.

Thanks very much.

Dongjie He

StevenArzt commented 2 years ago

That makes a lot of sense to me. I'll change the implementation accordingly. Thanks a lot!

DongjieHe commented 2 years ago

Hi Steven,

The patch did not solve the problem fully. There still exist Applications that FlowDroid produces unstable results. And I am struggling with this issue. I have a few questions about the interaction of the forward and backward solvers. I wish your answers can help me find the solutions quickly.

  1. I am wondering about the physical meaning of injecting a path edge <s, d1> --> <n, d2> into the forward solver with both d1 and d2 being inactive.
  2. When the forward solver propagates a path edge, e.g., <s, d1>--><n, d2> with d2 being inactive and n is a call site, the forward solver will create a new path edge <sp, d3>-><sp, d3> into a corresponding callee method. suppose d3 is also inactive, then what is the physical meaning of <sp, d3> --> <sp, d3>? Could we enforce the first d3 (i.e., the fact at the source) to be active?
  3. I am a bit confused about the interaction between the forward and backward solver, especially the injectCallingContext method. Can you give a few words here to help me better understand this interaction?

Thanks in advance.

Dongjie He