xlab-uiuc / cflow

Flow-based configuration analysis
9 stars 3 forks source link

About an analysis on a taint propagation path in cflow #22

Open fanweneddie opened 3 years ago

fanweneddie commented 3 years ago

Hi! I am trying to analyze the accuracy of cflow. I run the command

./run.sh -a hadoop_common -i -s

and find there is a taint propagation path from source method getLong() to sink method interrupt(). The path is visualized as below. received_taint_b

In the picture, blue rectangles represent method and red arrows represent inter-procedural taint flow.(I carelessly left out an init() method before)

I trace this taint flow carefully and find that: The taint about blockSize actually goes into method interrupt(), but it may not be used by interrupt(). This is caused by the coarse rule of dealing with sink method.

To be specific, In method runCommand(), object errThread.stat.blockSize is actually tainted (Though cflow regards the taint as on object errThread.stat due to its 1-approximation of field access). Then, since errThread is a parameter in method call joinThread(errThread);, the taint can be propagated through method call according the rule of call flow:

call_flow_rule

In method joinThread(), cflow regards t.stat as tainted, so cflow believes that this taint can reach method interrupt(); That is because the base object t of method call t.interrupt() is tainted by the taint on t.stat, which is similar to the above case.

Actually, method interrupt() in class Thread in jdk-8, shown as below,

public void interrupt() {
        if (this != Thread.currentThread()) {
            checkAccess();

            // thread may be blocked in an I/O operation
            synchronized (blockerLock) {
                Interruptible b = blocker;
                if (b != null) {
                    interrupt0();  // set interrupt status
                    b.interrupt(this);
                    return;
                }
            }
        }

        // set interrupt status
        interrupt0();
    }

uses this object(which corresponds to t in method joinThread()).

Therefore, to be strict, we can say the taint on errThread flows into sink method interrupt(). However, I intuitively think that the judgment in if-statement

 if (this != Thread.currentThread()) 

has nothing to do with errThread.stat.blockSize. So the actual taint on errThread.stat.blockSize may have no effect in sink.

Maybe the reason behind is that: When reaching a sink method, the rules of cflow(shown below) only concentrate on whether the base object or the parameters of sink method is tainted, but they ignore analyzing whether the actual tainted object(may not equal to base object) is used in the sink.

private void visitSink(Set<Taint> in, Stmt stmt) {
        if (!stmt.containsInvokeExpr()) return;
        InvokeExpr invoke = stmt.getInvokeExpr();

        Value base = null;
        if (invoke instanceof InstanceInvokeExpr) {
            base = ((InstanceInvokeExpr) invoke).getBase();
        }

        for (Taint t : in) {
            // Process base object
            if (base != null && t.taints(base)) {
                Taint sinkTaint = Taint.getTransferredTaintFor(
                        t, t.getPlainValue(), stmt, method, currTaintCache);
                sinkTaint.setSink();
                sinks.add(sinkTaint);
            }

            // Process parameters
            for (int i = 0; i < invoke.getArgCount(); i++) {
                Value arg = invoke.getArg(i);
                if (t.taints(arg)) {
                    Taint sinkTaint = Taint.getTransferredTaintFor(
                            t, t.getPlainValue(), stmt, method, currTaintCache);
                    sinkTaint.setSink();
                    sinks.add(sinkTaint);
                }
            }
        }
fanweneddie commented 3 years ago

plus: Here I show the reason of the existence of errThread.stat.blockSize.

In method getNativeFileLinkStatus(), object stat is initialized with its field blockSize tainted by getDefaultBlockSize(f) . Later, stat calls method getFileStatus().

private FileStatus getNativeFileLinkStatus(final Path f,
      boolean dereference) throws IOException {
    checkPath(f);
    Stat stat = new Stat(f, getDefaultBlockSize(f), dereference, this);
    FileStatus status = stat.getFileStatus();
    return status;
  }

After entering the method getFileStatus(), this.blockSize(field blockSize for this object in method getFileStatus()) is tainted.

In method getFileStatus(), it calls run() to check whether it needs to run a command.

public FileStatus getFileStatus() throws IOException {
    run();
    return stat;
  }

After entering the method run(), this.blockSize(field blockSize for this object in method run()) is tainted.

Later, it calls runCommand() to run a command.

protected void run() throws IOException {
    if (lastTime + interval > Time.monotonicNow()) {
      return;
    }
    exitCode = 0; // reset for next run
    if (Shell.MAC) {
      System.setProperty("jdk.lang.Process.launchMechanism", "POSIX_SPAWN");
    }
    runCommand();
  }

After entering the method runCommand(), this.blockSize(field blockSize for this object in method runCommand()) is tainted.

In method runCommand(), it has an anonymous inner class as Shell$Thread:

private void runCommand() throws IOException {
    ...
    Thread errThread = new Thread() {
      @Override
      public void run() {
        try {
          String line = errReader.readLine();
          while((line != null) && !isInterrupted()) {
            errMsg.append(line)
                .append(System.getProperty("line.separator"));
            line = errReader.readLine();
          }
        } catch(IOException ioe) {
          // Its normal to observe a "Stream closed" I/O error on
          // command timeouts destroying the underlying process
          // so only log a WARN if the command didn't time out
          if (!isTimedOut()) {
            LOG.warn("Error reading the error stream", ioe);
          } else {
            LOG.debug("Error reading the error stream due to shell "
                + "command timeout", ioe);
          }
        }
      }
    };
    ...
}

This inner class requires three more fields:

Therefore, object errThread.stat.blockSize is tainted by this.blockSize(field blockSize for this object in method runCommand()).

tianyin commented 2 years ago

Thanks for tracking this down.

For the application (configuration analysis), we really have to make the propagation rule more strict than now. I don't think it is good to taint the entire object if one of the field is tainted. That could lead to huge over-tainting. Or, in other words, it should be applied very specifically and carefully to certain well-defined type (e.g., it would be fine to taint a file object for a file path configuration).

tianyin commented 2 years ago

To make it clear, cflow is not a generic static taint analysis tool. Instead, it is supposed to be a very specialized tool for configuration analysis. One of the key challenges here is to figure out the propagation rules that can be useful for configuration analysis.

With that being said, the propagation rule you stated above is not "wrong" -- it could be favored by other analysis (e.g., security) but does not help configuration analysis.

In short, the way to think about it is to think from the application perspective -- how can you support the analysis you want to support. Some of the analysis was listed in the original project desc (the Reference Reading section), https://docs.google.com/document/d/1aLmu3tbsuVFfU3J1zhks1LveBCmAe32rA3jvzZ5zv9A/edit

If you haven't read it, you will need to read it first. It give you the idea what are the basic applications you want to support (and you will need to support them later using cflow).

fanweneddie commented 2 years ago

Thank you for your guidance. I will read those papers to understand the concept better.