github / codeql

CodeQL: the libraries and queries that power security researchers around the world, as well as code scanning in GitHub Advanced Security
https://codeql.github.com
MIT License
7.51k stars 1.49k forks source link

Java codeql requires Precise data-flow for collections and Field-sensitive data-flow analysis capabilities #9597

Open jn61129052 opened 2 years ago

jn61129052 commented 2 years ago

Like this issus:https://github.com/github/codeql/pull/3366

Java codeql engine cannot model data flow through collections precisely. Example:

class Demo
{
    private static final String PSFS = "1111";
    void M1(source)
    {
        Obj obj = new Obj();
        obj.setA(source);
        obj.setB(PSFS);
        Sink(obj.getA()); // vul
        Sink(obj.getB()); // not vul
    }
}

codeql marked obj is tainted,not obj.A

smowton commented 2 years ago

Can you share the code of class Obj and the CodeQL you used to test this?

jn61129052 commented 2 years ago

class obj:

public class Obj {
    public String code;
    public String cmd;
    public String getCode() {
        return code;
    }

    public void setCode(String code) {
        this.code = code;
    }

    public String getCmd() {
        return cmd;
    }

    public void setCmd(String cmd) {
        this.cmd = cmd;
    }

}

class demo:

class Demo
{
    private static final String code = "1111";
    void M1(source)
    {
        Obj obj = new Obj();
        obj.setCmd(source); //codeql tainted source->obj
        obj.setCode(PSFS);
        Sink(obj.getCode()); // not vul
    }
}

Codeql:

import java
import lib.taintTracking
import DataFlow::PartialPathGraph
import database.sink.CMDI

class CommandConfiguration extends SpringTaintConfiguration {
  CommandConfiguration() { this = "command" }
  override predicate isSink(DataFlow::Node node) { exists(CallSink sink | sink.match(node)) }
}

from
  CommandConfiguration config, DataFlow::PartialPathNode source, DataFlow::PartialPathNode sink
where
  config.hasPartialFlow(source, sink, _)
  and config.isSink(sink.getNode())
select sink.getNode(), source, sink, sink.getNode().asExpr().toString()

Codeql query result this code has vulnerability but it's not.Source need to tainted obj class field but not whole obj class,which is Field-sensitive data-flow analysis capabilities

smowton commented 2 years ago

This works as expected:

class Obj {
    public String code;
    public String cmd;
    public String getCode() {
        return code;
    }

    public void setCode(String code) {
        this.code = code;
    }

    public String getCmd() {
        return cmd;
    }

    public void setCmd(String cmd) {
        this.cmd = cmd;
    }

}

public class Demo
{
    private static final String code = "1111";
    public void M1(String source)
    {
        Obj obj = new Obj();
        obj.setCmd(source); //codeql tainted source->obj
        obj.setCode(code);
        sink(obj.getCode()); // No flow
    }

    public void M2(String source)
    {
        Obj obj = new Obj();
        obj.setCmd(source); //codeql tainted source->obj
        obj.setCode(code);
        sink(obj.getCmd()); // Has flow
    }

    public static void sink(String s) { }
}
import java
import semmle.code.java.dataflow.DataFlow

class CommandConfiguration extends DataFlow::Configuration {
  CommandConfiguration() { this = "command" }
  override predicate isSource(DataFlow::Node node) {
    node.asParameter().getName() = "source"
  }

  override predicate isSink(DataFlow::Node node) {
    node.asExpr() = any(MethodAccess ma | ma.getCallee().getName() = "sink").getAnArgument()
  }
}

from
  CommandConfiguration config, DataFlow::PathNode source, DataFlow::PathNode sink
where
  config.hasFlowPath(source, sink)
  and config.isSink(sink.getNode())
select sink.getNode(), source, sink, sink.getNode().asExpr().toString()

I can't tell where the problem is because I don't know what's in lib.taintTracking, SpringTaintConfiguration or CallSink, but one likely suspect is the use of DataFlow::PartialPathNode and hasPartialFlow rather than DataFlow::PathNode and hasFlowPath. The partial-path stuff is a debugging aid which will show you everywhere that taint can get to beginning at your sources and disregard your sinks; therefore it may be that whatever path you're looking at actually won't connect with a sink and only shows up in this debug view.