Open gsingh93 opened 2 years ago
Thanks to @rvermeulen, I find that this solution works well:
/**
 * @kind path-problem
 */

import cpp
import semmle.code.cpp.dataflow.TaintTracking
import semmle.code.cpp.controlflow.Dominance
import DataFlow::PathGraph

class Config extends TaintTracking::Configuration {
  Config() { this = "Config" }

  override predicate isSource(DataFlow::Node source) {
    exists(Function f | f.getName() = "test" and source.asParameter() = f.getAParameter())
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(FunctionCall call | call.getTarget().getName() = "sink" |
      call.getAnArgument() = sink.asExpr()
    )
  }

  override predicate isSanitizer(DataFlow::Node sanitizer) {
    exists(FunctionCall call | call.getTarget().hasName("sanitize") |
      // Sanitize variable accesses that are dominated by a call to the sanitizer
      // function with the same variable as an argument to the sanitizer function
      exists(SsaDefinition def, StackVariable v, VariableAccess dominatedUse |
        call.getAnArgument() = def.getAUse(v) and
        dominatedUse = def.getAUse(v) and
        dominates(call, dominatedUse) and
        dominatedUse = sanitizer.asExpr()
      )
      or
      // Sanitize arguments to the sanitizer function
      call.getAnArgument() = sanitizer.asExpr()
    )
  }
}

from Config c, DataFlow::PathNode source, DataFlow::PathNode sink
where c.hasFlowPath(source, sink)
select sink, source, sink, "Flow"
It works on this test case:
void sink(int);
void sanitize(int x) { sink(x); }
int transform1(int x) { return x + 1; }
int transform2(int x) { return x + 2; }
void foo(int x) { sink(x); }

void test(int x, int y) {
    // Alert
    sink(x);
    // Alert
    foo(x);
    // Alert
    sink(y);
    // Alert
    foo(y);

    sanitize(x);

    // No alert
    sink(x);
    // No alert
    foo(x);
    // Alert
    sink(y);
    // Alert
    foo(y);
}
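Two further cases might be worth adding to the test file, since they exercise the dominates and SsaDefinition conditions separately. This is only a sketch: the extra parameters z and flag, the stub bodies, and the sink_calls counter are my own additions so the file compiles on its own, and the alert expectations are reasoned from the query's logic rather than taken from an actual run.

```c
#include <stdbool.h>

/* Stub bodies so this file compiles standalone. The counter is not part
   of the original test case; it only makes the calls observable. */
static int sink_calls = 0;
void sink(int v) { (void)v; sink_calls++; }
void sanitize(int v) { (void)v; }

/* Hypothetical extra cases. The function is still named "test" so the
   query's isSource predicate treats its parameters as sources. */
void test(int x, int y, int z, bool flag) {
    if (flag) {
        sanitize(x);
        // Expected: no alert, the sanitize call dominates this use of x
        sink(x);
    }
    // Expected: alert, sanitize(x) runs on only one path to this use,
    // so the call does not dominate it
    sink(x);

    sanitize(y);
    y = z;
    // Expected: alert, the assignment starts a new SSA definition of y
    // (tainted by z), so this use does not share a definition with the
    // argument that was passed to sanitize
    sink(y);
}
```

If the query reports the first sink(x) or misses either of the other two, that would point at the dominance check or the SSA-definition check respectively.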
I'd like to change the outcome of this issue from "how to do this" to better documentation and examples for how to use SsaDefinition and the dominates predicate.
Update:
I'd like SsaDefinition and the dominates predicate to be better documented with some actual examples. See https://github.com/github/codeql/issues/8568#issuecomment-1084951233 for some more context.
Original Post:
I would like to remove/sanitize a path if the sink node was ever passed to the sanitization function sometime before reaching the sink. Here's an example of some C code I want to analyze:
If I use a simple TaintTracking::Configuration which tracks flows from the parameter x to the argument to sink(x), I get four paths even if I use some type of sanitizer like this:
It makes sense why this doesn't work: each path goes directly from the parameter x to the argument of sink or foo. There are no paths that go from the parameter x, to sanitize, and then to the sink (that would be the case if the example had sink(sanitize(x)), but in this case sanitize does not return a value).
I can kind of get around this issue for cases where sanitize and sink are in the same function like this:
With this I only get three paths instead of four, but I still don't get the desired two paths because it doesn't handle the general case where the sink and sanitizer are in two different functions.
Is there any way to solve this? Does CodeQL store any paths between uses of a variable, in addition to the path from the definition of the variable to each use of it?