Open lowk3v opened 4 years ago
This type of check is better handled by more sophisticated tools that understand data flow.
That being said, I think you can do something like this:
phpgrep hello.php '$x = ${"y:var"}[$_]; ${"*"}; echo $sink(${"*"}, $x, ${"*"});' 'y=$_GET'
The pattern above:
- `$x = ${"y:var"}[$_];` matches the assignment to something (we name it `$x`). The RHS should be an indexing expression over a variable (captured as `$y`).
- `${"*"};` skips uninteresting things in between.
- `$sink(${"*"}, $x, ${"*"})` matches any call expression (method, function, etc.) that consumes `$x` as its argument (at any position).
- `y=$_GET` is a filter that restricts the `$y` variable on the RHS to `$_GET`.
Problems:
If `$x` is used in other contexts, we won't find it: the pattern above is bound to call expressions. So, when (1) is fixed, it can find `$_GET` usage in this code:
$input = $_GET["src"];
$unrelated = $foo['blah'];
function f() {
}
f();
echo file_get_contents($input); // <- matched
echo file_get_contents($unrelated);
But it will not find anything here:
$input = $_GET["src"];
if ($whatever) {
echo file_get_contents($input);
}
Maybe we can figure out what syntax/CLI options would be used to achieve that, but I still think that this is a task for things similar to CodeQL (although I don't think it supports PHP).
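To illustrate what "understanding the data flow" buys over a purely syntactic pattern, here is a minimal sketch of line-based taint tracking in Python. It is not phpgrep or CodeQL, and every name in it (`SOURCES`, `SINKS`, `find_tainted_sinks`) is hypothetical. It assumes one statement per line and ignores scopes, sanitizers, strings, and comments, so it will produce both false positives and false negatives. The point is only that tracking tainted variables finds the call regardless of how deeply it is nested.

```python
import re

# Assumption: illustrative source/sink lists, not taken from any real tool.
SOURCES = ("$_GET", "$_POST", "$_REQUEST")
SINKS = ("file_get_contents", "fopen", "readfile")

ASSIGN_RE = re.compile(r"(\$\w+)\s*=\s*([^;]+);")   # $x = <expr>;
CALL_RE = re.compile(r"\b(\w+)\s*\(([^;]*)\)\s*;")   # name(<args>);
VAR_RE = re.compile(r"\$\w+")                        # any PHP variable


def find_tainted_sinks(php_source: str) -> list[tuple[int, str]]:
    """Return (line number, sink name) for sink calls fed by tainted data."""
    tainted = set(SOURCES)
    hits = []
    for lineno, line in enumerate(php_source.splitlines(), start=1):
        # Propagate taint through simple assignments.
        m = ASSIGN_RE.search(line)
        if m and any(v in tainted for v in VAR_RE.findall(m.group(2))):
            tainted.add(m.group(1))
        # Report sink calls whose arguments mention a tainted variable.
        for call in CALL_RE.finditer(line):
            name, args = call.groups()
            if name in SINKS and any(v in tainted for v in VAR_RE.findall(args)):
                hits.append((lineno, name))
    return hits


NESTED_SAMPLE = """\
$input = $_GET["src"];
if ($whatever) {
    echo file_get_contents($input);
}
"""

if __name__ == "__main__":
    print(find_tainted_sinks(NESTED_SAMPLE))
```

Because the sketch walks every line instead of matching a fixed statement sequence, the call inside the `if` block is reported even though it sits one level deeper than the assignment.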
file_get_contents( '../server_side/' . $_POST['src'] . '.php');
What is the pattern to detect this code?
How do I build a rule that matches the situation above?
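For the simpler case above, where the superglobal appears directly inside the call, no data flow is needed. As a rough sketch (plain Python regular expressions, not phpgrep syntax; the names `DIRECT_USE_RE` and `find_direct_uses` are hypothetical), one could flag any single-line `file_get_contents(...)` call whose argument text mentions a superglobal. It assumes the call fits on one line and does not understand strings or comments.

```python
import re

# Assumption: an illustrative list of request superglobals.
SUPERGLOBAL = r"\$_(?:GET|POST|REQUEST|COOKIE)\b"

# file_get_contents( ...anything except ';'... <superglobal> ... )
DIRECT_USE_RE = re.compile(
    r"\bfile_get_contents\s*\([^;]*?" + SUPERGLOBAL + r"[^;]*\)"
)


def find_direct_uses(php_source: str) -> list[int]:
    """Return 1-based line numbers where a superglobal is passed directly."""
    return [
        lineno
        for lineno, line in enumerate(php_source.splitlines(), start=1)
        if DIRECT_USE_RE.search(line)
    ]


DIRECT_SAMPLE = """\
file_get_contents( '../server_side/' . $_POST['src'] . '.php');
file_get_contents($input);
"""

if __name__ == "__main__":
    print(find_direct_uses(DIRECT_SAMPLE))
```

Only the first line of the sample is reported; the second passes an ordinary variable and would need the assignment-tracking approach to be classified.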