vimeo / psalm

A static analysis tool for finding errors in PHP applications
https://psalm.dev
MIT License
5.5k stars 655 forks source link

Psalm's taint analysis behaves unexpectedly for different accesses to _POST #10919

Open talarcin opened 2 months ago

talarcin commented 2 months ago

Hey,

I am currently testing out Psalm's taint analysis feature. While testing it on some WordPress Plugins with known vulnerabilities I've seen that depending on the way the _POST array is accessed and values from it are passed to a custom taint sink, taint warnings are not reported consistently. The following code example should visualize the inconsistency.

<?php

/**
 * @psalm-taint-sink html $value
 *
 * Updates the value of an option that was already added.
 *
 * You do not need to serialize values. If the value needs to be serialized,
 * then it will be serialized before it is inserted into the database.
 * Remember, resources cannot be serialized or added as an option.
 *
 * If the option does not exist, it will be created.
 * This function is designed to work with or without a logged-in user. In terms of security,
 * plugin developers should check the current user's capabilities before updating any options.
 *
 * @param string $option Name of the option to update. Expected to not be SQL-escaped.
 * @param mixed $value Option value. Must be serializable if non-scalar. Expected to not be SQL-escaped.
 * @param string|bool $autoload Optional. Whether to load the option when WordPress starts up. For existing options,
 *                              `$autoload` can only be updated using `update_option()` if `$value` is also changed.
 *                              Accepts 'yes'|true to enable or 'no'|false to disable.
 *                              Autoloading too many options can lead to performance problems, especially if the
 *                              options are not frequently used. For options which are accessed across several places
 *                              in the frontend, it is recommended to autoload them, by using 'yes'|true.
 *                              For options which are accessed only on few specific URLs, it is recommended
 *                              to not autoload them, by using 'no'|false. For non-existent options, the default value
 *                              is 'yes'. Default null.
 * @return bool True if the value was updated, false otherwise.
 * @since 4.2.0 The `$autoload` parameter was added.
 *
 * @global wpdb $wpdb WordPress database abstraction object.
 *
 * @since 1.0.0
 */
function update_option($option, $value, $autoload = null)
{
}

$_POST['value'] = 'tainted_input';

$recognized_value = $_POST;
$new_nested_array_recognized = array($_POST);

$merged_array_unrecognized = array_merge(array(), $_POST);
$new_array_unrecognized = array('value' => $_POST['value']);
$unrecognized_value = $_POST['value'];

update_option('key', $recognized_value); // recognized
update_option('key', $new_nested_array_recognized); // recognized
update_option('key', $new_array_unrecognized); // unrecognized
update_option('key', $unrecognized_value); // unrecognized
update_option('key', $merged_array_unrecognized); // unrecognized

The update_option function is from WordPress and is writing options settings to the database. Psalm only detects the taint for $recognized_value and $new_nested_array_recognized. However, I would expect it to detect it for all five cases, since tainted HTML would be written to the database in each of them.

This is Psalm's output after running psalm --taint-analysis:

ERROR: TaintedHtml
at ~/GitHub/psalm-test/under-test/target/taint-two.php:49:22
Detected tainted HTML (see https://psalm.dev/245)
  $_POST
    <no known location>

  $recognized_value - under-test/target/taint-two.php:41:1
$recognized_value = $_POST;

  call to update_option - under-test/target/taint-two.php:49:22
update_option('key', $recognized_value); // recognized

ERROR: TaintedHtml
at ~/GitHub/psalm-test/under-test/target/taint-two.php:50:22
Detected tainted HTML (see https://psalm.dev/245)
  $_POST
    <no known location>

  array['0'] - under-test/target/taint-two.php:42:38
$new_nested_array_recognized = array($_POST);

  $new_nested_array_recognized - under-test/target/taint-two.php:42:1
$new_nested_array_recognized = array($_POST);

  call to update_option - under-test/target/taint-two.php:50:22
update_option('key', $new_nested_array_recognized); // recognized

------------------------------
2 errors found
------------------------------

Is this defined behavior or is there something not correct or incomplete with how Psalm handles the taint flow in this example?

Best Regards, Tuncay

psalm-github-bot[bot] commented 2 months ago

Hey @talarcin, can you reproduce the issue on https://psalm.dev? These will be used as phpunit tests when implementing the feature or fixing this bug.

orklah commented 2 months ago

reproducer: https://psalm.dev/r/887f950099

Note that I commented the value you assigned into $_POST because otherwise Psalm can tell it's not tainted.

It does leave 2 unrecognized cases that are definitely bugs

psalm-github-bot[bot] commented 2 months ago

I found these snippets:

https://psalm.dev/r/887f950099 ```php $_POST['value']); $unrecognized_value = $_POST['value']; update_option('key', $recognized_value); // recognized update_option('key', $new_nested_array_recognized); // recognized update_option('key', $new_array_unrecognized); // unrecognized update_option('key', $unrecognized_value); // unrecognized update_option('key', $merged_array_unrecognized); // unrecognized ``` ``` Psalm output (using commit 08afc45): ERROR: TaintedHtml - 34:33 - Detected tainted HTML ERROR: TaintedHtml - 34:33 - Detected tainted HTML ERROR: TaintedHtml - 34:33 - Detected tainted HTML ```