WordPress / WordPress-Coding-Standards

PHP_CodeSniffer rules (sniffs) to enforce WordPress coding conventions
MIT License
2.56k stars 487 forks source link

False alarm of unsanitized input variable? #2208

Closed smileBeda closed 1 year ago

smileBeda commented 1 year ago

I echo the value of $_GET['s'] on a search results page, like so: Search results for: <?php echo isset( $_GET['s'] ) ? esc_html( wp_unslash( $_GET['s'] ) ) : ''; ?>

WPCS tells me: Detected usage of a non-sanitised input variable: $_GET['s']

Indeed, it is not sanitised. It is escaped! Because it is echoed, we do not sanitise it: we do not store this anywhere. We echo it, thus, escape, and that is done.

PS yes, I know it also throws the error of "usage of GET without nonce verification", but that is out of scope of this issue.

Either once again I do not see the tree in the forest, or this is a false alarm I would appreciate it, if someone could confirm or deny

Thank you.

dingo-d commented 1 year ago

@smileBeda Does the same happen when you use develop branch? What version of WPCS are you using?

smileBeda commented 1 year ago

Stale version 2.3.0 wich is the only version available since 3 years now.

I have given up long time ago on "trying out dev versions", since each time I do that, I spend 5 hours fixing the mess and versioning back.

jrfnl commented 1 year ago

@smileBeda Input sanitization and output escaping are different principles, though yes, in this case, they could overlap. However, if this is the only code to work with, I'd still be missing some additional input validation/sanitization. I mean: you verify that $_GET['s'] exists, but not that it is usable for your purposes. What if $_GET['s'] is an array ? what if $_GET['s'] is an empty string ?

And now you may say: yeah, but I do the nonce verification elsewhere, I know it's coming from this particular form and that the form has a string input. True, but is your plugin/theme the only thing running on your site ? While hopefully this shouldn't happen, other plugins/themes could still adjust the data and write it back to $_GET['s'] (anyone reading this: please do not get any ideas, doing so is really bad practice!).

You probably will also want to use sanitize_text_field() to ensure potentially broken UTF8 fragments or octets get removed.

Does that help ?

smileBeda commented 1 year ago

@jrfnl , you are of course right with everything, yet, that exact same could be said about any $random_var that gets echoed, and the WPCS will not tell me to sanitise any other var when echoing it, only when storing it.

My point is, that if we echo we do not store, and if someone manages to put an array in the s GET param, an echo esc_html() will ... print out whatever of the array passes esc_html() (which is not really a lot). "Who cares?" Nothing broke, nothing bad got into the database, and it is exactly what was input to the GET, which is exactly what I want to print (without malicious code like script or else breaking tags.

esc_html already should take care of UTF8 (wp_check_invalid_utf8), and as well special chars (_wp_specialchars).

Literally WPCS here is basically asking me to do this:

// Of course you want to check nonces, unslash etc - not done here for the sake of example
$var = sanitize_{take your pick}( $_GET['s'] );// make sure it gets sanitised in all possible ways.
echo esc_html( $var ); // make sure it gets escaped in all possible ways.

Now it is both sanitised and escaped, yet, we wasted the resources in saving it unnecessarily into a variable, sanitising what does not need to be sanitized as it never gets stored.

I am not convinced that this alarm here is reasonable. In the end, it is echoing an escaped value, really not much can go wrong? I might miss the specific case where this is dangerous if not sanitized as opposed to any other "value" that gets echoed. Since sanitise also will not take care of empty strings, I see that argument as not a reason to use sanitisation. To avoid empty string I'd have to to a empty() check, which could be interpreted as a validation, but not sanitisation IMO?

GaryJones commented 1 year ago

that exact same could be said about any $random_var that gets echoed, and the WPCS will not tell me to sanitise any other var when echoing it, only when storing it.

For me, the difference is that $_GET is known to be an untrusted source, and therefore one which should be validated or sanitized immediately as it is accessed. This is best practice.

The fact that your code immediately echoes an escaped but unsanitized/unvalidated value is therefore not following a best practice. Why is it a best practice? Someone could access it, and do a bunch of other things with the value before echoing, or maybe the echoing happens in a different function, or a different file - WPCS wouldn't be able to track the latter (and not so easy the former), so WPCS highlights when a $_GET key is accessed without sanitization/validation. Maybe it's even different developers that write the access and echoing code, so WPCS needs to be as protective as possible here.

Now it is both sanitised and escaped, yet, we wasted the resources in saving it unnecessarily into a variable, sanitising what does not need to be sanitized as it never gets stored.

sanitize_text_field() does the following:

If I was displaying a "Search results for: ...", I know I'd want the value to be tidied up with those changes ready for escaping before display, even if it would only be the user who entered a malicious string to be affected.

At the end of the day, you're echoing an unsanitized/unvalidated value retrieved from an untrusted source, so the violation is not a false positive. If, for your quick echoing use, you disagree, then you can add a ignore comment to silence the reported violation.

jrfnl commented 1 year ago

Marking this as a candidate for closing as, as far as I can see, the original issue raised has been sufficiently answered.