bcosca / fatfree

A powerful yet easy-to-use PHP micro-framework designed to help you build dynamic and robust Web applications - fast!
2.66k stars 446 forks source link

Base->clean() doesn't mitigate XSS/code injection attacks #850

Open Rayne opened 9 years ago

Rayne commented 9 years ago

Base->clean() doesn't mitigate XSS/code injection attacks as it doesn't remove malicious tag attributes.

$f3->route('GET /test/clean', function(Base $f3) {
    echo $f3->clean('<p><a href="http://127.0.0.1" onclick="alert(\'SURPRISE\');">Link</a></p>', 'a');
});

Generates:

<a href="http://127.0.0.1" onclick="alert('SURPRISE');">Link</a>

For the sake of completeness: Base->scrub() is also affected.

KOTRET commented 9 years ago

avoiding xss is more complex than removing characters. for output always use the esc-function in templates.

There is another bug in this method: if you pass * as 2nd arg, the tags should be removed. They are not, because the method only strips tags if the 2nd arg is not *

Rayne commented 9 years ago

If the * feature would be implemented as mentioned by @KOTRET, it would be superfluous and redundant. The documentation (also on the website) states, that all tags will be removed except the enumerated ones.

It could be possible that the * feature was originally designed to allow all tags and remove only non-printable characters.

ikkez commented 9 years ago

Hi. So your request is to add cleaning for onload and all other HTML events to the clean method?

Rayne commented 9 years ago

No, that would be a little ridiculous. I have different ideas and I think the second one is the best. Please share your ideas, too.

We also shouldn't forget to mention (or remove) the * feature.

F3 v4 could have three methods for cleaning values: one for removing non-enumerated tags and tag attributes, one for removing non-printable characters and one combining both (similar to Base->clean()).

Jiab77 commented 5 years ago

Hi all, first I'd like to thanks the devs for their great work, really impressed! Regarding this method, it should also encode characters to mitigate code passed using encoded strings...

This is what I'm using actually:

function sanitize($string, $html = false, $extra = false) {
    if (isset($string) && !empty($string)) {
        // Main clean
        if ($html === true) {
            $string = htmlentities(strip_tags(trim($string)), ENT_QUOTES, 'UTF-8');
        }
        else {
            $string = htmlspecialchars(strip_tags(trim($string)), ENT_QUOTES, 'UTF-8');
        }

        // Extra clean
        if ($extra === true) {
            // Potentialy dangerous items
            $danger = ["'", '"', '`', '../', '..\\', './', '.\\', 'javascript', ':', ';'];
            // Safe replacement
            $replace = '';
            // Cleaning
            $string = str_replace($danger, $replace, $string);
        }

        return $string;
    }
    else {
        return false;
    }
}

You can decide to encode the given string as html entities and / or remove other extra dangerous characters.

n0nag0n commented 4 years ago

I think this really only applies to fields where you need to keep html for some reason. If you need to keep html, then the clean() method is going to be very underwhelming. There is the HTMLPurifier library available if you need to have a lot of control over what html is used or not used. I've used it and highly recommend it although it is a big cumbersome.

A user (@Dabcorp) in the slack channel created their own xss class for cleaning data. See Repo Here. Still though, if you need fine grained control, HTML Purifier is the way to go.