microcosm-cc / bluemonday

bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS
https://github.com/microcosm-cc/bluemonday
BSD 3-Clause "New" or "Revised" License
3.14k stars 176 forks source link

Use of strings.ToLower() incorrectly escapes chars and allows for insertion of scripts #56

Closed buro9 closed 6 years ago

buro9 commented 6 years ago

Reported by email:

package main

import (
    "fmt"

    "github.com/microcosm-cc/bluemonday"
)

func TestEncoding() {
    p := bluemonday.NewPolicy()

    original := "<scr\u0130pt>&lt;script>alert(/XSS/)&lt;/script>"
    html := p.Sanitize(original)

    // Output:
    // Original: <scrİpt>&lt;script>alert(/XSS/)&lt;/script>
    // Sanitized: <script>alert(/XSS/)</script>
    fmt.Printf("Original: %s\nSanitized: %s\n",
        original, html)
}

func main() {
    TestEncoding()
}

Note that this is more severe than even the original reporter realised as this works on the NewPolicy which is a blank policy.

An explanation was provided:

This one occurs because you doesn't escape script/style tag contents: https://github.com/microcosm-cc/bluemonday/blob/master/sanitize.go#L219-L228 The trick is using symbol \u0130 (İ) that converts to i by strings.ToLower, how to find it: https://play.golang.org/p/jDMRCSNigR7

Investigation reveals that strings.ToLower() was not even required, and could be omitted which results in the expected (safe) behaviour.

A change is coming in a moment.

Credit to Yandex and @buglloc for reporting this.