microcosm-cc / bluemonday

bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS
https://github.com/microcosm-cc/bluemonday
BSD 3-Clause "New" or "Revised" License

Closing anchor and font tags mixed up #18

Closed: wingedpig closed this issue 9 years ago

wingedpig commented 9 years ago

The sample program below produces output in which the closing anchor tag is missing and an erroneous closing font tag appears in its place:

package main

import (
    "fmt"

    "github.com/microcosm-cc/bluemonday"
)

func main() {
    in := `<font face="Arial">No link here. <a href="http://link.com">link here</a>.</font> Should not be linked here.`
    p := bluemonday.UGCPolicy()
    p.AllowAttrs("color").OnElements("font")
    fmt.Printf("'%s'\n", p.Sanitize(in))
}

Am I doing something wrong, or is this a bug?

manfer commented 9 years ago

bluemonday makes use of the Go html package, whose documentation states:

Package html implements an HTML5-compliant tokenizer and parser.

So maybe bluemonday only works with HTML5. The font tag is not a valid HTML5 element.

buro9 commented 9 years ago

It's not an HTML5 issue. I use the Go package's parser to tokenise the input, but it should still output the correct tags.

This appears to be an error related to how empty tags are managed and matched. I'll look at it now.
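
As a rough illustration of the kind of matching involved, here is a minimal sketch of one way a token-based sanitizer could pair closing tags with the opening tags it kept or dropped. It is not bluemonday's actual code: the `token` type, the `filter` function, and the `keep` callback are hypothetical names, and the token stream is hand-written rather than produced by a real tokenizer.

```go
package main

import (
	"fmt"
	"strings"
)

// token is a hypothetical, simplified stand-in for a tokenizer's output.
type token struct {
	kind string // "start", "end", or "text"
	name string // element name for start/end tokens
	raw  string // original start tag, or the text content
}

// frame records one open element and whether its start tag was emitted.
type frame struct {
	name    string
	emitted bool
}

// filter drops every start tag for which keep returns false, and drops
// exactly the end tags that pair with those dropped start tags.
func filter(tokens []token, keep func(name string) bool) string {
	var out strings.Builder
	var open []frame // stack of currently open elements

	for _, t := range tokens {
		switch t.kind {
		case "start":
			ok := keep(t.name)
			open = append(open, frame{name: t.name, emitted: ok})
			if ok {
				out.WriteString(t.raw)
			}
		case "end":
			// Match against the nearest open element with the same name,
			// so a dropped <font> cannot swallow the </a> of a kept <a>.
			for i := len(open) - 1; i >= 0; i-- {
				if open[i].name != t.name {
					continue
				}
				if open[i].emitted {
					out.WriteString("</" + t.name + ">")
				}
				open = append(open[:i], open[i+1:]...)
				break
			}
		case "text":
			out.WriteString(t.raw)
		}
	}
	return out.String()
}

func main() {
	// Hand-written tokens mirroring the structure of the input in the report.
	tokens := []token{
		{"start", "font", `<font>`},
		{"text", "", "No link here. "},
		{"start", "a", `<a href="http://link.com">`},
		{"text", "", "link here"},
		{"end", "a", ""},
		{"text", "", "."},
		{"end", "font", ""},
		{"text", "", " Should not be linked here."},
	}
	// Assumption for the sketch: the font start tag is dropped.
	dropFont := func(name string) bool { return name != "font" }
	fmt.Println(filter(tokens, dropFont))
}
```

Keeping the emitted/dropped decision next to each open element name means a closing tag is resolved by name, not by whichever tag was dropped most recently.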

buro9 commented 9 years ago

@wingedpig, you should find this works now. Please test it and get back to me if it still isn't working for you.

Thanks for reporting it.