russross / blackfriday

Blackfriday: a markdown processor for Go

Square brackets in fragment #568

Open ghost opened 4 years ago

ghost commented 4 years ago

This code:

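package main

import (
	"fmt"

	"gopkg.in/russross/blackfriday.v2"
)

func main() {
	// A bare URL whose fragment contains square brackets.
	b1 := []byte("http://nim-lang.github.io/Nim/system#echo,varargs[typed,]")
	// Run renders Markdown to HTML using the default extensions, which include autolinking.
	s1 := blackfriday.Run(b1)
	fmt.Printf("%s\n", s1)
}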

Produces this (newline added for readability):

<p><a href="http://nim-lang.github.io/Nim/system#echo,varargs[typed,">http://nim
-lang.github.io/Nim/system#echo,varargs[typed,</a>]</p>

The issue is that the final bracket is not included as part of the URL. The obvious comment at this point would be "just encode it", but I was curious what the standard says. RFC 3986 (2005) lists these characters under "gen-delims":

gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

which become part of "reserved":

reserved    = gen-delims / sub-delims

This part of the spec is important:

If data for a URI component would conflict with a reserved character's purpose as a delimiter, then the conflicting data must be percent-encoded before the URI is formed.

https://tools.ietf.org/html/rfc3986#section-2.2
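For reference, here is a minimal sketch of what the "just encode it" route would look like. The helper name encodeFragmentBrackets is hypothetical and not part of blackfriday; it only percent-encodes square brackets that appear after the first "#", leaving any IP-literal brackets in the host untouched.

```go
package main

import (
	"fmt"
	"strings"
)

// bracketEncoder percent-encodes only "[" and "]".
var bracketEncoder = strings.NewReplacer("[", "%5B", "]", "%5D")

// encodeFragmentBrackets (hypothetical helper) percent-encodes square
// brackets in the fragment of rawURL, i.e. everything after the first "#".
// URLs without a fragment are returned unchanged.
func encodeFragmentBrackets(rawURL string) string {
	i := strings.IndexByte(rawURL, '#')
	if i < 0 {
		return rawURL
	}
	return rawURL[:i+1] + bracketEncoder.Replace(rawURL[i+1:])
}

func main() {
	fmt.Println(encodeFragmentBrackets("http://nim-lang.github.io/Nim/system#echo,varargs[typed,]"))
	// Prints: http://nim-lang.github.io/Nim/system#echo,varargs%5Btyped,%5D
}
```

Whether this encoding step should be required at all is the question the rest of this issue looks at.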

So the question is, does ] in the fragment conflict with its purpose as a delimiter? Brackets are only used as part of an IP-literal:

https://tools.ietf.org/html/rfc3986#section-3.2.2

Example:

http://[::1]/

https://stackoverflow.com/questions/40189084/-/46711717
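To illustrate the delimiter role, here is a small sketch using Go's standard net/url package (my choice for the demonstration, not something blackfriday necessarily uses internally). The helper name hostParts is made up for this example. It shows the brackets being treated as delimiters around the IPv6 address in the host component.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostParts (hypothetical helper) parses rawURL and returns the host as
// written in the URL and the hostname with any IP-literal brackets removed.
func hostParts(rawURL string) (host, hostname string, err error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", "", err
	}
	return u.Host, u.Hostname(), nil
}

func main() {
	host, hostname, err := hostParts("http://[::1]/")
	if err != nil {
		panic(err)
	}
	fmt.Println(host)     // [::1]
	fmt.Println(hostname) // ::1
}
```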

So it's not possible for square brackets in a fragment to conflict, hence they do not need to be percent-encoded. Note that the GitHub Markdown parser handles this:

http://nim-lang.github.io/Nim/system#echo,varargs[typed,]
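As another data point, Go's standard net/url parser also accepts the unencoded brackets and keeps them in the fragment. This is only a sketch of how one parser behaves; it says nothing about how blackfriday's autolinker is implemented, and the helper name fragmentOf is made up for this example.

```go
package main

import (
	"fmt"
	"net/url"
)

// fragmentOf (hypothetical helper) parses rawURL with net/url and returns
// the decoded fragment, i.e. the part after "#".
func fragmentOf(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	return u.Fragment, nil
}

func main() {
	frag, err := fragmentOf("http://nim-lang.github.io/Nim/system#echo,varargs[typed,]")
	if err != nil {
		panic(err)
	}
	fmt.Println(frag) // echo,varargs[typed,]
}
```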