Happstack / happstack-server

An HTTP Server
BSD 3-Clause "New" or "Revised" License

unicode safety of percentDecode #25

Open claudeha opened 8 years ago

claudeha commented 8 years ago

http://hackage.haskell.org/package/happstack-server-7.4.6.1/docs/Happstack-Server-SURI.html#v:percentDecode

appears not to be Unicode-safe: a single code point can be encoded as multiple percent escapes like %XX%YY%ZZ (probably its UTF-8 bytes). For example, → is encoded as %E2%86%92, which percentDecode mangles into â (followed by two invisible control characters) when viewed in my browser.

Probably the way to fix it would be to assume ASCII except for the % escapes, collect the decoded bytes into a ByteString, and decode that with Data.Text.Encoding's decodeUtf8With (using an error handler that doesn't crash and substitutes replacement characters), or something similar.
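A minimal sketch of that idea, not the library's actual code: the names percentDecodeUtf8 and toBytes are hypothetical, and the sketch assumes the bytestring and text packages are available. One choice here goes beyond the suggestion above: literal non-ASCII characters in the input are UTF-8 encoded rather than rejected.

```haskell
import Data.Char (digitToInt, isHexDigit)
import Data.Word (Word8)
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8With, encodeUtf8)
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical replacement for percentDecode: collect raw bytes first,
-- then decode the whole byte string as UTF-8 in one step.
percentDecodeUtf8 :: String -> String
percentDecodeUtf8 = T.unpack . decodeUtf8With lenientDecode . B.pack . toBytes
  where
    toBytes :: String -> [Word8]
    toBytes [] = []
    -- A valid %XX escape contributes exactly one byte.
    toBytes ('%':x1:x2:s)
      | isHexDigit x1 && isHexDigit x2 =
          fromIntegral (digitToInt x1 * 16 + digitToInt x2) : toBytes s
    -- Anything else (including a stray '%') is passed through as its
    -- UTF-8 encoding, which is a single byte for ASCII characters.
    toBytes (c:s) = B.unpack (encodeUtf8 (T.singleton c)) ++ toBytes s
```

With this sketch, percentDecodeUtf8 "%E2%86%92" yields the single character → (U+2192). Because lenientDecode is used, escapes that do not form valid UTF-8 are replaced with U+FFFD instead of raising an exception.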

ddssff commented 8 years ago

I don't quite understand. Could you give a worked example of how this fails?

claudeha commented 8 years ago

percentDecode.hs for testing without having to install happstack-server:

import Data.Char

percentDecode :: String -> String
percentDecode [] = ""
percentDecode ('%':x1:x2:s) | isHexDigit x1 && isHexDigit x2 =
    chr (digitToInt x1 * 16 + digitToInt x2) : percentDecode s
percentDecode (c:s) = c : percentDecode s

main :: IO ()
main = do
  putStrLn "→"
  putStrLn (percentDecode "%E2%86%92") -- percent encoded "→" copy-pasted from browser address bar

output:

$ runghc percentDecode.hs
→
â
$ runghc percentDecode.hs | hd
00000000  e2 86 92 0a c3 a2 c2 86  c2 92 0a                 |...........|
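To make the hex dump easier to read, here is a small sketch that lists the code points percentDecode produces. It repeats the definition from the test file above; codePoints is a hypothetical helper added only for illustration.

```haskell
import Data.Char (chr, digitToInt, isHexDigit, ord)

-- Same definition as in percentDecode.hs above.
percentDecode :: String -> String
percentDecode [] = ""
percentDecode ('%':x1:x2:s) | isHexDigit x1 && isHexDigit x2 =
    chr (digitToInt x1 * 16 + digitToInt x2) : percentDecode s
percentDecode (c:s) = c : percentDecode s

-- Hypothetical helper: the numeric code point of every character.
codePoints :: String -> [Int]
codePoints = map ord
```

codePoints (percentDecode "%E2%86%92") is [0xE2, 0x86, 0x92], that is, three separate characters U+00E2, U+0086 and U+0092, while codePoints "→" is [0x2192]. When putStrLn writes those three characters in a UTF-8 locale, each becomes two bytes (c3 a2, c2 86, c2 92), which matches the second line of the dump.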