haskell / criterion

A powerful but simple library for measuring the performance of Haskell code.
http://www.serpentine.com/criterion
BSD 2-Clause "Simplified" License
503 stars 86 forks source link

Presence of double quotes in benchmark name produces broken HTML report #235

Closed adamgundry closed 2 years ago

adamgundry commented 3 years ago

Similarly to #202 and #204, calling bench with a string containing double quotes leads to a HTML report that produces a JavaScript error in the browser console. Tested with criterion 1.5.9.0.

For example, generating a HTML report with this variant of the code from #202 illustrates the issue:

module Main where

import Criterion
import Criterion.Main

main :: IO ()
main = defaultMain
    [ env (return ()) $
       \ ~() -> bgroup "\"oops\"" [bench "dummy" $ nf id ()]
    ]
RyanGlScott commented 3 years ago

This is a regression that was introduced in #232, which changed the way that JSON is escaped. cc'ing @considerate, the author.

considerate commented 3 years ago

@RyanGlScott @adamgundry I see, the escaping of \ with respect to HTML, converting it to \u005c, broke escaping with respect to JSON. The easy fix would be to remove https://github.com/haskell/criterion/blob/2fdd87ba2bc3bef5af7081feaed8e6cb2735bf8d/Criterion/Report.hs#L132 but I'm not sure if that could cause issues with the report data escaping the HTML tag.

considerate commented 3 years ago

I'll have to come back to this some other day when I'm more well rested. However, I've verified that the offending field parses correctly in the same browser that fails on the full report data:

Running this in the developer console of both chromium and firefox works for me: JSON.parse('[{"reportName":"\u005c"oops\u005c"\u002fdummy","reportOutliers":{"highSevere":0,"highMild":0,"lowMild":0,"samplesSeen":43,"lowSevere":0}}]')

EDIT: This may be because JavaScript is unescaping the unicode escape sequences before JSON.parse is called. The JSON standard requires the escape sequence \" to be exactly those two characters, that is \u005c" won't get unescaped to \" before evaluating to the double quote character (https://tools.ietf.org/html/rfc8259 section 7). This means, that removing line 132 is likely motivated, my only concern is that it might allow for injecting HTML (I need to refresh my memory on what is an escape sequence in HTML).

considerate commented 3 years ago

I think the same argument applies for the escape sequences \n \r \b \t, and \/ so this means that we should probably remove lines 128 through 132 from Report.hs unless these characters can be used to escape the HTML tag somehow, however I don't think that's possible.

RyanGlScott commented 2 years ago

I've uploaded criterion-1.5.12.0 to Hackage with a fix.