Serialization benchmarks are invalid

jberryman commented 8 years ago

We're not serializing anything in 2 ns ;)

I'm sorry I don't have a PR for you, as I didn't care about packman and ripped it out and added some of my own benchmarks. But this sort of thing should yield meaningful numbers:

class Serialize lib a where
    serialize :: lib -> a -> BS.ByteString
    deserialize :: lib -> BS.ByteString -> a

instance (B.Binary a, NFData a) => Serialize Binary a where
    serialize   _ = LBS.toStrict . B.encode
    deserialize _ = B.decode . LBS.fromStrict
...
        bgroup "serialization"
          [ bench "binary"  $ whnf (serialize Binary) tree
....

Thanks for putting this together; it was very helpful!

osa1 commented 8 years ago

Indeed, 2ns doesn't make any sense. I'm quite confused about why that benchmark can run in 2ns, so I'll spend some time trying to understand what's going on here. I'm clearly generating a strict ByteString and even forcing it. Weird.

osa1 commented 8 years ago

So I wrote this small file as a test:

{-# LANGUAGE DeriveDataTypeable, FlexibleContexts, FlexibleInstances,
             LambdaCase, MultiParamTypeClasses #-}

module Main where

import Criterion.Main
import Lib

runBench :: IO ()
runBench = do
  defaultMain
    [ env (generateBalancedTree 22) $ \tree ->
        bgroup "serialization"
          [
            -- bench "binary"      $ nf (serialize Binary) tree,
            bench "binary-io"   $ nfIO (return (serialize Binary tree))
          ]
    ]

main :: IO ()
main = runBench

This is the related STG:

Main.$wrunBench
  :: Lib.BinTree GHC.Types.Int
     -> (# GHC.Base.String, [Criterion.Types.Benchmark] #) =
    \r srt:SRT:[rA :-> Data.ByteString.Builder.toLazyByteString,
                rB :-> Data.ByteString.Lazy.toStrict,
                rP9 :-> Data.Binary.Class.$fBinaryInt, r2R3 :-> Main.main5,
                r2R6 :-> Main.main6] [w_sfvK]
        let {
          lvl3_sfvL :: () =
              \u srt:SRT:[rA :-> Data.ByteString.Builder.toLazyByteString,
                          rB :-> Data.ByteString.Lazy.toStrict,
                          rP9 :-> Data.Binary.Class.$fBinaryInt] []
                  let {
                    sat_sfvP :: Data.ByteString.Builder.Internal.Builder =
                        \u srt:SRT:[rP9 :-> Data.Binary.Class.$fBinaryInt] []
                            case Lib.$w$cput Data.Binary.Class.$fBinaryInt w_sfvK of _ {
                              (#,#) _ ww2_sfvO -> ww2_sfvO;
                            };
                  } in 
                    case
                        Data.ByteString.Builder.toLazyByteString sat_sfvP
                    of
                    sat_sfvQ
                    { __DEFAULT ->
                          case Data.ByteString.Lazy.toStrict sat_sfvQ of _ {
                            Data.ByteString.Internal.PS _ _ _ _ -> () [];
                          };
                    }; } in
        let {
          $wgo12_sfvW
            :: GHC.Prim.Int#
               -> GHC.Prim.State# GHC.Prim.RealWorld
               -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #) =
              sat-only \r srt:SRT:[] [ww_sfvX w1_sfvY]
                  case <=# [ww_sfvX 0#] of sat_sfvZ {
                    __DEFAULT ->
                        case tagToEnum# [sat_sfvZ] of _ {
                          GHC.Types.False ->
                              case seq# [lvl3_sfvL w1_sfvY] of _ {
                                (#,#) ipv_sfw2 _ ->
                                    case -# [ww_sfvX 1#] of sat_sfw4 {
                                      __DEFAULT -> $wgo12_sfvW sat_sfw4 ipv_sfw2;
                                    };
                              };
                          GHC.Types.True -> (#,#) [w1_sfvY GHC.Tuple.()];
                        };
                  }; } in
        let {
          sat_sfw9 :: Criterion.Types.Benchmarkable =
              \r srt:SRT:[] [eta1_sfw5 eta2_sfw6]
                  case eta1_sfw5 of _ {
                    GHC.Int.I64# ww1_sfw8 -> $wgo12_sfvW ww1_sfw8 eta2_sfw6;
                  }; } in
        let {
          sat_sfwa :: Criterion.Types.Benchmark =
              NO_CCS Criterion.Types.Benchmark! [Main.main5 sat_sfw9]; } in
        let {
          sat_sfwb :: [Criterion.Types.Benchmark] =
              NO_CCS :! [sat_sfwa GHC.Types.[]];
        } in  (#,#) [Main.main6 sat_sfwb];

Important part is, this is the part that serializes the tree:

        let {
          lvl3_sfvL :: () =
              \u srt:SRT:[rA :-> Data.ByteString.Builder.toLazyByteString,
                          rB :-> Data.ByteString.Lazy.toStrict,
                          rP9 :-> Data.Binary.Class.$fBinaryInt] []
                  let {
                    sat_sfvP :: Data.ByteString.Builder.Internal.Builder =
                        \u srt:SRT:[rP9 :-> Data.Binary.Class.$fBinaryInt] []
                            case Lib.$w$cput Data.Binary.Class.$fBinaryInt w_sfvK of _ {
                              (#,#) _ ww2_sfvO -> ww2_sfvO;
                            };
                  } in 
                    case
                        Data.ByteString.Builder.toLazyByteString sat_sfvP
                    of
                    sat_sfvQ
                    { __DEFAULT ->
                          case Data.ByteString.Lazy.toStrict sat_sfvQ of _ {
                            Data.ByteString.Internal.PS _ _ _ _ -> () [];
                          };
                    }; } in

Note that this is a local definition, an updateable thunk. So while the benchmark loop that runs this multiple times actually run this multiple times ...

                              case seq# [lvl3_sfvL w1_sfvY] of _ {
                                (#,#) ipv_sfw2 _ ->
                                    case -# [ww_sfvX 1#] of sat_sfw4 {
                                      __DEFAULT -> $wgo12_sfvW sat_sfw4 ipv_sfw2;
                                    };

... nothing really happens after the first iteration.

GHC is so good it makes very hard to benchmark Haskell programs sometimes.

This is Core of the same code:

    let {
      lvl3_sfuG :: ()
      lvl3_sfuG =
        case Data.ByteString.Lazy.toStrict
               (Data.ByteString.Builder.toLazyByteString
                  (case Lib.$w$cput
                          @ Int binary-0.8.3.0:Data.Binary.Class.$fBinaryInt w_sftD
                   of _ { (# ww1_ienD, ww2_ienE #) ->
                   ww2_ienE
                   }))
        of _
        { Data.ByteString.Internal.PS dt_ieuC dt1_ieuD dt2_ieuE dt3_ieuF ->
        ghc-prim-0.5.0.0:GHC.Tuple.()
        } } in

osa1 commented 8 years ago

I know how to fix this and I think we need to patch criterion to avoid this in the future... Patch will be coming this evening.

Thanks for reporting, btw.

haskell-perf / serialization

Serialization benchmarks are invalid #1