haskell / aeson

A fast Haskell JSON library
Other
1.26k stars 321 forks source link

`Generic` instance with `omitNothingFields = True` is very slow #1030

Open brprice opened 1 year ago

brprice commented 1 year ago

There seems to be exponential time complexity with genericToEncoding defaultOptions {omitNothingFields = True}, although this seems to depend on the ghc version used. Consider the following code (which has been somewhat minimised, but still depends on a fair bit of aeson)

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DerivingStrategies #-}

module Main (main) where

import Data.Aeson.Types.ToJSON (
  ToJSON(toEncoding,toJSON),
  genericToEncoding,
 )
import Data.Aeson.Types.Internal (
  defaultOptions,
  omitNothingFields,
 )
import Data.Aeson.Encoding.Internal (encodingToLazyByteString)
import GHC.Generics (Generic)
import System.Environment (getArgs)

import qualified Data.ByteString.Lazy as L

data Tree
  = Node () [Tree]
  deriving stock (Show, Generic)

instance ToJSON Tree where
  toEncoding = genericToEncoding defaultOptions {omitNothingFields = True}

tree' :: Int -> Tree
tree' 0 = Node () []
tree' n = Node () [tree' $ n-1]

main :: IO ()
main = do
  [n'] <- getArgs
  let n = read n'
  putStrLn "about to print tree"
  let tree = tree' n
  print $ tree
  putStrLn "about to ToJSON tree"
  print $ toJSON tree
  putStrLn "about to encode tree"
  print $ encode tree
  putStrLn "done"

---- From aeson

encode :: (ToJSON a) => a -> L.ByteString
encode = encodingToLazyByteString . toEncoding
I get results such as the following with ghc 9.2.7 Command (with ghc 9.2.7) Mean [ms] Min [ms] Max [ms] Relative
cabal run -O1 aeson-slow -- 1 53.5 ± 0.1 53.4 53.9 1.00 ± 0.00
cabal run -O1 aeson-slow -- 2 53.5 ± 0.0 53.4 53.6 1.00 ± 0.00
cabal run -O1 aeson-slow -- 3 53.5 ± 0.1 53.4 53.9 1.00 ± 0.00
cabal run -O1 aeson-slow -- 4 53.5 ± 0.0 53.4 53.6 1.00 ± 0.00
cabal run -O1 aeson-slow -- 5 53.5 ± 0.1 53.4 53.9 1.00 ± 0.00
cabal run -O1 aeson-slow -- 6 53.5 ± 0.0 53.3 53.6 1.00 ± 0.00
cabal run -O1 aeson-slow -- 7 53.5 ± 0.1 53.3 54.0 1.00 ± 0.00
cabal run -O1 aeson-slow -- 8 53.5 ± 0.1 53.3 54.0 1.00 ± 0.00
cabal run -O1 aeson-slow -- 9 53.4 ± 0.1 53.3 53.6 1.00
cabal run -O1 aeson-slow -- 10 54.0 ± 2.4 53.2 64.0 1.01 ± 0.05
cabal run -O1 aeson-slow -- 11 63.6 ± 0.2 63.4 64.3 1.19 ± 0.00
cabal run -O1 aeson-slow -- 12 63.6 ± 0.2 63.3 64.4 1.19 ± 0.00
cabal run -O1 aeson-slow -- 13 63.6 ± 0.2 63.4 64.4 1.19 ± 0.00
cabal run -O1 aeson-slow -- 14 71.5 ± 4.2 63.2 74.9 1.34 ± 0.08
cabal run -O1 aeson-slow -- 15 82.0 ± 3.7 73.2 84.0 1.53 ± 0.07
cabal run -O1 aeson-slow -- 16 103.5 ± 0.1 103.4 103.9 1.94 ± 0.00
cabal run -O1 aeson-slow -- 17 143.4 ± 0.0 143.4 143.5 2.68 ± 0.00
cabal run -O1 aeson-slow -- 18 226.5 ± 4.8 223.2 233.6 4.24 ± 0.09
cabal run -O1 aeson-slow -- 19 391.6 ± 15.6 383.4 434.0 7.33 ± 0.29
cabal run -O1 aeson-slow -- 20 722.9 ± 12.0 713.9 753.9 13.53 ± 0.22
cabal run -O1 aeson-slow -- 21 1382.9 ± 14.5 1373.5 1413.9 25.87 ± 0.27
cabal run -O1 aeson-slow -- 22 2720.9 ± 40.5 2674.2 2774.0 50.91 ± 0.76
cabal run -O1 aeson-slow -- 23 5379.8 ± 49.2 5304.0 5464.0 100.65 ± 0.93
cabal run -O1 aeson-slow -- 24 10575.9 ± 32.6 10523.8 10613.9 197.87 ± 0.68
cabal run -O1 aeson-slow -- 25 21091.9 ± 54.6 21033.8 21203.5 394.61 ± 1.18
cabal run -O1 aeson-slow -- 26 42325.9 ± 264.1 41984.1 42673.9 791.89 ± 5.08
Note that this seems rather dependent on the specific ghc version used -- I don't know whether this implies that there is a ghc regression/performance fix and this is really an upstream issue, or whether it means that aeson relies on some optimization firing and that this can be unreliable (and if it is this second option, I don't know if it is possible to make it more reliable). Running a sweep over different ghc versions I see the following GHC version Mean [ms] Min [ms] Max [ms] Relative
GHC 8.10.7 1.0 ± 0.2 1.0 3.8 1.03 ± 0.19
GHC 9.0.2 1.0 ± 0.0 1.0 1.5 1.00
GHC 9.2.4 40564.2 ± 125.1 40362.9 40783.3 40655.26 ± 1458.36
GHC 9.2.5 39164.4 ± 86.0 38997.4 39267.6 39252.38 ± 1405.47
GHC 9.2.6 39244.7 ± 81.5 39111.9 39365.8 39332.87 ± 1408.07
GHC 9.2.7 38837.3 ± 141.0 38603.0 39147.1 38924.50 ± 1398.27
GHC 9.4.2 43628.6 ± 752.3 42621.7 44991.6 43726.59 ± 1735.12
GHC 9.4.3 1272.6 ± 12.0 1261.6 1291.6 1275.50 ± 47.14
GHC 9.4.4 12.2 ± 0.1 11.8 12.4 12.18 ± 0.45
GHC 9.4.5 12.2 ± 0.2 11.7 12.6 12.19 ± 0.46
phadej commented 1 year ago

I can reproduce this. The recent changes in master (and upcoming 2.2.0.0) also didn't made this go away.