I should note that this problem exists for all versions of GHC 7.10.x based on my limited testing, but is particularly bad for GHC 7.10.3 for whatever reason.
I can at least get compilation to finish with GHC 7.10.2, so I've rolled back to that version for now.
@roboguy13 David - is this something you could look at?
Urgh - I'm 99% sure this is an aeson issue. For whatever reason, default implementations of GHC generics-related functions don't inline very well, and if you force them to be inlined via INLINE pragmas, they can result in insane memory usage. pandoc-types has also experienced a similar issue, and their workaround was to roll back from aeson-0.10 to aeson-0.9.
I don't know the root cause of the issue, but I speculate that if they removed all of the INLINE pragmas from Data.Aeson.Types.Generic's generic class methods, they might not have such a bad time. Unfortunately, @bos is pretty hard to get ahold of, so I don't know how soon this issue will be addressed...
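For concreteness, here's a minimal sketch of the pattern in question (with made-up names, not aeson's actual internals): a class whose default method dispatches to a GHC.Generics traversal and carries an INLINE pragma.

    {-# LANGUAGE DefaultSignatures, FlexibleContexts, TypeOperators #-}
    module InlineSketch where

    import GHC.Generics

    -- The user-facing class: the default method goes through the generic
    -- representation, and the INLINE pragma forces that traversal to be
    -- unfolded at every use site.
    class ToValue a where
      toValue :: a -> String
      default toValue :: (Generic a, GToValue (Rep a)) => a -> String
      toValue = gToValue . from
      {-# INLINE toValue #-}

    -- The generic worker class, with one instance per representation type.
    class GToValue f where
      gToValue :: f p -> String

    instance GToValue U1 where
      gToValue U1 = ""

    instance GToValue f => GToValue (M1 i c f) where
      gToValue (M1 x) = gToValue x

    instance (GToValue f, GToValue g) => GToValue (f :*: g) where
      gToValue (l :*: r) = gToValue l ++ " " ++ gToValue r

Removing pragmas like that INLINE is the workaround being speculated about: the default still works, but each use site no longer has the entire generic traversal unfolded into it.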
The aeson dep in this cabal file already limits to <0.10. Maybe try forcing back to 0.8? Could also just try splitting up that module into one-per-datatype and see if that helps.
Workarounds, of course, but might be easier than writing instances by hand.
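For instance, forcing the 0.8 series would just be a tighter bound in hermit-shell.cabal, along these lines (the exact bound here is illustrative; it would have to agree with the rest of the dependency tree):

    build-depends:
      aeson >= 0.8 && < 0.9
      -- (remaining dependencies elided)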
I can confirm that I am using aeson-0.10 at this point. More specifically, it's the master branch from github because I was tired of running into this warning: https://github.com/bos/aeson/issues/290
Shame on me for changing the cabal file, but it seemed like an easier solution than manually inlining the default implementation everywhere.
That being said, I'm pretty sure I was having the same problems with aeson-0.9.0.1, but I'd be happy to retest using that and earlier versions.
Maximum memory residency during compilation with GHC 7.10.2 and aeson-0.8.1.1 was ~1.4GB on my machine, which is only marginally less than with aeson-0.10.
Once Drew wraps up porting HERMIT to GHC 8, I'll repeat the tests with it.
My hope is that memory usage involving GHC generics decreases with GHC 8.0, since it was redesigned to encode the generic representation with -XDataKinds rather than by generating a jillion empty datatypes and proxy instances.
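As a rough illustration of the two encodings (a sketch from memory, so take the details loosely):

    {-# LANGUAGE DeriveGeneric #-}
    module RepSketch where

    import GHC.Generics

    data Foo = Foo Int deriving Generic

    -- On GHC 8.0+, `:kind! Rep Foo` in GHCi shows the metadata encoded with
    -- promoted (DataKinds) types, along the lines of:
    --
    --   D1 ('MetaData "Foo" "RepSketch" "main" 'False)
    --      (C1 ('MetaCons "Foo" 'PrefixI 'False) ...)
    --
    -- On GHC 7.10 and earlier, the same metadata instead lives in generated
    -- empty datatypes (one per datatype, constructor, and selector), each
    -- paired with a Datatype/Constructor/Selector instance.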
@andygill I should be able to look into it this weekend. I don't know too much about the details of this aspect of GHC compilation, but I'll see what I can do.
We just need a workaround.
I've been more rigorously testing this all day and can confirm it's an issue that is amplified by aeson-0.10.
The maximum memory residency for any combination of GHC-7.10.x and aeson-0.8 or 0.9 is under 2GB which is obviously not ideal, but it's good enough for me to continue with my work.
I'll apologize for screaming fire here since, as @xich pointed out, the hermit-shell.cabal file already disallows aeson-0.10.
> The maximum memory residency for any combination of GHC-7.10.x and aeson-0.8 or 0.9 is under 2GB which is obviously not ideal, but it's good enough for me to continue with my work.
That's somewhat encouraging to hear, since aeson-0.8 and -0.9 both have substantially simpler Data.Aeson.Types.Generic modules than -0.10. Perhaps that's worth investigating some more.
@roboguy13, can you try this: use stock aeson-0.10 as a control, but also prepare a version that removes all of the INLINE pragmas from Data.Aeson.Types.Generic, and try compiling pandoc-types with both. Someone reported using 7GB of memory when compiling pandoc-types with aeson-0.10, so if removing the INLINE pragmas makes it better, it's probably worth opening a pull request for.
@RyanGlScott Do you happen to know if the GHC developers are aware of these things yet? This might make a good test case for them (although it might be an unavoidable issue where the solution is just fewer INLINEs).
I believe what we're running into is an instance of this bug: https://ghc.haskell.org/trac/ghc/ticket/9630
@roboguy13, I believe that's precisely the issue. Aside from the aforementioned inlining issues, a comment on Trac #9630 has another nifty hack that might help speed things up: make each generic class have exactly one method. Apparently, if you have multiple class methods like this (taken from Data.Aeson.Types.Class):
class GToJSON f where
  gToJSON :: Options -> f a -> Value
  gToEncoding :: Options -> f a -> Encoding
it hurts the optimizer a lot (presumably because a multi-method class dictionary is a real record the optimizer has to take apart, whereas a single-method class is represented as a plain newtype it can see through). Try taking all of the generic classes in Data.Aeson.Types.Class and Data.Aeson.Types.Generic and splitting them up, e.g.,
class GToJSON f where
  gToJSON :: Options -> f a -> Value

class GToEncoding f where
  gToEncoding :: Options -> f a -> Encoding

to see if that improves things.
I see that Stackage is returning to aeson-0.9. Is this for the same reason as the HERMIT issue (slow build), because of the Null problem, or both?
Huh, that's a surprisingly difficult question to answer. Neither this nor this elucidates much about the particular reasons they chose to hold it back (other than mutterings of "issues" and "regressions"). It's probably safe to assume the Null issue and compilation performance are paramount among those regressions, though, so I still think it's worth pursuing this.
I'm assuming both, in addition to some other bugs and regressions that aeson-0.10 introduced. There were also a large number of packages that were bumped from LTS because they either could not, or refused to, use aeson-0.10.
On a related note, have we tested the Template Haskell derivation mechanism for aeson to see if that would be an acceptable short-term work-around?: https://hackage.haskell.org/package/aeson-0.10.0.0/docs/Data-Aeson-TH.html
A quick grep shows that we're Template Haskell-free as of right now, so I wasn't sure if introducing it to HERMIT would be an acceptable cost for potentially faster compilation, even with aeson-0.9.
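For reference, switching a type over to the TH route looks roughly like this (Foo is a made-up stand-in, not an actual type from HERMIT.RemoteShell.Orphanage):

    {-# LANGUAGE TemplateHaskell #-}
    module THSketch where

    import Data.Aeson.TH (defaultOptions, deriveJSON)

    data Foo = Foo { fooName :: String, fooArity :: Int }

    -- Generates the ToJSON and FromJSON instances at compile time,
    -- bypassing the GHC.Generics representation entirely.
    $(deriveJSON defaultOptions ''Foo)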
I was looking at another package (pandoc-types) that had compilation slowdowns, so I forked it and replaced all generically derived FromJSON/ToJSON instances with Template Haskell-derived instances. The results are encouraging. On my 64-bit Linux laptop with 4 GB of RAM, I compiled both versions of pandoc-types with aeson-0.10:
pandoc-types (no changes):

$ /usr/bin/time -v cabal install pandoc-types
Resolving dependencies...
Downloading pandoc-types-1.16.0.1...
Configuring pandoc-types-1.16.0.1...
Building pandoc-types-1.16.0.1...
Installed pandoc-types-1.16.0.1
Command being timed: "cabal install pandoc-types"
User time (seconds): 251.96
System time (seconds): 5.90
Percent of CPU this job got: 41%
Elapsed (wall clock) time (h:mm:ss or m:ss): 10:20.85
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 3048536
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 54055
Minor (reclaiming a frame) page faults: 1332223
Voluntary context switches: 61491
Involuntary context switches: 15193
Swaps: 0
File system inputs: 3456696
File system outputs: 123552
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
pandoc-types (using Template Haskell instead):

$ /usr/bin/time -v cabal install .
Resolving dependencies...
Configuring pandoc-types-1.16.1...
Building pandoc-types-1.16.1...
Installed pandoc-types-1.16.1
Command being timed: "cabal install ."
User time (seconds): 59.07
System time (seconds): 1.17
Percent of CPU this job got: 93%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:04.23
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 756052
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 20
Minor (reclaiming a frame) page faults: 410406
Voluntary context switches: 2605
Involuntary context switches: 3242
Swaps: 0
File system inputs: 34984
File system outputs: 118000
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
tl;dr: After switching to TH, compilation time went from over 10 minutes to about 1 minute, and maximum memory usage went from 3 GB of RAM (thrashing my laptop mercilessly) to less than 1 GB.
Did you measure just compilation performance, or run-time performance too? Does reducing inlining affect encoding/decoding speed at all?
I just ran a benchmark from the aeson repo that tests the performance of the generic deriving mechanism and posted the results here. From what I can tell, my changes don't appreciably affect the runtime performance.
Good news: aeson-0.11 was just released, which fixes this issue! I just compiled hermit-shell, and HERMIT.RemoteShell.Orphanage compiled in a matter of seconds without eating up too much memory. I've adjusted hermit-shell.cabal to allow aeson-0.11 (but to disallow aeson-0.10, so that Travis won't try to pick it while we wait for hermit-shell's dependencies to upgrade to aeson-0.11 as well).
Great! I was just about to push remote-json to hackage. I'll make sure it all works with aeson-0.11, and push it. Should make building hermit a bit easier. Do you know the status of KURE?
HERMIT still needs to be overhauled to use kure-3 (and I say overhauled because it will involve a significant amount of refactoring). Sadly, I don't think I'll have much time in the foreseeable future to work on it.
Wow, it sounds like aeson-0.11 is a major improvement over even aeson-0.9! Awesome.
There is a massive increase in the amount of memory required to compile hermit-shell with GHC 7.10.3, to the point that it is crashing the VM I'm using. The memory usage skyrockets when compiling the HERMIT.RemoteShell.Orphanage module, which leads me to believe that the problem may be related to this GHC/binary bug: https://ghc.haskell.org/trac/ghc/ticket/9630

Does anyone have time to confirm this and possibly replace some of the generic instances with manually written ones as a short-term fix? I'd volunteer to do it myself, but my hands are tied for the time being.
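For what it's worth, replacing a generic instance with a hand-written one would look roughly like this (Orphan is illustrative, not an actual type from HERMIT.RemoteShell.Orphanage):

    {-# LANGUAGE OverloadedStrings #-}
    module HandWrittenSketch where

    import Data.Aeson

    data Orphan = Orphan { orphanName :: String, orphanArity :: Int }

    -- Hand-written instances skip the Generic-based defaults (and their
    -- compile-time cost) at the price of some boilerplate per type.
    instance ToJSON Orphan where
      toJSON (Orphan n a) = object ["name" .= n, "arity" .= a]

    instance FromJSON Orphan where
      parseJSON = withObject "Orphan" $ \o ->
        Orphan <$> o .: "name"
               <*> o .: "arity"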