DanielG / ghc-mod

Happy Haskell Hacking for editors. DEPRECATED

High memory consumption being compiled with GHC 8 #834

Open vasily-kirichenko opened 7 years ago

vasily-kirichenko commented 7 years ago

Editing any file in this repo https://github.com/vasily-kirichenko/haskell-book (for example, this one https://github.com/vasily-kirichenko/haskell-book/blob/master/src/SemigroupsAndMonoids.hs) in Atom (with the haskell-ide / repl / autocomplete / hasktags / pointful / pointfree plugins) causes the ghc-mod process to consume memory very quickly. It reaches ~1.4GiB after a few edits and ~10GiB after a few minutes of editing (after which I kill ghc-mod and it is restarted).

I switched to Stackage 7.0 yesterday and ran stack install ghc-mod / stylish-haskell / hlint etc. I believe the leak started to appear after this upgrade; there were no memory problems on Stackage 6.17, just hours of happy work.

Ubuntu 16.04.1 x64, Atom 1.10.0

$ stack repl
...
GHCi, version 8.0.1: http://www.haskell.org/ghc/  :? for help
...
$ ghc-mod --version
ghc-mod version 5.6.0.0 compiled by GHC 8.0.1

vasily-kirichenko commented 7 years ago

I've confirmed that the issue is caused by Stackage 7.0. I switched back to 6.17 and ran stack install ghc-mod, after which ghc-mod reached only ~300MiB of memory after 10 minutes of editing.

DanielG commented 7 years ago

The file you linked to doesn't seem to exist. Do you mean src/SemigroupsAndMonoids.hs?

vasily-kirichenko commented 7 years ago

Yes. Fixed, sorry.

DanielG commented 7 years ago

@lierdakil can you try to reproduce this? I don't see any abnormal memory usage with just ghc-mod legacy-interactive.

lierdakil commented 7 years ago

So... I was able to make memory consumption grow consistently by repeatedly requesting the type at literally every character, with ghc-mod-5.6 and both ghc-7.10.3 and ghc-8.0.1. It's nowhere near the reported levels though. Still, memory consumption does grow consistently, so I would argue there's a space leak somewhere, and it doesn't seem like GC is kicking in at any point. I also noticed that ghc-mod is considerably slower with ghc-8.

Besides, with ghc-8 ghc-mod consumes roughly 5-10 times more memory, but I think that ghc-8.0.1 is known for its memory consumption, so it's not necessarily a problem with ghc-mod itself.

DanielG commented 7 years ago

I think I finally found what is leaking space all over the place, see below.

Basically, I added a forever (threadDelay 1000000) to the end of main in src/GHCMod.hs. This allows everything except this space leak to be garbage collected. Then, using a retainer profile (+RTS -hr), I can see that FastString.<CAF> is holding on to all the remaining memory after we enter that loop. The only thing in that module that looks like it could grow to the sizes I'm seeing is string_table, so that's probably the culprit.
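The parking trick described above can be sketched in isolation. This is only a minimal illustration, not the actual change made to src/GHCMod.hs: realMain and parkForever are hypothetical names, and the explicit performGC call is an extra assumption that the original description does not mention.

```haskell
module Main (main) where

import Control.Concurrent (threadDelay)
import Control.Monad (forever)
import System.Mem (performGC)

-- Hypothetical stand-in for the program's real work
-- (in ghc-mod this would be the existing body of main).
realMain :: IO ()
realMain = putStrLn "real work finished"

-- Sleep in one-second steps and never return. While the process is parked,
-- the heap profiler keeps sampling, so anything that still shows up in the
-- profile is being retained by something long-lived.
parkForever :: IO a
parkForever = forever (threadDelay 1000000)

main :: IO ()
main = do
  realMain
  -- Assumption: force a major collection first so short-lived data is
  -- already gone before parking. The original post does not do this.
  performGC
  parkForever
```

Assuming a profiling build (for example ghc -prof -fprof-auto -rtsopts Main.hs), running the binary with +RTS -hr writes a retainer profile to a .hp file, which hp2ps can render. The process has to be stopped by hand once enough samples have been collected.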

First I thought that getOrSetLibHSghcFastStringTable looked suspicious, but it seems like it doesn't impact GC at all. I've tried re-initializing it with initGlobalStore and that doesn't change anything. I've also tried getting rid of that memory using revertCAFs, but that doesn't work; I'm not sure why. The next step would be to try adding a function to GHC to empty out that cache. If that doesn't work either, I probably misidentified the culprit :/

{-
Internally, the compiler will maintain a fast string symbol table, providing
sharing and fast comparison. Creation of new @FastString@s then covertly does a
lookup, re-using the @FastString@ if there was a hit.

The design of the FastString hash table allows for lockless concurrent reads
and updates to multiple buckets with low synchronization overhead.

See Note [Updating the FastString table] on how it's updated.
-}
data FastStringTable =
 FastStringTable
    {-# UNPACK #-} !(IORef Int)  -- the unique ID counter shared with all buckets
    (MutableArray# RealWorld (IORef [FastString])) -- the array of mutable buckets

string_table :: FastStringTable
{-# NOINLINE string_table #-}
string_table = unsafePerformIO $ do
  uid <- newIORef 603979776 -- ord '$' * 0x01000000
  tab <- IO $ \s1# -> case newArray# hASH_TBL_SIZE_UNBOXED (panic "string_table") s1# of
                          (# s2#, arr# #) ->
                              (# s2#, FastStringTable uid arr# #)
  forM_ [0.. hASH_TBL_SIZE-1] $ \i -> do
     bucket <- newIORef []
     updTbl tab i bucket

  -- use the support wired into the RTS to share this CAF among all images of
  -- libHSghc
#if STAGE < 2
  return tab
#else
  sharedCAF tab getOrSetLibHSghcFastStringTable

-- from the RTS; thus we cannot use this mechanism when STAGE<2; the previous
-- RTS might not have this symbol
foreign import ccall unsafe "getOrSetLibHSghcFastStringTable"
  getOrSetLibHSghcFastStringTable :: Ptr a -> IO (Ptr a)
#endif
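To make the suspected mechanism concrete, here is a toy model of a global intern table held in a CAF. It is a deliberately simplified sketch, not GHC's implementation: Interned, internTable, intern, tableSize and clearTable are all hypothetical names, a plain list replaces the bucket array, and clearTable only illustrates the "empty out that cache" idea without claiming that the FastString module offers such a function.

```haskell
module Main (main) where

import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Toy stand-in for an interned string: a unique id plus the text.
data Interned = Interned { internedId :: !Int, internedText :: String }
  deriving (Eq, Show)

-- Global table created once as a CAF, in the same style as string_table.
-- NOINLINE keeps GHC from duplicating the unsafePerformIO call.
internTable :: IORef [Interned]
{-# NOINLINE internTable #-}
internTable = unsafePerformIO (newIORef [])

-- Return the existing entry for a string, or add a new one.
-- Entries are only ever added here, so the table grows monotonically.
intern :: String -> IO Interned
intern s = atomicModifyIORef' internTable $ \tbl ->
  case filter ((== s) . internedText) tbl of
    (hit : _) -> (tbl, hit)
    []        -> let new = Interned (length tbl) s in (new : tbl, new)

-- Number of entries currently held by the table.
tableSize :: IO Int
tableSize = length <$> readIORef internTable

-- Hypothetical "empty the cache" operation (an assumption of this sketch).
clearTable :: IO ()
clearTable = writeIORef internTable []

main :: IO ()
main = do
  mapM_ (intern . ("name" ++) . show) [1 :: Int .. 1000]
  _ <- intern "name1"   -- a hit: no new entry is added
  tableSize >>= print   -- prints 1000
  clearTable
  tableSize >>= print   -- prints 0
```

Because the table is reachable from a top-level CAF, the garbage collector can never free its entries, however long the process runs. In this toy, clearing the table also restarts the id counter, so values handed out earlier could collide with new ones; a real fix would have to deal with that.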
DanielG commented 7 years ago

Filed a GHC bug https://ghc.haskell.org/trac/ghc/ticket/13110.

ryukinix commented 7 years ago

Is there any known workaround for this?

maelvls commented 6 years ago

The open ticket is stale; any news?

DanielG commented 6 years ago

Nope. Trying to reproduce the problem with GHC 8.2 would be helpful though, and it would give the GHC guys a nudge too ;)

DanielG commented 6 years ago

If you are affected by this bug, feel free to say so in the GHC ticket. It is hard to know how many users are actually affected by a given issue, so that helps.

maelvls commented 6 years ago

Thank you for the advice!! I should have thought of that!