Closed jneira closed 5 years ago
Hi, I've collected some snapshots made with jvisualvm: dhall.eta.tasty.profiles.zip

- dhall.eta.tasty.nps: with output in the console
- dhall.eta.tasty.pipe.to.file.nps: with the output redirected to a file
- dhall.eta.tasty.heap.nps: memory profiling

Taking a quick look: eta.runtime.thunk.SelectorPUpd
So passing --hide-successes to the tasty execution makes the test suite run way faster: https://circleci.com/gh/eta-lang/dhall-eta/75
Thanks for the observation. That surely means there's a memory leak to investigate here since holding on to less info made it run faster.
Yeah, although memory usage and GC overhead are similar between writing to the console and redirecting to a file. Not sure if there is a memory leak, because once the process takes the maximum memory possible the usage is pretty stable.
I have a suspicion that this has to do with native memory allocation and not the heap memory. Can you check that as well? I think the MemoryManager isn't freeing as often or as well as it should leading to native heap growing endlessly.
In fact, not all executions to the console are equal; I've taken another snapshot and it was similar to the redirected one :thinking:
I've monitored native memory, taking some samples as suggested in https://stackoverflow.com/a/30941584/49554.
Here is another file with native memory samples including timestamps: native.txt
Hmm well I was wrong about that - it looks like the native memory usage increases very gradually and in amounts < 1MB. Btw you can view native memory in VisualVM by enabling the "VisualVM-BufferMonitor" plugin.
Wow, thanks for the tip
Another thing that might be interesting is if the JVM is taking the memory just because it can, or if it really needs it. You could test that by pressing the "Perform GC" button when it reaches the peak and check how much it drops.
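For what it's worth, the same kind of check can also be triggered from inside the program: `base` exposes `System.Mem.performGC`, which explicitly requests a collection much like the "Perform GC" button (whether Eta maps this to a JVM GC request is an assumption here). A minimal sketch:

```haskell
import System.Mem (performGC)

main :: IO ()
main = do
  -- allocate and consume some memory
  print (sum [1 .. 1000000 :: Int])
  -- explicitly request a collection, analogous to pressing
  -- "Perform GC" in VisualVM; afterwards one can observe in the
  -- monitoring tool whether used heap actually drops
  performGC
  putStrLn "GC requested"
```

If used heap stays near the peak after the explicit collection, the memory really is live; if it drops sharply, the JVM was just holding on to it because it could.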
Another helpful tool is Eclipse MAT. MAT operates on JVM memory dumps and you can do all sorts of analyses, e.g. find out which object types consume how much memory, find out by which instances another instance is referenced, etc.
@jneira If the class with the largest number of instances you see is eta.runtime.thunk.SelectorPUpd, then this could mean it's an issue of the Eta runtime's lack of selector thunk optimization.
We probably need to implement this: https://github.com/typelead/eta/issues/517
A simple way to implement it is to spawn a thread when the runtime system initializes and just have it traverse the weak references to the selector thunks periodically to see if they can be reduced.
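That idea can be modelled in a few lines of Haskell using `System.Mem.Weak` (the names `pruneDead` and `scanner` and the 100 ms interval are made up for illustration; the real implementation would live in the Java runtime and would additionally try to reduce each live selector thunk it visits):

```haskell
import Control.Concurrent (ThreadId, forkIO, threadDelay)
import Control.Concurrent.MVar (MVar, modifyMVar_)
import Control.Monad (filterM, forever)
import Data.Maybe (isJust)
import System.Mem.Weak (Weak, deRefWeak)

-- Drop registry entries whose referents have already been collected.
pruneDead :: [Weak a] -> IO [Weak a]
pruneDead = filterM (fmap isJust . deRefWeak)

-- Background thread that periodically walks a registry of weak
-- references; a real version would also inspect each live selector
-- thunk and reduce it in place once its argument is evaluated.
scanner :: MVar [Weak a] -> IO ThreadId
scanner registry = forkIO . forever $ do
  threadDelay 100000  -- wake up every 100 ms
  modifyMVar_ registry pruneDead
```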
More details on how this leak occurs here: https://homepages.inf.ed.ac.uk/wadler/papers/leak/leak.ps.gz
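The essence of the leak described in the paper can be shown in a few lines of Haskell (an illustration, not Eta-specific code): a lazy pattern binding on a pair compiles to one selector thunk per component, and an unevaluated selector thunk retains the whole pair, and hence the whole input, until it is forced or reduced by a selector optimization.

```haskell
import Data.List (partition)

-- The pattern binding below compiles to two selector thunks
-- (fst-of-pair and snd-of-pair). If only `evens` were forced,
-- the unevaluated thunk for `odds` would still point at the
-- pair, keeping the entire input list live.
stats :: [Int] -> (Int, Int)
stats xs = (sum evens, length odds)
  where (evens, odds) = partition even xs

main :: IO ()
main = print (stats [1 .. 100000])
```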
@nightscape Thanks for the tip! I'm afraid that doing a GC does not free any significant memory, so the 1500 MB max seems to be needed.
Some progress updates:
I've implemented a basic form of selector thunk optimization via StgContext-local weak references. The solution doesn't involve multiple threads and automatically bounds the number of weak references created to avoid causing extra GC overhead. It appears to show better memory characteristics than before, but it can still be better. The next step is to short out thunk indirections to let go of even more memory.
I've been using this code to test the optimization (inspired by the Wadler paper):
```haskell
import System.Directory (removeFile)

-- Copy a file, inserting a 'b' before the first 'b'
-- (or at the end, if the file contains none).
insertb :: String -> String
insertb xs = before ++ "b" ++ after
  where (before, after) = break (== 'b') xs

main :: IO ()
main = do
  let file  = "hello"
      file2 = "hello2"
  contents <- readFile file
  writeFile file2 (insertb contents)
  removeFile file2
```

where hello is a file with a large number of characters other than 'b'.
I'm also going to implement general thunk clearing that is thread-safe so that I can re-enable it by default. Without thunk clearing, severe space leaks can happen, so it is absolutely essential that it be done. It can be enabled even now with -Deta.rts.clearThunks=true, and in fact I had to do so to even verify that the selector thunk optimization was working.
It will probably take a couple more days to implement what I mentioned above.
@jneira I've implemented selector thunk optimization and also re-enabled thunk clearing because it is now thread-safe (verified by running eta-benchmarks, which failed with spurious NPEs before because of thunk clearing and now runs smoothly). Wait until the Docker image for the current master is ready, then go ahead and re-run the CircleCI build for dhall-eta and see how it fares.
On the bright side, the local execution had a simply amazing improvement in both memory and time. Fantastic work @rahulmutt:
So the main goal has been achieved!
But I'm afraid the build in CircleCI hangs anyway, so maybe it is caused by another reason. In my Windows test the openBinaryFile: resource busy (file is locked) error persists.
I'm going to close this one because the memory allocation issue is resolved.
When running tests or benchmarks, Eta executables take excessive heap memory.
Description
Detected when running the dhall-eta test suite in Windows and CircleCI (see https://github.com/typelead/eta/issues/915#issuecomment-448935276).
Expected Behavior
The process should use less memory (not sure about how much).
Actual Behavior
The process takes up to 2.5 GB.
Possible Fix
See @rahulmutt's comment: https://github.com/typelead/eta/issues/915#issuecomment-448969337
Steps to Reproduce
Run the dhall-eta test suite locally or in CircleCI.
Context
Setting up the test suite for dhall-eta.
Your Environment