StationQ / Liquid

The Language-Integrated Quantum Operations (LIQUi|>) Simulator
http://StationQ.github.io/Liquid

Memory Issues #57

Closed by rumschuettel 6 years ago

rumschuettel commented 6 years ago

Hi there,

I'm having trouble running a simulation that uses more than 8 GB of RAM. I often get the error "Error: Garbage collector could not allocate 16384 bytes of memory for major heap section." on stderr. The output on stdout at that time was:

0:0288.6/Time:   120 [0 to 240] MB=    6945 cache(55688,16) GC:580
0:0289.2/Time:   121 [0 to 240] MB=    6363 cache(55714,16) GC:1448
8:0293.2/... compiling MB=    7604 cache(55715,16) GC:3308
9:0294.0/... compiling MB=    7835 cache(55741,16) GC:1012
0:0294.3/Time:   122 [0 to 240] MB=    7747 cache(55766,16) GC:441

This happens on a server with 512 GB of memory; at the time of the error, roughly 100 GB were still free.
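To see how the reported memory figure moves before the failure, the status lines can be parsed with a few lines of Python. This is a minimal sketch: it assumes the "MB=" field is the simulator's memory usage in megabytes and "GC:" is a garbage-collection counter, neither of which is confirmed in this thread, and the names `STATUS_RE`, `parse_status` and `peak_mb` are made up for illustration.

```python
import re

# Matches status lines such as
# "0:0288.6/Time:   120 [0 to 240] MB=    6945 cache(55688,16) GC:580".
# Only the MB= and GC: fields are extracted; their meaning is an assumption.
STATUS_RE = re.compile(r"MB=\s*(\d+).*?GC:\s*(\d+)")


def parse_status(lines):
    """Return a list of (mb, gc) integer pairs, skipping lines that do not match."""
    pairs = []
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), int(match.group(2))))
    return pairs


def peak_mb(lines):
    """Return the largest MB= value seen, or None if no line matched."""
    pairs = parse_status(lines)
    return max(mb for mb, _ in pairs) if pairs else None


# The server log quoted above, copied verbatim.
SAMPLE = """\
0:0288.6/Time:   120 [0 to 240] MB=    6945 cache(55688,16) GC:580
0:0289.2/Time:   121 [0 to 240] MB=    6363 cache(55714,16) GC:1448
8:0293.2/... compiling MB=    7604 cache(55715,16) GC:3308
9:0294.0/... compiling MB=    7835 cache(55741,16) GC:1012
0:0294.3/Time:   122 [0 to 240] MB=    7747 cache(55766,16) GC:441
""".splitlines()

print(peak_mb(SAMPLE))
```

On the server log, the reported figure never exceeds 8 GB, far below the 512 GB installed, which is consistent with the failure not being caused by exhausted physical memory.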

On a "normal" PC with 16 GB, the same simulation tends to fail with an error for which I attach a stack trace; on stdout I had:

1:0045.7/... compiling MB=   11054 cache(2603,16) GC:2374
3:0046.2/... compiling MB=   11906 cache(2629,16) GC:1209
0:0046.4/Time:    94 [0 to 240] MB=   12131 cache(2654,16) GC:625
0:0046.5/Time:    95 [0 to 240] MB=   12239 cache(2654,16) GC:437

The system is Ubuntu 14.04 (kernel 4.4.0-101-generic), with the latest Liquid.dll version (git revision db34962). Because these machines lack a native mono runtime, I used "mkbundle" to compile the program into a standalone executable, but I'm assuming this is not the problem. Thanks for your help with this!

EDIT: error log attached. stderr.txt

EDIT2: if it helps, this is the mono runtime:

Mono JIT compiler version 5.4.1.6 (tarball Wed Nov  8 20:35:02 UTC 2017)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
    TLS:           __thread
    SIGSEGV:       altstack
    Notifications: epoll
    Architecture:  amd64
    Disabled:      none
    Misc:          softdebug 
    LLVM:          supported, not enabled.
    GC:            sgen (concurrent by default)

dbwz8 commented 6 years ago

My guess (that's all I've got) is that your threads are being allocated with a really small stack (and possibly heap limitations as well). This has nothing to do with physical memory AFAICT. A quick search for the error message on the Internet turned up lots of people running into this (for other applications). For example: Stack Overflow thread, mono GC docs. It looks like there was a bug fix for this last year. Here's the bug detail.

Since you're the only one who's reported it, I'm going to have to go with it being unique to your environment. There isn't really much I can do on this end. Best of luck solving it. If you find the right GC flags (if indeed that's it), please post back here so others can utilize your findings. I'll leave this open for a while in case you get more info or a fix.
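One way to check whether per-process limits, rather than physical memory, are in play is to print the limits the process actually runs under. The sketch below uses Python's standard `resource` module (Unix only). The choice of limits to inspect, and the extra look at `/proc/sys/vm/max_map_count` (the Linux cap on memory mappings per process), are assumptions about what might matter here, not a confirmed diagnosis.

```python
import resource

# Limits that could plausibly constrain a large heap or thread stacks.
# Which of these (if any) is responsible here is a guess.
LIMITS = {
    "address space (RLIMIT_AS)": resource.RLIMIT_AS,
    "data segment (RLIMIT_DATA)": resource.RLIMIT_DATA,
    "stack (RLIMIT_STACK)": resource.RLIMIT_STACK,
}


def fmt(value):
    """Render a limit in bytes, or 'unlimited' for RLIM_INFINITY."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def read_limits():
    """Return {name: (soft, hard)} with both values rendered as strings."""
    return {
        name: tuple(fmt(v) for v in resource.getrlimit(res))
        for name, res in LIMITS.items()
    }


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Return the kernel's per-process mapping cap, or None if unavailable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


for name, (soft, hard) in read_limits().items():
    print(f"{name}: soft={soft} hard={hard}")
print("max_map_count:", read_max_map_count())
```

Running this in the same shell or job environment that launches the simulation matters, since batch schedulers and login shells often apply different limits.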

rumschuettel commented 6 years ago

Thanks a lot, Dave! I suspect it's unique to my environment as well. Unfortunately, I cannot set mono flags on the target machine because mono is not installed there. Thanks for the links, they were helpful! The problem is that mkbundle does not seem to bundle the appropriate flags, so I'm hoping to get a newer mono version installed on the target machines and report back. I'll keep trying to fix it in case it helps someone.
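If command-line flags cannot be passed, one thing that might be worth trying is the `MONO_GC_PARAMS` environment variable, which the Mono runtime documents as a way to configure the SGen collector. The sketch below only builds the environment and launches a command with it. Whether an executable produced by mkbundle honors the variable is an assumption, the executable path is hypothetical, and the option values shown are untested illustrations rather than a known fix.

```python
import os
import subprocess


def env_with_gc_params(params, base=None):
    """Return a copy of the environment with MONO_GC_PARAMS set.

    `params` is a list of SGen options, joined with commas as the
    Mono man page describes for MONO_GC_PARAMS.
    """
    env = dict(os.environ if base is None else base)
    env["MONO_GC_PARAMS"] = ",".join(params)
    return env


def run_bundle(cmd, params):
    """Run `cmd` (a list of arguments) with the given GC parameters."""
    return subprocess.run(cmd, env=env_with_gc_params(params), check=False)


# Hypothetical usage; "./liquid-bundle" stands in for the mkbundle output,
# and the option values are guesses to experiment with:
#
#   run_bundle(["./liquid-bundle"], ["major=marksweep", "nursery-size=64m"])
```

Keeping the environment construction in its own function makes it easy to try several parameter combinations from a small driver script and compare where each run fails.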