HaxeFoundation / hxcpp

Runtime files for c++ backend for haxe
Other
294 stars 187 forks source link

Garbage collector consuming too much memory and crashing after 2 GB on Windows8 64-bit #319

Closed ex closed 8 years ago

ex commented 8 years ago

I created this stress test case for issue #303: https://github.com/ex/haxe/tree/master/gc However this is a different issue so I'm reposting it here.

I checked with the latest changes and I still can see the memory consumption going up to 2 GB with 400000 objects (when reading from XML) and crashing. Without reading XMLs the objects consume only 202 MB (on Windows ).

The dump traces the crash to the GC. One issue is the excessive memory consumption but the crash is happening due to malloc failing to reclaim more memory from the system, so I don't know if something is needed to support bigger memory addresses. My project is likely to need more that 2 GB of memory to run.

void *AllocLarge(int inSize, bool inClear)
   {
      if (!result)
      {
         #ifdef SHOW_MEM_EVENTS
         GCLOG("Large alloc failed - forcing collect\n");
         #endif

         CollectFromThisThread();
         result = (unsigned int *)malloc(inSize + sizeof(int)*2); <<== this fails
      }
      result[0] = inSize; <<== CRASH because pointer is null
}

I see an improvement in the current version because there was an additional slowdown with hxcpp 3.2.180 that I'm not seeing right now.

ex commented 8 years ago

I added a Visual Studio 2015 project to help debugging (it needs the Haxe build to run first) I could not get it to compile without using HXCPP_M64 and the demo is not crashing on 2GB; however it gets slower and slower the more memory it reclaims. I'm going to see if I can find something that can help.

ex commented 8 years ago

Testing with HXCPP_GC_MOVING this demo makes the GC use less memory. It is now using near 1GB instead of 2GB and the program ends faster. Unfortunately activating this flag in my project crashes it. [EDIT] I updated to the latest hxcpp, there was a fix in the Hash class that fixed the crash.

tanis2000 commented 8 years ago

I've been experiencing this same behavior on iOS. Memory consumption keeps growing until the Springboard kills the app. I traced it down to local variables not being garbage collected. It looks like the object is being correctly instantiated through the new() function but when the function that contains it end, the local variable isn't being deallocTed. It's just like it weren't marked to be garbage collected. This happens with the current development version in the repo.

hughsando commented 8 years ago

The stack marking is conservative, which means that you can not guarantee 100% that a local variable will not get marked. But it should only be a couple of variables - not 10s or 100s or in a loop, unless of course the local variable refers an array or tree of other values. Also, if the local variable refers to other resources, like bitmap data, the GC system might need a bit of help to dispose of this memory properly. The issue with the xml files is that the pattern of allocations causes memory fragmentation. Since some of the xml data are retained (eg, the name field) objects allocates near this field that are not actually be retained(marked) will still not be reclaimed due to the internal structure. You can turn on the SHOW_FRAGMENTATION define in Immix.cpp to see how this issue affects your particular usage. GC_MOVING allows this problem to be solved - however I see you are having problems with this. It would be nice to get to the bottom of the crashes you are having.

On Tue, Nov 3, 2015 at 5:32 AM, Valerio Santinelli <notifications@github.com

wrote:

I've been experiencing this same behavior on iOS. Memory consumption keeps growing until the Springboard kills the app. I traced it down to local variables not being garbage collected. It looks like the object is being correctly instantiated through the new() function but when the function that contains it end, the local variable isn't being deallocTed. It's just like it weren't marked to be garbage collected. This happens with the current development version in the repo.

— Reply to this email directly or view it on GitHub https://github.com/HaxeFoundation/hxcpp/issues/319#issuecomment-153163675 .

tanis2000 commented 8 years ago

My problem is pretty simple to replicate. I have a function that is being called in a render loop, thus being called a lot of times per frame.

    public function drawImage( img:Texture, x:Float, y:Float )
    {
        var xw:Float = x + img.width;
        var yh:Float = y + img.height;
        var p1 = new Vector2( x, yh );
        var p2 = new Vector2( x, y );
        var p3 = new Vector2( xw, y );
        var p4 = new Vector2( xw, yh );

        if( !cullRectangle.overlapsAABBFromPoints( p1, p2, p3, p4 ) )
        {
            culledObjectCount++;
            return;
        }

        geometryRenderer.end();
        textRenderer.end();
        imageRenderer.drawImage( img, p1.x, p1.y, p2.x, p2.y, p3.x, p3.y, p4.x, p4.y, color );
    }

p1, p2, p3, p4 are the variables that aren't being deallocated when the function exits

Vector2 is defined like this:

class Vector2
{
    public var x:Float;
    public var y:Float;
    public var length (get,set):Float;

    public inline function new( x:Float = 0, y:Float = 0 ):Void
    {
        this.x = x;
        this.y = y;
    }
}
thomasuster commented 8 years ago

@tanis2000 If you create a sample project in github that has instructions on how to build I can take a look. I likely won't be able to fix it, but I might be able to help narrow it down for Hugh when he's able to take a look.

tanis2000 commented 8 years ago

There you go: https://github.com/tanis2000/bug-hxcpp-gc

Make sure that you have Lime installed. I've been using the latest hxcpp that you suggested me @thomasuster

Run it on a real device. More info in the README.

Cheers!

hughsando commented 8 years ago

I think this should be fixed by https://github.com/HaxeFoundation/hxcpp/commit/5ea050c044a38d27589fac76cb5d57beff9333c5

Which is in the git version, but post 3.2.188. I have been out-of-action for the last couple of weeks, but I will kick the build-server and make sure it gets a new version on nmehost.

tanis2000 commented 8 years ago

I've just got the version in the git (master branch) and deleted my iOS project and rebuilt it from scratch.

The result is that it works fine. You should definitely release a new version. Thanks for the support!

madrazo commented 8 years ago

Hi Hugh.

I have been testing the GC_MOVING flag (Windows 10, VS2015) and https://github.com/HaxeFoundation/hxcpp/commit/154113b0ec793d112c296391bed9b9e6cc4edc21 solves crashing when not targeting 64-bit (no HXCPP_M64 flag). But, when I try to use HXCPP_M64 it crashes.

The last line from stack I get: Immix.cpp line 1425 HX_MARK_STRING(str);

str = 0x3ff663a95d359648 Error reading characters of string. or str = 0x3ff0000000000000 Error reading characters of string.

Thanks

Thanks

hughsando commented 8 years ago

I am still trying to find all the kinks in the moving GC. Do you have a reference to a test app that shows the problem?

madrazo commented 8 years ago

Oh I see the problem now. I have a class variable Array and then local variables "Int" with the same name. In one function is declared and in other is received as function parameter. Changing variable names solves the crash. As I mentioned, the issue only occurs with HXCPP_M64 flag.

Now I have a crash in different place. Immix.cpp line 1238 obj->__Mark(this);

this = 0x0000008edbedfcb8 {mPos=0 mInfo=0x0000008edb18dbb0 {mClass=0x0000000000000000 < NULL > mMember=0x0000000000000000 < NULL > } ...}

This one is trickier as I don't see a relation with the Haxe class on the stack but I will try to insolate it. Thank you very much

madrazo commented 8 years ago

Reserving GC memory beforehand seems to solve it

hughsando commented 8 years ago

The GC should be able to handle these cases, so if you have small test case, I would like to have a look at it.

On Thu, Nov 5, 2015 at 11:06 PM, Carlos Madrazo notifications@github.com wrote:

Reserving GC memory beforehand seems to solve it

— Reply to this email directly or view it on GitHub https://github.com/HaxeFoundation/hxcpp/issues/319#issuecomment-154086763 .

madrazo commented 8 years ago

Calling cpp.vm.Gc.compact() makes it crash but it not may be needed anyway. On 32-bit also makes it crash sometimes.

Yes, I will try to prepare small test case. Thank you very much

Tiago-Ling commented 8 years ago

Any news on this? I tried to test my application with HXCPP_GC_MOVING but it crashes, even on 32 bits (i suppose the default windows build to be 32 bits). This crash does not return a stack trace, because trying to build using neko build.n windows -debug results in this error:

C:/GitRepos/hxcpp/project/libs/regexp/../../thirdparty/pcre-7.8/pcre_study.c : fatal error C1041: cannot open program database 'c:\gitrepos\hxcpp\project\libs\regexp\obj\msvc1864-stat-debug\vc.pdb'; if multiple CL.EXE write to the same .PDB file, please use /FS

#### Error building neko -Dwindows -DHXCPP_M64 -Dstatic_link -Ddebug

By the looks of the error it appears to be trying to build for 64 bits but even if i use neko build.n windows-m32 -debug the result is the same, only changing the flag to -DHXCPP_M32 instead.

I'm on Windows 10 x64 using Haxe 3.2.0, Hxcpp commit 437b350 OpenFL 3.3.8 and Lime 2.6.8.

Thanks!

EDIT: I solved the crash, it was a dynamic variable access in one of the libraries i'm using.

hughsando commented 8 years ago

The compile error should be fixed in the github version of hxcpp. You will need to rebuild the tool from tools/hxcpp to see the effect of the change.

On Wed, Nov 11, 2015 at 2:52 AM, Tiago Ling Alexandre < notifications@github.com> wrote:

Any news on this? I tried to test my application with HXCPP_GC_MOVING but it crashes, even on 32 bits (i suppose the default windows build to be 32 bits). This crash does not return a stack trace, because trying to build using neko build.n windows -debug results in this error:

C:/GitRepos/hxcpp/project/libs/regexp/../../thirdparty/pcre-7.8/pcre_study.c : fatal error C1041: cannot open program database 'c:\gitrepos\hxcpp\project\libs\regexp\obj\msvc1864-stat-debug\vc.pdb'; if multiple CL.EXE write to the same .PDB file, please use /FS

Error building neko -Dwindows -DHXCPP_M64 -Dstatic_link -Ddebug

By the looks of the error it appears to be trying to build for 64 bits but even if i use neko build.n windows-m32 -debug the result is the same, only changing the flag to -DHXCPP_M32 instead.

I'm on Windows 10 x64 using Haxe 3.2.0, Hxcpp commit 437b350 https://github.com/HaxeFoundation/hxcpp/commit/437b35035de4f2578eadc289700f91577a4c131c OpenFL 3.3.8 and Lime 2.6.8.

Thanks!

— Reply to this email directly or view it on GitHub https://github.com/HaxeFoundation/hxcpp/issues/319#issuecomment-155529470 .

ex commented 8 years ago

Using the latest version and HXCPP_GC_MOVING seemed to close this bug in our ongoing project, but today we hit a crash in the garbage collector testing GC.compact() and GC.run(). We are aware of the telemetry crash using hxScout #333 but we were not using telemetry this time. It seems that there are still issues with the GC code. If it helps, most of these crashes are related with with the Hash class:

    [Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll] 
    lime-legacy.ndll!0fb7f4ab() Unknown
>   MainApplication.exe!hx::InternalRealloc(void * inData, int inSize) Line 3955    C++
    MainApplication.exe!hx::Hash<hx::TIntElement<Dynamic> >::rebucket(int inNewCount) Line 255  C++
    MainApplication.exe!hx::InternalNew(int inSize, bool inIsObject) Line 3916  C++
    MainApplication.exe!hx::Hash<hx::TIntElement<String> >::allocElement() Line 338 C++
    MainApplication.exe!hx::Hash<hx::TIntElement<String> >::TSet<String>(int inKey, const String & inValue) Line 429    C++

Unfortunately I can not reproduce this crash on my simple test case.

hughsando commented 8 years ago

This issue has drifted from its original purpose, so I'm closing it. Please open a separate issue if you are seeing crashes with GC_MOVING