r-lyeh-archived / ltalloc

LightweighT Almost Lock-Less Oriented for C++ programs memory allocator
BSD 3-Clause "New" or "Revised" License
164 stars 17 forks source link

infinite loop in ltsqueeze #24

Closed gjaegy closed 5 years ago

gjaegy commented 6 years ago

Hi, not sure whether I am doing something wrong somewhere or not, but I am experiencing an issue when calling ltsqueeze().

Basically, the function never exits since the following for loop doesn't stop (first occurence, about line 1069 in the latest revision of the file): for (block = *pbatch; block; block = block->next)

In that case, block == block->next.

Do you have any idea what can be causing this issue ? I am not using the very latest version, but I don't think anything relevant to my issue has changed since (but I can be wrong).

Thanks !

r-lyeh commented 6 years ago

could for (block = *pbatch; block; block = block != block->next ? block->next : NULL ) help instead ?

gjaegy commented 6 years ago

Yes that would fix the problem I assume. However, my question is rather: what is going on, is it expected that block == block->next in some cases ? Isn't it a sign of something going wrong or a corruption ? Any idea ?

r-lyeh commented 6 years ago

Hmmm, likely. Maybe is it a buffer overflow in your code? Could you create a repro test case please?

gjaegy commented 6 years ago

No, I am affraid I can't easily reproduce the issue, what's quite annoying. It seems to be random, what's not suprising since the software is pretty complex and multithreaded... I think I will first put some check when "->next" is being set, to detect when it's set to the block pointer itself, maybe that would give me a clue...

gjaegy commented 5 years ago

It took some time but I managed to add some check routines in ltalloc, in order to detect when ->next is being set to the block pointer itself.

The routines successfully detects the anomaly, however, I am still trying to figure out why this occurs, and how to fix it.

The problem occurs in a container destructor (a eastl::vector using a custom allocator relying on ltalloc, which simply calls ltfree()).

Maybe you'll manage to figure out quickly what's going on ? Thanks for your help in any case !

ltalloc

r-lyeh commented 5 years ago

Hmmm... could it be a double ltfree() call? To check, I would use some kind of macro helper as following, to reveal the crash offender somewhere else:

// ltalloc2.h
#pragma once
#include <ltalloc.h>
#define ltfree(p) do { ltfree((p)); (p) = NULL; } while(0)
gjaegy commented 5 years ago

Possibly, I will check this. I don't understand how your macro works ? Pardon my ignorance :)

r-lyeh commented 5 years ago

include ltalloc2.h instead of ltalloc.h in your codebase. From this point, all pointers you ltfree() will be resetted always. This new behavior introduces two new features:

  1. Deleted pointers will crash now. Ie, if you used to do ltfree(p); p->function(); before.
  2. A second free will never happen again. Ie, you cannot do ltfree(p); ltfree(p); anymore.

Hopefully all together will recap some more info for the issue :)

gjaegy commented 5 years ago

OK understood, this simply sets the pointer to null after freed. I was trying to figure out the role of the do...while() loop...

You were right, there was a double call to ltfree() (detected through the sanity-check code I've added :

void CheckBlockNext(FreeBlock *fb)
{
    if (fb && fb->next == fb)
    {
        imAssert(imFALSE);
        imCoreLib::ConsoleDisplay(imT("ltalloc problem detected"), imLOGGERMESSAGELEVEL_ERROR);
    }
}

void CheckBlockNext(ChunkSm *fb)
{
    if (fb && fb->next == fb)
    {
        imAssert(imFALSE);
        imCoreLib::ConsoleDisplay(imT("ltalloc problem detected"), imLOGGERMESSAGELEVEL_ERROR);
    }
}

I call those function just after any ->next is being set. Could be helpful for other people as well maybe (as long as it can be disabled completely through some preprocessor definition) - of course you'll have to remove some custom types here (just for illustration).

Thanks a lot for putting me on the right track, very appreciated (as usual) 👍

r-lyeh commented 5 years ago

Np. I will think about adding memguard checks, thanks for the suggestion :D PS: That loop is explained here: https://stackoverflow.com/questions/257418/do-while-0-what-is-it-good-for

gjaegy commented 5 years ago

I see, because of the semi-colon, that trick is... interesting ;)

Yes, some optional memguard checks would definitively be a great addition to that very reliable library ! Let me know if I can help !

Cheers, Greg

gjaegy commented 5 years ago

Hi r-lyeh, I am again facing a weird issue, this time in release_thread_cache() call. Seems similar to my previous issue above (haven't investigated yet), with the following loop being stuck: while (tail->next)//search for end of list

I'll try to use the same methods and should be able to figure out the problem.

Still, this makes me think again that some kind of memguards checks and other optional validation layer could be very, very useful :)

cheers, Greg

gjaegy commented 5 years ago

Hi, me again. I haven't been able to figure out my memory corruption problem, so I am now looking to implement some kind of optional memory overflow detection. I assume the easiest way would be to add a canary at the end of the allocated blocks, and check its value in ltfree().

Any hint how I could do that ?

gjaegy commented 5 years ago

I think I've managed to have something working, I'll share the changes if that works.

r-lyeh commented 5 years ago

very cool, ty

gjaegy commented 5 years ago

Hi r-lyeh,

as promised, I have added some checks/validation in ltalloc.

The most important one being some optional buffer overrun detection using canary at the end of the allocated buffer. This feature can be enabled by defining the LTALLOC_OVERFLOW_DETECTION macro in your config.

The second one is less useful, but that's the one that had helped me to detect the double free issue I had (first post). It won't however detect all double frees, only consecutive ones, so it's a bit limited for now (but it caught my issue, so still useful). This feature can be enabled by defining the LTALLOC_NEXT_POINTER_CHECK macro.

Now, I've tried to use your coding styles but you are welcome to check and change the code if required !

I haven't clone the repository, so I am sending my changes as an updated ltalloc.cc file. Since that's a single file I assume it will be easy for you to review my changes and merge what you want (had to zip for GitHub to accept my attachment).

Let me know if I can further help !

ltalloc.zip

r-lyeh commented 5 years ago

hey @gjaegy,

your code seemed reasonable :D i've split & merged your code into two commits, one for the canary/overflow feature, and another one for source simplification. i've also added a minimal test for the canary as well.

ty!

gjaegy commented 5 years ago

Thanks a lot r-lyeh, that's great !