Reload from file mem leak

johniez commented 4 years ago

Hi,

I am currently testing code very similar to example/ReloadFromFileTrie.c. I've got lot of threads reading properties value and the main thread reloading the data occasionally. I am using Linux and threading is not disabled by defining FIFTYONEDEGREES_NO_THREADING.

When I use the valgrind memchecker, it says there is a leak on original dataset (from the very first initialization). I've tracked it to line 3040 where active dataset is already switched by the reloading thread, but last ("oldActive") user does not free the data. Problem could be, that FIFTYONEDEGREES_INTERLOCK_DEC macro target (__sync_fetch_and_add(v, -1)) returns previous value of passed variable, so for the last user of the dataSet it will be 1 == 0, thus this dataset won't be freed at all.

When I changed that condition to if (FIFTYONEDEGREES_INTERLOCK_DEC(&offsets->active->inUse) == 1 && ..., mem leak is gone. Unfortunatelly another problem arises in highly concurrent environment (lot of dataset reader threads and often data reloading). When dataset is being read while another thread reloads the dataset, while at line 2989 goes another iteration as provider active dataset has changed. So the line 2993 executes and goes to free dataset as well. And segmentation fault occurs.

So this "mem leak fix" is making it worse, further investigation is still necessary...

Another suspicious code I saw was that if reader reads active data set, and just before it will increment inUse variable, another thread making reload switched active dataset and as the reader yet did not incremented inUse value, line 1791 will pass and free the data. Reader can read some deleted data, until line 3009, where it will try to switch to active dataset again (and maybe it will make a second free() call through line 2993). Do not know yet, but could be relevant to behaviour I saw with the sigsegv...

ben51degrees commented 4 years ago

Hi @johniez,

Thanks for bringing this up. I will be looking into this today and keep you updated.

ben51degrees commented 4 years ago

Hi @johniez, I have now pushed the fix for this. You are correct that __sync_fetch_and_add returns the value before it was altered. This has now been changed to __sync_add_and_fetch which matches _InterlockedDecrement (and the intended usage).

The issue with your change was that while it works in that method, there are other checks for inUse == 0 which will be affected. Just for peace of mind, do you mind confirming that the segfault does not occur after applying this fix?

Another suspicious code I saw was that if reader reads active data set, ...

In this scenario, it does seem there is a potential problem. We have never seen an exception when running with 50 concurrent threads and as many reloads as possible, so if it is an issue it will be seen very rarely. I will look at this and either confirm it's OK, or push a fix. I think either way, it would benefit from having the return value of FIFTYONEDEGREES_INTERLOCK_INC checked.

johniez commented 4 years ago

Hi @ben51degrees, thanks for quick answer and for the fix. Unfortunately I am still encountering the segfault.

Valgrind outputs:

==16528== Invalid read of size 8
==16528==    at 0x4E3D36D: fiftyoneDegreesSetDeviceOffsetFromArrayWithTolerances (51Degrees.c:2840)
==16528==    by 0x4E3D487: fiftyoneDegreesSetDeviceOffsetWithTolerances (51Degrees.c:2877)
...
==16528==  Address 0x60b4e08 is 136 bytes inside a block of size 232 free'd
==16528==    at 0x4C2CDDB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16528==    by 0x4E3C972: fiftyoneDegreesActiveDataSetFree (51Degrees.c:1743)
==16528==    by 0x4E3D601: fiftyoneDegreesProviderFreeDeviceOffsets (51Degrees.c:3042)
==16528==    by 0x4E3D627: fiftyoneDegreesProviderCreateDeviceOffsets (51Degrees.c:2993)
...
==16528==  Block was alloc'd at
==16528==    at 0x4C2BBAF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16528==    by 0x4E3C830: fiftyoneDegreesInitProviderWithPropertyString (51Degrees.c:1673)

Where ... is just start_thread and traceback in my testing program...

johniez commented 4 years ago

When I modified the test that I am locking all the threads reading properties value while reloading the data set, it is running without an error.

johniez commented 4 years ago

I commited my testing code into https://github.com/johniez/Device-Detection/tree/segfault/lib

ben51degrees commented 4 years ago

I have now committed a change which I'm confident will solve this.

Previously, there was a very small chance that two threads could free the same data set. This relied on one thread doing quite a lot happening in a separate thread between inUse being decremented and the offsets being freed.

I forced this to happen with a strategically placed Sleep(100) to allow another thread to overtake where it usually would not.

The solution is to use two atomic operations: one to determine whether the active data set can be freed, then another to determine whether it can be freed by the calling thread.

The invalid read of size 8 message from Valgrind can still happen in the second operation, however this is not something to be concerned with. The method this is coming from is reading the memory location as part of an atomic operation where the value has been set to null (and possibly freed by the time it gets there). If it has been freed, it will still be null and the check will keep the double free safe. Even in the unlikely event that something else allocates and writes to that memory location, the value written is very unlikely to be the value we are comparing, so the check will still work as intended. For context, it is a 64 bit integer, so if we reloaded every second it would take 292,277,024,626 years for this to fail (and that's assuming that every single time we reload another thread gets in there and writes something).

ben51degrees commented 4 years ago

@johniez I forgot to tag you in the previous message. I will wait to hear from you before ticking this off my list. If you could confirm that this solves your segfaults, that would be great.

Thanks

johniez commented 4 years ago

Thanks for the effort made on this fix. I tried it and it runs smoothly under the valgrind. Unfortunately the code at the commit 7f0f21521e19e21e7b21fcea2f375fe10f22dc24 still have some troubles. Using examples/ReloadFromMemoryTrie.c with changed sleep(1) after reload to usleep(50) goes to inifinite loop when compiled with optimization off (gcc -O0). When all threads finished its work, reloading thread was in the infinite loop and never ended.

When I turned on the optimizations (gcc -O2), SIGSEGV happens:

#0  fiftyoneDegreesProviderCreateDeviceOffsets (provider=0x7f5b691941c0 <provider>) at ../src/trie/51Degrees.c:3077
#1  0x00007f5b68f8e62f in runRequests (inputFile=<optimized out>) at ReloadFromMemoryTrie.c:319
#2  0x00007f5b68b514a4 in start_thread (arg=0x7f5af9feb700) at pthread_create.c:456
#3  0x00007f5b68893d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97

johniez commented 4 years ago

More info from core dump:

(gdb) p *provider
$3 = {active = 0x7f5b6981e5a0, lock = {initValue = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 39 times>, __align = 0}}}

(gdb) p *(provider->active)
$5 = {self = 0x16, counter = {inUse = 1770135660, padding = 0x7f5b6982206c}, dataSet = 0x7f5b6981e420, provider = 0x7f5b691941c0 <provider>}

(gdb) p *(provider->active->dataSet)
$6 = {header = {version = 34, formatOffset = 22, nameOffset = 56, tag = "\360H", '\000' <repeats 13 times>, published = {year = 2020, month = 3 '\003', day = 4 '\004'}, nextUpdate = {year = 2020, 
      month = 3 '\003', day = 5 '\005'}, copyrightOffset = 6, maxStringLength = 3890}, memoryToFree = 0x7f5aed3c4010, fileName = 0x7f5b6981e510 "HashTrieV34.latest", 
  devicePropertiesCount = 1, baseDrift = 0, baseDifference = 0, devicesIntegerCount = 5, requiredPropertiesNames = 0x7f5b6981e580, prefixedUpperHttpHeaders = 0x0, components = {
    firstElement = 0x7f5b5f11be1c, count = 4, freeMemory = 0}, allProperties = {firstElement = 0x7f5b5f11be70, count = 150, freeMemory = 0}, httpHeaders = {firstElement = 0x7f5b5f11be30, count = 14, 
    freeMemory = 0}, uniqueHttpHeaders = {firstElement = 0x7f5b6981f7a0, count = 6, freeMemory = 1}, requiredProperties = {firstElement = 0x7f5b6981e560, count = 2, freeMemory = 0}, strings = {
    firstByte = 0x7f5b5c38403c "\004", count = 0}, profiles = {firstByte = 0x7f5b5f11ca2c "\377w", count = 3436522}, devices = {firstByte = 0x7f5b5fe389d8 "\f\326\004", count = 6902645}, nodes = {
    firstByte = 0x7f5b6188d7b0 "A\001", count = 0}}

At least provider->active->counter.inUse = 1770135660 seems to be suspiciously high.

ben51degrees commented 4 years ago

@johniez Thanks for checking and getting that info. I will take a look at why this is happening and get back to you.,

ben51degrees commented 4 years ago

@johniez I have just pushed some improvements which have passed all of my tests. There were a couple of small bugs, but not ones that would have readily occurred (but did when reloading every few milliseconds).

From running your tests, it seems the issue causing the segfault is on Line 74 where you use fiftyoneDegreesSetDeviceOffset.

This should take the data set linked to the offsets (which is kept safe until the offsets are freed), otherwise there is the risk that the data set passed in is freed while being used. So instead of:

fiftyoneDegreesSetDeviceOffset(dataSet, userAgent.c_str(), 0, offsets->firstOffset);

it should be:

fiftyoneDegreesSetDeviceOffset(offsets->active->dataSet, userAgent.c_str(), 0, offsets->firstOffset);

johniez commented 4 years ago

Thank you, I will run tests soon, although it will probably run without problems now.

Thanks even for discovering bug at my code (dataSet -> offsets->active->dataSet). I found that last week when I was comparing against official example once more and I just forgot to commit it to my forked repository. Sorry for that, this could have save some time to you.

johniez commented 4 years ago

I was just running tests and have to say that example/ReloadFromMemoryTrie.c works fine even with excessive reloading. Only helgrind is complaining about possible data race in fiftyoneDegreesDataSetFree (maybe just because different thread allocated that memory).

Unfortunately the code at my fork is still crashing when optimization -O2 is enabled. When -O0 is used it is ok then. And I cannot see significant difference when compared to example/ReloadFromMemoryTrie.c.

Anyway, data reloading is not usually made in such frequency, and there could be some bug in fiftyone* function usage at my code (although I'm not seeing it). Consider it as solved, thanks :)

51Degrees / Device-Detection

Reload from file mem leak #46