Open jkyang92 opened 3 years ago
@mikestillman -- you may want to handle this
@DanGrayson yes, I will. @jkyang92 Jay, thanks!
Also, back when we used GC to allocate all memory, I probably had a few spots where I allocated something and then forgot, or didn't bother, to free it. Now that is a leak. @jkyang92 if you find other cases, please let me know! (I'll also use valgrind myself, via your code/pull request.)
Here's a valgrind log after running the tests from NormalToricVarieties (directly, not via check). Mostly the leaks seem to be in `RingQQ::fraction` from `aring-glue.hpp`.
Is the leak in https://github.com/Macaulay2/M2/issues/1728 related to this as well?
Using the code from pull request #1995, I got some better logs from ctest -T memcheck. Here are some more leaks and errors:
- unit-tests:ARingGFGivaroGivaro.create: MemoryChecker.27.log
- unit-tests:ARingGFGivaroGivaro.arithmetic: MemoryChecker.29.log
- unit-tests:ARingQQGMP.display: MemoryChecker.34.log
- unit-tests:FreeAlgebra.quotientArithmetic: MemoryChecker.72.log
- unit-tests:RingZZmod101.negate: MemoryChecker.100.log
There's a bunch of other RingZZmodn tests that fail, but I think they are the same root cause.
Two other observations in my testing:
1. Shouldn't `mpz_reallocate_limbs` use `GC_MALLOC_ATOMIC`?
2. `make_pair<bool,int>`: this is spurious and, again, is related to the fact that libgc reads uninitialized values. Unfortunately, due to the nature of this error, I can't figure out a good way to suppress it.
- Shouldn't `mpz_reallocate_limbs` use `GC_MALLOC_ATOMIC`?
Yes, that would be an improvement.
1. Shouldn't `mpz_reallocate_limbs` use `GC_MALLOC_ATOMIC`?
I've noticed a few segfaults related to `mpz_reallocate_limbs` (#1429, #1564, #1577, #1578) -- perhaps this would help those?
@d-torrance I'd be really surprised if it made a difference. `GC_MALLOC_ATOMIC` just tells the garbage collector that the newly allocated memory cannot contain pointers, so the collector does not bother scanning it for pointers.
On the other hand, those stack traces look suspiciously similar to weird issues I've been having with valgrind that I assumed were just spurious faults related to libgc. In particular, see point 2 from my previous comment. But that would require that `GC_MALLOC` was returning gibberish. I suppose this could occur if we are writing past the end of some array, but I have no idea how to figure that out.
@d-torrance I did some testing, and `GC_MALLOC_ATOMIC` actually does seem to help, at least for #1564. Incidentally, `GC_MALLOC_ATOMIC_IGNORE_OFF_PAGE` might be even better here, since I think we can guarantee that there is always a pointer to the beginning of the allocated memory.
@mikestillman As far as I can tell, all of the examples in this bug report still leak. Also, I'm curious why you chose to copy the limbs into the gc heap instead of using finalizers. I can't imagine that a finalizer is more expensive than an allocation+copy.
@jkyang92 The number of these elements we can have active can be in the millions. I was afraid (but didn't measure it) that finalization for that many elements would be quite slow... But maybe that is incorrect. We could try it. Memory leaks might still be possible: if the gmp struct is not in GC-allocated memory (so we don't finalize it) and we do not clear that element, then the limbs will leak.
@jkyang92 OK, I'll take a look at these.
@mikestillman I did some testing with the following (highly artificial) test, and the results are mixed. If the allocations are small, using finalizers seems significantly slower; on the other hand, if the allocations are large, using finalizers is actually faster (likely due to lower GC heap usage).
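
    #define GC_THREADS 1
    #define GC_PTHREADS 1
    #include <gc/gc.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define USE_FINALIZER 1   /* 1: malloc + finalizer; 0: everything on the GC heap */
    #define ALLOC_SIZE 10     /* bytes per payload allocation */
    #define BLOCK_SIZE 1000   /* objects kept alive together */
    #define LOOP_COUNT 100000 /* number of blocks allocated and then dropped */

    struct test {
        void *memory;
    };

    /* Finalizer: release the malloc'd payload when the GC reclaims the header. */
    void free_func(void *v, void *cd) {
        (void)cd; /* client data is unused */
        struct test *t = v;
        free(t->memory);
        t->memory = NULL;
    }

    struct test *allocate(void) {
    #if USE_FINALIZER
        /* The header holds only a malloc pointer, which the GC need not scan. */
        struct test *t = GC_MALLOC_ATOMIC(sizeof(*t));
        t->memory = malloc(ALLOC_SIZE);
        GC_REGISTER_FINALIZER(t, free_func, NULL, NULL, NULL);
    #else
        /* The header points into the GC heap, so it must be scanned. */
        struct test *t = GC_MALLOC(sizeof(*t));
        t->memory = GC_MALLOC_ATOMIC(ALLOC_SIZE);
    #endif
        return t;
    }

    int main(void) {
        GC_INIT();
        for (int i = 0; i < LOOP_COUNT; i++) {
            struct test **block = GC_MALLOC(BLOCK_SIZE * sizeof(block[0]));
            for (int j = 0; j < BLOCK_SIZE; j++) {
                block[j] = allocate();
            }
            /* Drop every reference so the objects become collectable. */
            memset(block, 0, BLOCK_SIZE * sizeof(block[0]));
        }
        GC_gcollect();
        return 0;
    }
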
This doesn't perfectly replicate the GMP situation since I don't do any work on the memory, and I don't do the copying needed to use the GC heap in that case.
The following code leaks gmp memory:
valgrind gives the following stack trace
I've attached the full valgrind output. You can also verify that it's not a false positive by running the code in a loop and seeing that it does consume an increasing amount of memory.
gmpleak.log
This is not the only gmp-related memory leak that I've seen. At least one of the tests from `NormalToricVarieties` also causes us to leak gmp memory. I suspect that what's happening is that while the struct underlying `mpz_t` itself is allocated on the GC heap, the limbs aren't always.