blachlylab / dklib

D ports of klib algorithms
MIT License
5 stars 5 forks source link

Segfault upon destruction using hashmap of hashmaps #2

Open charlesgregory opened 4 years ago

charlesgregory commented 4 years ago

In trying to make a hashmap of hashmaps using khash.d, I run into the issue of a segfault when using the require function. Segfault happens here as a double free (I think): https://github.com/blachlylab/dklib/blob/8b3fa929ab5dac2e526468493ab2f984a614168c/source/dklib/khash.d#L145-L147

The initval is an empty hashmap. Example:

auto kmers = khash!(string, khash!(string, int))();

// for loop {

    // if not present create empty hash
    // initval is empty hashmap
    kmers.require(kmer1, khash!(string, int)());

    // if not present make count zero
    kmers[kmer1].require(kmer2, 0);

    kmers[kmer1][kmer2]++;

//}

The fix I found was by removing the free statements in the destructor. My theory is that D's automatic destructor frees the pointers as soon as they are out of scope. Or the other theory is that the empty hashmaps are freed before ever having the pointers set to any memory and are therefore the pointers are null or invalid.

jblachly commented 4 years ago

Code LGTM, probably you are seeing a double-free since I did not disable copies. If you @disable this(this) and then attempt your test you should see a compilation failure indicating that a copy is not allowed. Can you confirm this?

jblachly commented 4 years ago

If you confirm this is the problem I will either disable copies or add copy constructor since that is a relatively new feature in Dlang

charlesgregory commented 4 years ago

Yes disabling copies provides me with a long list of compilation errors.

jblachly commented 4 years ago

@charlesgregory can you please provide example code of how you are using nested hashmaps to better understand how we can work around this? I am in favor of disabling copies, but we could potentially instead do refcounting

charlesgregory commented 4 years ago
auto kmers = khash!(string, khash!(string, int))();

 for loop {
    string kmer1;
    string kmer2;

    // if kmer1 not present create empty hashmap
    // initval is empty hashmap
    kmers.require(kmer1, khash!(string, int)()); // <- segfault when kfrees are commented out

    // if kmer2 not present make count zero
    kmers[kmer1].require(kmer2, 0);

    kmers[kmer1][kmer2]++;

}

This could potentially be solved by having special compile time checks for a value type that is another khash instance. Might be difficult.

jblachly commented 4 years ago

Yeah you are constructing the object in the parameter position and it is copied into the function.

jblachly commented 4 years ago

Changing the ctor function signature value parameter to include auto ref may fix this although ia m not sure whether should do that or write copy constructor (or both)