troydhanson / uthash

C macros for hash tables and more
Other
4.18k stars 926 forks source link

Clang reports use after free when deleting all hashmap contents #128

Open Dorregaray opened 7 years ago

Dorregaray commented 7 years ago

The clang reports "use after free" when analyzing the following piece of code:

#include <stdio.h>   /* gets */
#include <stdlib.h>  /* atoi, malloc */
#include <string.h>  /* strcpy */
#include "uthash.h"

struct my_struct {
    int id;                    /* key */
    char name[10];
    UT_hash_handle hh;         /* makes this structure hashable */
};

struct my_struct *users = NULL;

void add_user(int user_id, char *name) {
    struct my_struct *s;

    HASH_FIND_INT(users, &user_id, s);  /* id already in the hash? */
    if (s==NULL) {
      s = (struct my_struct*)malloc(sizeof(struct my_struct));
      s->id = user_id;
      HASH_ADD_INT( users, id, s );  /* id: name of key field */
    }
    strcpy(s->name, name);
}

void delete_all() {
  struct my_struct *current_user, *tmp;

  HASH_ITER(hh, users, current_user, tmp) {
    HASH_DEL(users, current_user);  /* delete it (users advances to next) */
    free(current_user);             /* free it */
  }
}

int main(int argc, char *argv[]) {
    char in[10];
    int id=1, running=1;
    struct my_struct *s;
    unsigned num_users;

    add_user(1, "111111");
    add_user(2, "222222222");
    add_user(3, "3");

    delete_all();  /* free any structures */
    return 0;
}

Clang report:

uthash.c:30:5: warning: Use of memory after it is freed
    HASH_DEL(users, current_user);  /* delete it (users advances to next) */
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:421:5: note: expanded from macro 'HASH_DEL'
    HASH_DELETE(hh,head,delptr)
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:369:5: note: expanded from macro 'HASH_DELETE'
    HASH_DELETE_HH(hh, head, &(delptr)->hh)
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:376:17: note: expanded from macro 'HASH_DELETE_HH'
    uthash_free((head)->hh.tbl->buckets,                                         \
    ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:89:34: note: expanded from macro 'uthash_free'
#define uthash_free(ptr,sz) free(ptr)     /* free fcn                        */
                                 ^
uthash.c:30:5: warning: Use of memory after it is freed
    HASH_DEL(users, current_user);  /* delete it (users advances to next) */
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:421:5: note: expanded from macro 'HASH_DEL'
    HASH_DELETE(hh,head,delptr)
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:369:5: note: expanded from macro 'HASH_DELETE'
    HASH_DELETE_HH(hh, head, &(delptr)->hh)
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./uthash.h:382:23: note: expanded from macro 'HASH_DELETE_HH'
    if (_hd_hh_del == (head)->hh.tbl->tail) {                                    \
                      ^~~~~~~~~~~~~~
2 warnings generated.
Quuxplusone commented 7 years ago

Seems similar to the false positive in #111. Can you find out for me whether this is a true positive or not? One way to do that would be to #define uthash_free(p, n) to something that overwrites all the freed memory with 0x00 before deallocating it, and then look for segfaults.

If this is a false positive in the static analyzer, then we can't really do anything about it — unless you have a patch that fixes it, and the patch doesn't hurt the performance or clarity of the code, and you don't mind it bit-rotting. I mean, Clang could start complaining about ten new things tomorrow; uthash has no control over that.

jhi commented 7 years ago

... or compile with clang -fsanitize=address and run the code, and see if the sanitizer runtime catches the same problem. If it does, then uthash needs changing. If it does not, that is more ammunition for reporting the false positive to clang.

Dorregaray commented 7 years ago

It is false positive. Valgrind and sanitizer doesn't see the problem. Redefining uthash_free also didn't segfault.

acbits commented 1 year ago

Hello,

I am a developer of gcc plugins and I came across uthash and wanted to test it. I do see use-after-free for the last element deleted.

My testcase was add 3 entries and delete them like this

 HASH_ITER(hh, ht, cur_elem, tmp){
        printf("---BEGIN LOOP BODY---\n");
        printf("Deleting entry:|%p|:key:%d, RC:%d\n", cur_elem, cur_elem->key, REFTRACK_COUNT(cur_elem));
        HASH_DEL(ht, cur_elem);
        my_struct_removeref(cur_elem);
        printf("---END LOOP BODY---\n");
    }

where ht is the hash table, REFTRACK_COUNT() returns the number of active references and my_struct_removeref() is code instrumentation that removes a reference to the given element. When there are no more references to a heap allocated object, the object would be deleted. The +1, -1 in the trace log below implies addition of a reference and removal of a reference.

The trace from the instrumented code is

 Allocated |0x0x23072b8| of size |64| bytes at |hash_test.c:15|
my_struct:|0x0x23072b8|:+1
my_struct:|0x0x23072b8|:+1
my_struct:|0x0x23072b8|:+1
my_struct:|0x0x23072b8|:-1
 Allocated |0x0x2307988| of size |64| bytes at |hash_test.c:15|
my_struct:|0x0x2307988|:+1
my_struct:|0x0x2307988|:+1
my_struct:|0x0x2307988|:-1
 Allocated |0x0x23079e8| of size |64| bytes at |hash_test.c:15|
my_struct:|0x0x23079e8|:+1
my_struct:|0x0x23079e8|:+1
my_struct:|0x0x23079e8|:-1
my_struct:|0x0x23072b8|:+1
my_struct:|0x0x2307988|:+1
---BEGIN LOOP BODY---
Deleting entry:|0x23072b8|:key:1, RC:3
my_struct:|0x0x23072b8|:-1
---END LOOP BODY---
my_struct:|0x0x2307988|:+1
my_struct:|0x0x23072b8|:-1
my_struct:|0x0x23079e8|:+1
my_struct:|0x0x2307988|:-1
---BEGIN LOOP BODY---
Deleting entry:|0x2307988|:key:2, RC:2
my_struct:|0x0x2307988|:-1
---END LOOP BODY---
my_struct:|0x0x23079e8|:+1
my_struct:|0x0x2307988|:-1
 releasing object |0x0x2307988| type |my_struct|
my_struct:|0x0x23079e8|:-1
---BEGIN LOOP BODY---
Deleting entry:|0x23079e8|:key:3, RC:2
my_struct:|0x0x23079e8|:-1
my_struct:|0x0x23079e8|:-1
 releasing object |0x0x23079e8| type |my_struct|
---END LOOP BODY---
 Invalid pointer/use-after-free |0x0x23079e8| to |my_struct_removeref|

Further, if I comment out the HASH_DEL() statement, there are no issues. I hope this info is helpful in tracking down the issue.

Quuxplusone commented 1 year ago

@acbits: That doesn't seem related to this issue about a Clang-static-analyzer false positive. Please post a complete, compilable, minimal example demonstrating your problem, as a new issue. But I hope you have already noticed the historical pattern with reports of memory-safety bugs in C code: The mistake will turn out to be in your own code; the best you can hope for is that I (or someone) will be able to pinpoint your bug for you.

In fact, we can already tell that the output you pasted doesn't come from the loop you pasted. In the loop, you do printf("---END LOOP BODY---\n"); and then immediately printf("---BEGIN LOOP BODY---\n"); to begin the next iteration. In the output, you can see lines appearing in between iterations, like this:

---END LOOP BODY---
my_struct:|0x0x2307988|:+1
my_struct:|0x0x23072b8|:-1
my_struct:|0x0x23079e8|:+1
my_struct:|0x0x2307988|:-1
---BEGIN LOOP BODY---

Your program might secretly be doing something with multiple threads. Make sure every thread is done using the hash table before you try to destroy it.

acbits commented 1 year ago

Please post a complete, compilable, minimal example demonstrating your problem, as a new issue. But I hope you have already noticed the historical pattern with reports of memory-safety bugs in C code: The mistake will turn out to be in your own code; the best you can hope for is that I (or someone) will be able to pinpoint your bug for you.

The reason I didn't post the code is because it uses a gcc attribute that is not available to public yet.(Might change in future)

#define REFTRACK_DEBUG
#define REFTRACK_TRACE

#include "hrcmm.h"
#include "uthash.h"
#include <assert.h>

typedef int key_t;

REFTRACK_STRUCT(my_struct){
    key_t key;
    float value;
    UT_hash_handle hh;
};
REFTRACK_EPILOG(my_struct);

typedef struct my_struct my_struct;

int main(int argc, char *argv[]){
    my_struct *ht = NULL;

    int entries = 3;
    atexit(print_mem_stats);

    for(int i = 0; i < entries; i++){
        my_struct *vo = my_struct_create();
        vo->key = i+1;
        vo->value = vo->key*0.5;
        // need to add reference manually as uthash doesn't preserve the type of my_struct internally.
        my_struct_addref(vo); 
        HASH_ADD_INT(ht, key, vo);
    }

    assert(HASH_COUNT(ht) == entries);

    my_struct *tmp = NULL;
    key_t search_key = entries;
    HASH_FIND_INT(ht, &search_key, tmp);

    if (tmp){
        assert(tmp->value == (entries)*0.5);
    }
    else{
        printf("Entry not found:%d\n", search_key);
    }

    // delete all entries

    my_struct *cur_elem = NULL;

    HASH_ITER(hh, ht, cur_elem, tmp){
        printf("---BEGIN LOOP BODY---\n");
        printf("Deleting entry:|%p|:key:%d, RC:%d\n", cur_elem, cur_elem->key, REFTRACK_COUNT(cur_elem));
        HASH_DEL(ht, cur_elem);
        my_struct_removeref(cur_elem);
        printf("---END LOOP BODY---\n");
    }

}

In fact, we can already tell that the output you pasted doesn't come from the loop you pasted. In the loop, you do printf("---END LOOP BODY---\n"); and then immediately printf("---BEGIN LOOP BODY---\n"); to begin the next iteration. In the output, you can see lines appearing in between iterations, like this:

That is correct. The log inbetween ---END LOOP BODY--- and ---BEGIN LOOP BODY--- is generated by instrumented code.

BTW, I don't use uthash. I just selected uthash as test case for the gcc plugin and encountered the same issue as reported here.

I provided the trace so that it would be helpful to investigate.