liuliu / ccv

C-based/Cached/Core Computer Vision Library, A Modern Computer Vision Library
http://libccv.org
Other
7.08k stars 1.71k forks source link

Memory corruption / undefined behaviour in SWT #176

Closed MrEsp closed 8 years ago

MrEsp commented 8 years ago

Hi,

I've discovered some serious problems in SWT implementation. Looks like it corrupts memory.
Here I've some added code to reproduce the problem, full description how to run it is in Readme.

As I can see, there are problems with canny implementation and caching code. Help is required to fix it.

One of the errors (valgrind output): ==6025== Invalid write of size 4 ==6025== at 0x5C2E09: ccv_canny (in /home/xxx/src/repo/libccv_fork/ccv/test/testapp/Debug/testapp) ==6025== by 0x40835E: ccv_swt (in /home/xxx/src/repo/libccv_fork/ccv/test/testapp/Debug/testapp) ==6025== by 0x4399CF: ccv_swt_detect_words (in /home/xxx/src/repo/libccv_fork/ccv/test/testapp/Debug/testapp) ==6025== by 0x4032E6: detectTextOnImage(ccv_dense_matrix_t*, ccv_swt_param_t const&, bool) (main.cpp:35) ==6025== by 0x403296: main (main.cpp:29) ==6025== Address 0x69cc790 is 560 bytes inside an unallocated block of size 162,768 in arena "client"

liuliu commented 8 years ago

Have you tried with ./bin/swtdetect.c and the ccv_swt_default_params settings? Will take a look at this later today.

liuliu commented 8 years ago

I cannot repro this problem on unstable branch with the given image and the same swt params with ./bin/swtdetect.c sample code. Can you give me a bit more information about your setup?

liuliu commented 8 years ago

Also tried with AddressSanitizer, other than Valgrind, no luck too.

MrEsp commented 8 years ago

@liuliu Hi,

That's my compiler and OS. g++ (Ubuntu 4.8.4-2ubuntu1~14.04.1) 4.8.4

I didn't try it with swtdetect.c, I will and let you know the result.

MrEsp commented 8 years ago

@liuliu Okay, i tested it via swtdetect tool. Here is what I got (a bit different from what I got previously):

==3858== Conditional jump or move depends on uninitialised value(s) ==3858== at 0x6DB05E: ccv_canny (ccv_classic.c:312) ==3858== by 0x41B213: ccv_swt (ccv_swt.c:49) ==3858== by 0x4325B7: ccv_swt_detect_words (ccv_swt.c:642) ==3858== by 0x401DE3: main (in /home/xxx/src/repo/libccv_fork/ccv/bin/swtdetect) ==3858== ==3858== Conditional jump or move depends on uninitialised value(s) ==3858== at 0x6DB06C: ccv_canny (ccv_classic.c:312) ==3858== by 0x41B213: ccv_swt (ccv_swt.c:49) ==3858== by 0x4325B7: ccv_swt_detect_words (ccv_swt.c:642) ==3858== by 0x401DE3: main (in /home/xxx/src/repo/libccv_fork/ccv/bin/swtdetect) ==3858== ==3858== Conditional jump or move depends on uninitialised value(s) ==3858== at 0x6DB078: ccv_canny (ccv_classic.c:312) ==3858== by 0x41B213: ccv_swt (ccv_swt.c:49) ==3858== by 0x4325B7: ccv_swt_detect_words (ccv_swt.c:642) ==3858== by 0x401DE3: main (in /home/xxx/src/repo/libccv_fork/ccv/bin/swtdetect) ==3858==

These errors were present on my test, but I guess they are not the worst. the worst one was - "invalid write". I can guess why it tells us "conditional jump" - you are using some kind of memset optimization: instead memsetting the entire array, you do it by parts, thus ,as I understood, saving cycles. Did you assume that conditional jump to happen in your code or it's a bug?

Other errors that I get in my example come from "caching function". I see that you are calling ccv_enable_default_cache();

and I'm not doing anything like that in my code. Perhaps this causes some conditional jumps to happen when it reaches

ccv_declare_derived_signature(sig, a->sig != 0, ccv_sign_with_format(64, "ccv_canny(%d,%la,%la)", size, low_thresh, high_thresh), a->sig, CCV_EOF_SIGN);
ccv_object_return_if_cached(, db);

So my first question: am I doing everything properly, must it be initialized in any way , the cache ? How it should be initialized when I don't need the cache ?

What is interesting: when I comment out the calls to ccv_declare_derived_signature and ccv_object_return_if_cached(, db); and also commenting out your partial memset calls, prolonging the very first memset initialization upto the array length (everything in ccv_canny) - valgrind is no longer showing me any errors. Moreover, not commenting out anything - I explicitly check for array bounds violation, and guess what I obtain ? Nothing! The code is not violating any array bounds, but valgrind says that it does! It can be some very tricky behaviour of valgrind when it changes "something" if conditional jumps are met. I am talking about the code like that:

CCV_ASSERT((map_ptr - map_cols - 1) >= mapAddressStart &&   ((map_ptr - map_cols - 1)+ (a->cols + 2) -1) <= mapAddressEnd, "memset error" );
        //This one has been commented out, memory is zeroed in the begining
        //memset(map_ptr - map_cols - 1, 0, sizeof(int) * (a->cols + 2));

        int dr[] = {-1, 1, -map_cols - 1, -map_cols, -map_cols + 1, map_cols - 1, map_cols, map_cols + 1};
        while (stack_top > stack_bottom)
        {
            map_ptr = *(--stack_top);
            if(!(stack_top >= stackAddressStart  && stack_top <= stackAddressEnd))
            {
                printf("Error caught");
            }
            //CCV_ASSERT(stack_top >= stackAddressStart  && stack_top <= stackAddressEnd, "stack_ptr out of chunk");
            for (i = 0; i < 8; i++) {
                if(!((map_ptr + dr[i]) >= mapAddressStart && (map_ptr + dr[i]) <= mapAddressEnd))
                {
                    printf("Error caught");
                }
                CCV_ASSERT((map_ptr + dr[i]) >= mapAddressStart && (map_ptr + dr[i]) <= mapAddressEnd, "map_ptr out of chunk");
                if (map_ptr[dr[i]] == 1) {
                    map_ptr[dr[i]] = 2;
                    *(stack_top++) = map_ptr + dr[i];
                }
            }
        }
        map_ptr = map + map_cols + 1;

These asserts and ifs are not hit. So probably valgrind can lie. But it's strange that you cannot reproduce it. I can do it consistently on both valgrind 3.10.1 and 3.11

So what do you think?

MrEsp commented 8 years ago

@liuliu

Looks like I found the exact parameters which affects "invalid read/write". It's swt_params.scale_invariant

I have 1 and you have 0 in swt default params. I am able to reproduce it , when I do a single change

swt_params = initSWTParams(swt_params); swt_params.scale_invariant = 0;

i.e. reseting scale_invariant to zero, I get no read/write errors in valgrind. I suppose the bug really exists. Setting scale_invariant=0 decreases quality significantly.

liuliu commented 8 years ago

https://github.com/liuliu/ccv/commit/f76f109ee33cece994de044aa8f95aaf75e9c237

This should fix Valgrind's complain. I did a wrong math, where the pointer should start with map_ptr - 1, rather than map_ptr - map_cols - 1, and thus, the last line is not properly memset.