CTU-IIG / kcf

Kernelized Correlation Filter tracker

FHOG calculation corrupts memory #7

Closed wentasah closed 6 years ago

wentasah commented 6 years ago

When OpenCV is compiled without hardware acceleration, one can use valgrind to detect errors. The following errors are reported:

==24616== Invalid read of size 16
==24616==    at 0x154E6C: _mm_loadu_ps (xmmintrin.h:934)
==24616==    by 0x154E6C: LDu(float const&) (sse.hpp:23)
==24616==    by 0x153196: gradHist(float*, float*, float*, int, int, int, int, int, bool) (gradientMex.cpp:209)
==24616==    by 0x154B1C: fhog(float*, float*, float*, int, int, int, int, int, float) (gradientMex.cpp:310)
==24616==    by 0x1432E1: FHoG::extract(cv::Mat const&, int, int, int, int, float) (fhog.hpp:68)
==24616==    by 0x13A9A4: KCF_Tracker::get_features(cv::Mat&, cv::Mat&, int, int, int, int, double) (kcf.cpp:488)
==24616==    by 0x139C2C: KCF_Tracker::track(cv::Mat&) (kcf.cpp:443)
==24616==    by 0x1308A1: main (main_vot.cpp:147)
==24616==  Address 0x1f3cc67c is 145,500 bytes inside a block of size 145,512 alloc'd
==24616==    at 0x48377D5: calloc (vg_replace_malloc.c:711)
==24616==    by 0x154D10: wrCalloc(unsigned long, unsigned long) (wrappers.hpp:22)
==24616==    by 0x154AE6: fhog(float*, float*, float*, int, int, int, int, int, float) (gradientMex.cpp:309)
==24616==    by 0x1432E1: FHoG::extract(cv::Mat const&, int, int, int, int, float) (fhog.hpp:68)
==24616==    by 0x13A9A4: KCF_Tracker::get_features(cv::Mat&, cv::Mat&, int, int, int, int, double) (kcf.cpp:488)
==24616==    by 0x139C2C: KCF_Tracker::track(cv::Mat&) (kcf.cpp:443)
==24616==    by 0x1308A1: main (main_vot.cpp:147)
==24616== 
==24616== Invalid write of size 8
==24616==    at 0x154E96: _mm_storeu_ps (xmmintrin.h:983)
==24616==    by 0x154E96: STRu(float&, float __vector(4)) (sse.hpp:26)
==24616==    by 0x1531AE: gradHist(float*, float*, float*, int, int, int, int, int, bool) (gradientMex.cpp:209)
==24616==    by 0x154B1C: fhog(float*, float*, float*, int, int, int, int, int, float) (gradientMex.cpp:310)
==24616==    by 0x1432E1: FHoG::extract(cv::Mat const&, int, int, int, int, float) (fhog.hpp:68)
==24616==    by 0x13A9A4: KCF_Tracker::get_features(cv::Mat&, cv::Mat&, int, int, int, int, double) (kcf.cpp:488)
==24616==    by 0x139C2C: KCF_Tracker::track(cv::Mat&) (kcf.cpp:443)
==24616==    by 0x1308A1: main (main_vot.cpp:147)
==24616==  Address 0x1f3cc684 is 145,508 bytes inside a block of size 145,512 alloc'd
==24616==    at 0x48377D5: calloc (vg_replace_malloc.c:711)
==24616==    by 0x154D10: wrCalloc(unsigned long, unsigned long) (wrappers.hpp:22)
==24616==    by 0x154AE6: fhog(float*, float*, float*, int, int, int, int, int, float) (gradientMex.cpp:309)
==24616==    by 0x1432E1: FHoG::extract(cv::Mat const&, int, int, int, int, float) (fhog.hpp:68)
==24616==    by 0x13A9A4: KCF_Tracker::get_features(cv::Mat&, cv::Mat&, int, int, int, int, double) (kcf.cpp:488)
==24616==    by 0x139C2C: KCF_Tracker::track(cv::Mat&) (kcf.cpp:443)
==24616==    by 0x1308A1: main (main_vot.cpp:147)
==24616== 
  -> speed : 12830.6ms. per frame, accuracy: 0.576451
==24616== Invalid write of size 8
==24616==    at 0x154E96: _mm_storeu_ps (xmmintrin.h:983)
==24616==    by 0x154E96: STRu(float&, float __vector(4)) (sse.hpp:26)
==24616==    by 0x153105: gradHist(float*, float*, float*, int, int, int, int, int, bool) (gradientMex.cpp:208)
==24616==    by 0x154B1C: fhog(float*, float*, float*, int, int, int, int, int, float) (gradientMex.cpp:310)
==24616==    by 0x1432E1: FHoG::extract(cv::Mat const&, int, int, int, int, float) (fhog.hpp:68)
==24616==    by 0x13A9A4: KCF_Tracker::get_features(cv::Mat&, cv::Mat&, int, int, int, int, double) (kcf.cpp:488)
==24616==    by 0x138B34: KCF_Tracker::track(cv::Mat&) (kcf.cpp:364)
==24616==    by 0x1308A1: main (main_vot.cpp:147)
==24616==  Address 0x1f110198 is 0 bytes after a block of size 145,512 alloc'd
==24616==    at 0x48377D5: calloc (vg_replace_malloc.c:711)
==24616==    by 0x154D10: wrCalloc(unsigned long, unsigned long) (wrappers.hpp:22)
==24616==    by 0x154AE6: fhog(float*, float*, float*, int, int, int, int, int, float) (gradientMex.cpp:309)
==24616==    by 0x1432E1: FHoG::extract(cv::Mat const&, int, int, int, int, float) (fhog.hpp:68)
==24616==    by 0x13A9A4: KCF_Tracker::get_features(cv::Mat&, cv::Mat&, int, int, int, int, double) (kcf.cpp:488)
==24616==    by 0x138B34: KCF_Tracker::track(cv::Mat&) (kcf.cpp:364)
==24616==    by 0x1308A1: main (main_vot.cpp:147)
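
The offsets in the three valgrind reports above can be checked by hand. The following is a minimal sketch of that arithmetic; `overrunBytes` is a hypothetical helper, not part of the tracker or of valgrind, and it takes the reported access sizes and offsets at face value.

```cpp
#include <cstddef>

// Number of bytes by which an access of `size` bytes starting at `offset`
// runs past the end of a block of `blockSize` bytes (0 if it stays in bounds).
// Hypothetical helper, for illustration only.
constexpr std::size_t overrunBytes(std::size_t offset, std::size_t size,
                                   std::size_t blockSize) {
    return offset + size > blockSize ? offset + size - blockSize : 0;
}

constexpr std::size_t kBlock = 145512;  // block size from the reports

// Report 1: read of size 16 at 145,500 bytes inside the block.
static_assert(overrunBytes(145500, 16, kBlock) == 4, "read overruns by 4 bytes");
// Report 2: write of size 8 at 145,508 bytes inside the block.
static_assert(overrunBytes(145508, 8, kBlock) == 4, "write overruns by 4 bytes");
// Report 3: write of size 8 at 0 bytes after the block.
static_assert(overrunBytes(145512, 8, kBlock) == 8, "write overruns by 8 bytes");
```

In these reports the largest overrun is 8 bytes, which is the size of two floats.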

Address sanitizer reports the same:

==25629==ERROR: AddressSanitizer: unknown-crash on address 0x7f7fc258105c at pc 0x5610860ce9d1 bp 0x7ffea78965f0 sp 0x7ffea78965e8
READ of size 16 at 0x7f7fc258105c thread T0
    #0 0x5610860ce9d0 in _mm_loadu_ps(float const*) /usr/lib/gcc/x86_64-linux-gnu/8/include/xmmintrin.h:934
    #1 0x5610860ce9d0 in LDu(float const&) ../src/piotr_fhog/sse.hpp:23
    #2 0x5610860cb2ba in gradHist(float*, float*, float*, int, int, int, int, int, bool) ../src/piotr_fhog/gradientMex.cpp:209
    #3 0x5610860ce4bb in fhog(float*, float*, float*, int, int, int, int, int, float) ../src/piotr_fhog/gradientMex.cpp:310
    #4 0x5610860a1df7 in FHoG::extract(cv::Mat const&, int, int, int, int, float) ../src/piotr_fhog/fhog.hpp:68
    #5 0x56108608b9dd in KCF_Tracker::get_features(cv::Mat&, cv::Mat&, int, int, int, int, double) ../src/kcf.cpp:488
    #6 0x561086088b6f in KCF_Tracker::track(cv::Mat&) ../src/kcf.cpp:443
    #7 0x56108606fad1 in main ../main_vot.cpp:147
    #8 0x7f7fe3a1eb16 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x22b16)
    #9 0x56108606dc79 in _start (/home/wsh/students/karafvit/kcf/build-fftw/kcf_vot+0x2bc79)

0x7f7fc2581068 is located 0 bytes to the right of 145512-byte region [0x7f7fc255d800,0x7f7fc2581068)
allocated by thread T0 here:
    #0 0x7f7fe96910b8 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe90b8)
    #1 0x5610860ce796 in wrCalloc(unsigned long, unsigned long) ../src/piotr_fhog/wrappers.hpp:22
    #2 0x5610860ce485 in fhog(float*, float*, float*, int, int, int, int, int, float) ../src/piotr_fhog/gradientMex.cpp:309
    #3 0x5610860a1df7 in FHoG::extract(cv::Mat const&, int, int, int, int, float) ../src/piotr_fhog/fhog.hpp:68
    #4 0x56108608b9dd in KCF_Tracker::get_features(cv::Mat&, cv::Mat&, int, int, int, int, double) ../src/kcf.cpp:488
    #5 0x561086088b6f in KCF_Tracker::track(cv::Mat&) ../src/kcf.cpp:443
    #6 0x56108606fad1 in main ../main_vot.cpp:147
    #7 0x7f7fe3a1eb16 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x22b16)

SUMMARY: AddressSanitizer: unknown-crash /usr/lib/gcc/x86_64-linux-gnu/8/include/xmmintrin.h:934 in _mm_loadu_ps(float const*)
Shadow bytes around the buggy address:
  0x0ff0784a81b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff0784a81c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff0784a81d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff0784a81e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff0784a81f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0ff0784a8200: 00 00 00 00 00 00 00 00 00 00 00[00]00 fa fa fa
  0x0ff0784a8210: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff0784a8220: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff0784a8230: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff0784a8240: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff0784a8250: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==25629==ABORTING

wentasah commented 6 years ago

It seems that this is related to pdollar/toolbox#8.

Shanigen commented 6 years ago

I tried both possible solutions, from your referenced issue and from issue xuduo35/STAPLE#2. One of them makes valgrind report no errors, but I am not sure that it calculates the correct result. The other proposed solution seems to work as well and gives the same results as the original. The changes are as follows:

  1. #define GH(H,ma,mb) H1_64=(__m64*)(H); STRlow(*H1_64,ADD(LDlow(*H1_64),MUL(ma,mb))); in GradHist (You also have to add __m64 * H1_64 as variable)

  2. RETf LDlow( const __m64 & x ) {__m128 a; return _mm_loadl_pi(a,&x); } and RETf STRlow( __m64 &x, const __m128 y ) { _mm_storel_pi(&x,y); return y; } in sse.hpp.

One thing I am a little unsure about is the uninitialized dummy variable that I use in LDlow, which could become a problem in the future. However, the data at these positions is not used for any calculation, as mentioned in pdollar/toolbox#8. If you know a better way to do this, please tell me. These changes are not pushed yet.

wentasah commented 6 years ago

This is clearly wrong. Instead of performing calculations with 4 values in one instruction, it calculates with only 2 values. Because nothing else is changed in the code, half of the computation performed originally is skipped and the result is most probably wrong.

See https://software.intel.com/sites/landingpage/IntrinsicsGuide/ for a description of the xmmintrin.h functions.
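
To make the difference between the two variants concrete, here is a minimal sketch that is not taken from gradientMex.cpp. `accumulate4` mirrors the load, multiply-add, store pattern of the original macro, and `accumulate2` mirrors the proposed LDlow/STRlow version; both names and the demo functions are hypothetical. The low-half variant starts from `_mm_setzero_ps()` instead of an uninitialized dummy register. Whether the upper two lanes ever carry non-zero data in the real code depends on the values passed in, which this thread does not show.

```cpp
#include <xmmintrin.h>  // SSE: _mm_loadu_ps, _mm_storeu_ps, _mm_loadl_pi, _mm_storel_pi
#include <array>

// Full-width update: h[0..3] += a[0..3] * b[0..3] (reads and writes 16 bytes).
inline void accumulate4(float* h, __m128 a, __m128 b) {
    _mm_storeu_ps(h, _mm_add_ps(_mm_loadu_ps(h), _mm_mul_ps(a, b)));
}

// Low-half update: h[0..1] += a[0..1] * b[0..1] (reads and writes 8 bytes).
// The upper two lanes of the register are zeroed and never stored.
inline void accumulate2(float* h, __m128 a, __m128 b) {
    __m128 cur = _mm_loadl_pi(_mm_setzero_ps(),
                              reinterpret_cast<const __m64*>(h));
    _mm_storel_pi(reinterpret_cast<__m64*>(h),
                  _mm_add_ps(cur, _mm_mul_ps(a, b)));
}

// Demo: start from {1,1,1,1} and add 2*3 in every lane.
inline std::array<float, 4> demoFull() {
    std::array<float, 4> h{1.0f, 1.0f, 1.0f, 1.0f};
    accumulate4(h.data(), _mm_set1_ps(2.0f), _mm_set1_ps(3.0f));
    return h;  // all four elements updated
}

inline std::array<float, 4> demoLow() {
    std::array<float, 4> h{1.0f, 1.0f, 1.0f, 1.0f};
    accumulate2(h.data(), _mm_set1_ps(2.0f), _mm_set1_ps(3.0f));
    return h;  // only h[0] and h[1] updated
}
```

With these inputs `demoFull()` changes all four floats, while `demoLow()` leaves `h[2]` and `h[3]` untouched. The two variants give the same result only if the products in the upper two lanes are always zero.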

wentasah commented 6 years ago

Can you try replacing

  R1 = (float*) wrCalloc(wb*hb*nOrients*2,sizeof(float));

at https://github.com/shanigen/kcf/blob/master/src/piotr_fhog/gradientMex.cpp#L309 with

  R1 = (float*) wrCalloc(wb*hb*nOrients*2 + 2,sizeof(float));

?

Shanigen commented 6 years ago

Tried the replacement and tested the tracker with valgrind, which shows no errors.
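
The idea behind the `+ 2` can be sketched in isolation. The code below is an illustration under stated assumptions, not the tracker's code: it assumes, as mentioned earlier in the thread, that only the low two lanes of each 4-float update carry data, so an update at the last two logical floats reaches two floats further. `paddedCount` and `updateTail` are hypothetical names.

```cpp
#include <xmmintrin.h>  // SSE: _mm_loadu_ps, _mm_storeu_ps, _mm_set_ps
#include <cstddef>
#include <vector>

constexpr std::size_t kLanes = 4;      // floats moved by one _mm_loadu_ps/_mm_storeu_ps
constexpr std::size_t kUsedLanes = 2;  // assumption: only the low two lanes carry data

// Element count to allocate so that a 4-float access starting at the
// last two logical floats stays inside the buffer.
constexpr std::size_t paddedCount(std::size_t logicalCount) {
    return logicalCount + (kLanes - kUsedLanes);
}

// Adds (v0, v1) to the last two logical floats with a full 4-lane update.
// Requires logicalCount >= 2. The two padding floats receive +0 and stay zero.
inline std::vector<float> updateTail(std::size_t logicalCount, float v0, float v1) {
    std::vector<float> buf(paddedCount(logicalCount), 0.0f);
    float* p = buf.data() + logicalCount - kUsedLanes;
    const __m128 add = _mm_set_ps(0.0f, 0.0f, v1, v0);  // arguments are lanes 3, 2, 1, 0
    _mm_storeu_ps(p, _mm_add_ps(_mm_loadu_ps(p), add));
    return buf;
}
```

For example, `updateTail(8, 1.5f, 2.5f)` allocates 10 floats, writes 1.5 and 2.5 into elements 6 and 7, and leaves the two padding elements at zero. Without the padding, the same 16-byte access would run 8 bytes past the end of the buffer.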