Marzac / le3d

A straightforward and easy to use 3D software renderer for real-time retro graphics.
https://marzac.github.io/le3d/
MIT License
60 stars 6 forks source link

ASAN errors #45

Closed m0ppers closed 6 years ago

m0ppers commented 6 years ago

Following error is being printed by ASAN on mac os x clang running the cube example:

configured like this:

hans-guenther:le3d mop$ cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS="-g -fsanitize=address -fno-omit-frame-pointer" ..

==1424==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x00010c7cc803 at pc 0x000109277322 bp 0x7ffee69c7d30 sp 0x7ffee69c7d28
READ of size 8 at 0x00010c7cc803 thread T0
    #0 0x109277321 in LeRasterizer::fillFlatTexZC(int, float, float, float, float, float, float, float, float) flattexzc.h:82
    #1 0x1092712cc in LeRasterizer::fillTriangleZC(int, int, int, bool) rasterizer_float.cpp:328
    #2 0x10926f066 in LeRasterizer::rasterList(LeTriList*) rasterizer_float.cpp:227
    #3 0x10923a380 in main cube.cpp:80
    #4 0x7fff7c20d014 in start (libdyld.dylib:x86_64+0x1014)

0x00010c7cc803 is located 3 bytes to the right of 262144-byte region [0x00010c78c800,0x00010c7cc800)
allocated by thread T0 here:
    #0 0x10959d777 in wrap_posix_memalign (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x57777)
    #1 0x1092505e6 in _aligned_malloc global.cpp:136
    #2 0x1092a0e1b in operator new[](unsigned long) global.h:103
    #3 0x109249160 in LeBmpFile::readBitmap(__sFILE*, LeBitmap*) bmpfile.cpp:220
    #4 0x10924585e in LeBmpFile::load() bmpfile.cpp:118
    #5 0x109244550 in LeBmpCache::loadBMP(char const*) bmpcache.cpp:107
    #6 0x10924501c in LeBmpCache::loadDirectory(char const*) bmpcache.cpp:149
    #7 0x109238bf6 in main cube.cpp:29
    #8 0x7fff7c20d014 in start (libdyld.dylib:x86_64+0x1014)

SUMMARY: AddressSanitizer: heap-buffer-overflow flattexzc.h:82 in LeRasterizer::fillFlatTexZC(int, float, float, float, float, float, float, float, float)
Shadow bytes around the buggy address:
  0x1000218f98b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000218f98c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000218f98d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000218f98e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000218f98f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x1000218f9900:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1000218f9910: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1000218f9920: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1000218f9930: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1000218f9940: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1000218f9950: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==1424==ABORTING
m0ppers commented 6 years ago

adding to that:

disabling SIMD and SSE2 doesn't yield an error

Marzac commented 6 years ago

That's normal and wrong.

It's because texels are always read in packets of 64bit such as here: https://github.com/Marzac/le3d/blob/194e6b3f3217e2a5d09161e65ae586bfd196b2f7/engine/fillers/float/sse/flattexzc.h#L74

Only the lower 32bit of the data is actually used. If the very last pixel N is read, then the pixel N+1 will be read too ... and this pixel is out of bound ... I fix this problem tonight (at several places in the code).

Thank you for finding that mistake, peace!

Marzac commented 6 years ago

It seems there is no efficient way to load only 32bit data with SSE. It's totally normal since vectors are m64 or m128. The texture regions should then be aligned AND PADDED (not implemented) to match the MMX / SSE requirements.

Marzac commented 6 years ago

This is corrected in the incoming new version 1.8.