GreycLab / CImg

The CImg Library is a small and open-source C++ toolkit for image processing
http://cimg.eu
Other
1.48k stars 282 forks source link

Fix multiple heap buffer overflows #295

Closed cleeus closed 3 years ago

cleeus commented 3 years ago

The size calculation pattern (size_t)size_xsize_ysize_z*size_c can overflow the resulting size_t. Especially on 32bit size_t platforms this is trivial and can be achieved using a simple PNM image, e.g. the following ASCII PNM would allocate only 6 byte and result in a trivial arbitrary heap write: P3 2147483649 2 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255

dtschump commented 3 years ago

That's an interesting idea, but the implementation of _safe_size() seems not optimal. As _safe_size() will probably be called a lot of times, attention must be paid to the extra cost in machine time. Maybe the test:

(size_t)size_x*size_y*size_z*size_c==(cimg_uint64)size_x*size_y*size_z*size_c;

should be enough in _safe_size()?

cleeus commented 3 years ago

I started with that simple check, but I don't think it is enough, because multiplying two 32bit values results in a 64bit value. So multiplying more then two 32bit values could also overflow the 64bit type. So for each 32bit multiplication you need a check. And then there often follows a multiplication with *sizeof(T) so this has to be checked, too, I think.

Since this check is often followed by a malloc, I don't think performance is an issue here. The time it takes to do these few multiplications is dwarfed by the subsequent malloc (malloc is orders of magnitudes slower).

dtschump commented 3 years ago

OK. it sounds good. Maybe I'll try to make _safe_size() a bit faster afterwards (e.g. the first multiplication don't have to be tested, if we start with size=size_x). Thanks for the contribution.

cleeus commented 3 years ago

I experimented a bit with different forms to express this and looked at the codege/asm output of MSVC2019/x64. It seems to store the arguments on the stack in the dim[] array. If you wan't to avoid this stack write, you can unroll the loop. The best codegen I could get was with this form:

        int overflows = 0;
        size_t size = size_x;
        overflows += (size_y > 1 && size * size_y <= size) ? 1 : 0;
        size *= size_y;
        overflows += (size_z > 1 && size * size_z <= size) ? 1 : 0;
        size *= size_z;
        overflows += (size_c > 1 && size * size_c <= size) ? 1 : 0;
        size *= size_c;

But as said, the perf impact compared to the malloc should be minuscule. Feel free to use above snippet. Thanks for merging the PR.

dtschump commented 3 years ago

I'm actually experimenting with an unrolled version.. I'm also moving CImg<T>::_safe_size() to cimg::size(), because cimg:: is the namespace used to store such convenience functions.

dtschump commented 3 years ago

I've ended up with:

    // Return 'dx*dy*dz*dc' as a 'size_t' and check no overflow occurs.
    static size_t safe_size(const unsigned int dx, const unsigned int dy,
                            const unsigned int dz, const unsigned int dc) {
      if (!(dx && dy && dz && dc)) return 0;
      size_t siz = (size_t)dx, osiz = siz;
      if ((dy==1 || (siz*=dy)>osiz) &&
          ((osiz = siz), dz==1 || (siz*=dz)>osiz) &&
          ((osiz = siz), dc==1 || (siz*=dc)>osiz) &&
          ((osiz = siz), sizeof(T)==1 || (siz*sizeof(T))>osiz)) return siz;
      throw CImgArgumentException("CImg<%s>::safe_size(): Specified size (%u,%u,%u,%u) overflows 'size_t'.",
                                  pixel_type(),dx,dy,dz,dc);
      return 0;
    }

and finally put it as a static function in CImg<T>. Does it look good to you ?

cleeus commented 3 years ago

To be honest: I have a hard time parsing this C syntax and understanding it (sequence points, comma-operator, assignment and in-place operators inside if clause) and it generates nearly the same assembly as the simple unrolled form above. I suggest to lean on the side of readability. It seems to be correct though.

Here is the ASM output from your if(...) version:

00007FF646F38A57  test        edx,edx  
00007FF646F38A59  je          cimg_library::CImg<char>::_safe_size+65h (07FF646F38AB5h)  
00007FF646F38A5B  test        r8d,r8d  
00007FF646F38A5E  je          cimg_library::CImg<char>::_safe_size+65h (07FF646F38AB5h)  
00007FF646F38A60  test        r9d,r9d  
00007FF646F38A63  je          cimg_library::CImg<char>::_safe_size+65h (07FF646F38AB5h)  
00007FF646F38A65  mov         r10d,dword ptr [size_c]  
00007FF646F38A6D  test        r10d,r10d  
00007FF646F38A70  je          cimg_library::CImg<char>::_safe_size+65h (07FF646F38AB5h)  
00007FF646F38A72  mov         ecx,edx  
00007FF646F38A74  mov         r8d,edx  
00007FF646F38A77  cmp         r11d,1  
00007FF646F38A7B  je          cimg_library::CImg<char>::_safe_size+36h (07FF646F38A86h)  
00007FF646F38A7D  imul        rcx,r11  
00007FF646F38A81  cmp         rcx,r8  
00007FF646F38A84  jbe         cimg_library::CImg<char>::_safe_size+6Ch (07FF646F38ABCh)  
00007FF646F38A86  mov         r8,rcx  
00007FF646F38A89  cmp         r9d,1  
00007FF646F38A8D  je          cimg_library::CImg<char>::_safe_size+4Bh (07FF646F38A9Bh)  
00007FF646F38A8F  mov         eax,r9d  
00007FF646F38A92  imul        rcx,rax  
00007FF646F38A96  cmp         rcx,r8  
00007FF646F38A99  jbe         cimg_library::CImg<char>::_safe_size+6Ch (07FF646F38ABCh)  
00007FF646F38A9B  mov         r8,rcx  
00007FF646F38A9E  cmp         r10d,1  
00007FF646F38AA2  je          cimg_library::CImg<char>::_safe_size+5Dh (07FF646F38AADh)  
00007FF646F38AA4  imul        rcx,r10  
00007FF646F38AA8  cmp         rcx,r8  
00007FF646F38AAB  jbe         cimg_library::CImg<char>::_safe_size+6Ch (07FF646F38ABCh)  
00007FF646F38AAD  mov         rax,rcx  

Here is the ASM output from the overflows+= sequence version from above:

00007FF678F08A9E  mov         r10d,1  
00007FF678F08AA4  mov         r11d,edx  
00007FF678F08AA7  mov         rbx,rcx  
00007FF678F08AAA  mov         eax,r8d  
00007FF678F08AAD  imul        rax,r11  
00007FF678F08AB1  mov         esi,r8d  
00007FF678F08AB4  mov         ebp,edx  
00007FF678F08AB6  cmp         r8d,r10d  
00007FF678F08AB9  jbe         cimg_library::CImg<char>::_safe_size+45h (07FF678F08AC5h)  
00007FF678F08ABB  cmp         rax,r11  
00007FF678F08ABE  ja          cimg_library::CImg<char>::_safe_size+45h (07FF678F08AC5h)  
00007FF678F08AC0  mov         r8d,r10d  
00007FF678F08AC3  jmp         cimg_library::CImg<char>::_safe_size+48h (07FF678F08AC8h)  
00007FF678F08AC5  xor         r8d,r8d  
00007FF678F08AC8  mov         edx,r9d  
00007FF678F08ACB  imul        rdx,rax  
00007FF678F08ACF  cmp         r9d,r10d  
00007FF678F08AD2  jbe         cimg_library::CImg<char>::_safe_size+5Eh (07FF678F08ADEh)  
00007FF678F08AD4  cmp         rdx,rax  
00007FF678F08AD7  ja          cimg_library::CImg<char>::_safe_size+5Eh (07FF678F08ADEh)  
00007FF678F08AD9  mov         eax,r10d  
00007FF678F08ADC  jmp         cimg_library::CImg<char>::_safe_size+60h (07FF678F08AE0h)  
00007FF678F08ADE  xor         eax,eax  
00007FF678F08AE0  mov         rcx,rdi  
00007FF678F08AE3  add         r8d,eax  
00007FF678F08AE6  imul        rcx,rdx  
00007FF678F08AEA  cmp         edi,r10d  
00007FF678F08AED  jbe         cimg_library::CImg<char>::_safe_size+74h (07FF678F08AF4h)  
00007FF678F08AEF  cmp         rcx,rdx  
00007FF678F08AF2  jbe         cimg_library::CImg<char>::_safe_size+77h (07FF678F08AF7h)  

As you can see, there is no big difference. Both generate a similar amount and type of instructions.

cleeus commented 3 years ago

@dtschump this is now tracked by RedHat as CVE-2020-25693