google / zopfli

Zopfli Compression Algorithm is a compression library programmed in C to perform very good, but slow, deflate or zlib compression.
Apache License 2.0
3.43k stars 329 forks source link

OptimizeHuffmanForRle #150

Open JPeterMugaas opened 6 years ago

JPeterMugaas commented 6 years ago

When compiling zopfli for Win64 using MINGW-W64, I got a warning about invalid typecasting and a malloc value being out of potential range. I fixed this issue with a patch I'm proving. Please feel free to use it. The thing is that in Win64, malloc takes an 8-byte unsigned value while in Win32, it takes a 4-byte value

. Int is always 4 bytes even in Win64. That effects OptimizeHuffmanForRle mostly.

win32-fixes.patch.txt

lvandeve commented 6 years ago

Thanks for pointing this out. The warning afaik was due to length being a signed integer. Does commit 56c07b9399b3df1a5b430429be8debed55132029 fix it for you?

JPeterMugaas commented 6 years ago

I don't think so. When I compiled for Win32 using GCC, I got this:

42%] Building C object CMakeFiles/zopflilib_obj.dir/src/zopfli/lz77.c.obj D:/msys64/home/jpmugaas/exp/mingw-w64-zopfli/src/zopfli-zopfli-1.0.2/src/zopfli/deflate.c: In function 'OptimizeHuffmanForRle': D:/msys64/home/jpmugaas/exp/mingw-w64-zopfli/src/zopfli-zopfli-1.0.2/src/zopfli/deflate.c:452:16: warning: argument 1 value '4294967292' exceeds maximum object size 2147483647 [-Walloc-size-larger-than=] good_for_rle = (int)malloc((unsigned)length sizeof(int));


In file included from D:/msys64/home/jpmugaas/exp/mingw-w64-zopfli/src/zopfli-zopfli-1.0.2/src/zopfli/lz77.h:28:0,
                 from D:/msys64/home/jpmugaas/exp/mingw-w64-zopfli/src/zopfli-zopfli-1.0.2/src/zopfli/deflate.h:28,
                 from D:/msys64/home/jpmugaas/exp/mingw-w64-zopfli/src/zopfli-zopfli-1.0.2/src/zopfli/deflate.c:20:
D:/msys64/mingw32/i686-w64-mingw32/include/stdlib.h:503:17: note: in a call to allocation function 'malloc' declared here
   void *__cdecl malloc(size_t _Size);
                 ^~~~~~
[ 47%] Building C object CMakeFiles/zopflilib_obj.dir/src/zopfli/util.c.obj
[ 52%] Building C object CMakeFiles/zopflilib_obj.dir/src/zopfli/zlib_container.c.obj
[ 57%] Building C object CMakeFiles/zopflilib_obj.dir/src/zopfli/zopfli_lib.c.obj
[ 57%] Built target zopflilib_obj

I'm rather surprised by this considering I was compiling a version for the 32-bit version of Windows.  My build process builds for both using two separate toolchains.  The thing you have to keep in mind is that in Windows 64 bit, the pointer is 8-bytes wide as well as size_t and ptrdiff_t but the "int" is always 4-bytes wide.
jibsen commented 6 years ago

Isn't the problem rather that in the loop right above length is counted down until it reaches -1. The compiler cannot see if the loop will always break early, so it warns that multiplying -1 with sizeof(int) is unlikely to be meaningful?

Perhaps something like:

  for (; length > 0; --length) {
    if (counts[length - 1] != 0) {
      /* Now counts[0..length - 1] does not have trailing zeros. */
      break;
    }
  }
  if (length <= 0) {
    return;
  }
JPeterMugaas commented 6 years ago

I still have to run a test, but I have to ask this question, why does the procedure have this declaration?

void OptimizeHuffmanForRle(int length, size_t* counts)

The problem in Windows 64-bit is that an int can only accommodate 2GB's of data (2 to 31's power). If you use a size_t for the length parameter, you could get up 2 to the 63rd power as a maximum data size.

JPeterMugaas commented 6 years ago

I would think the correct type might be a ssize_t (signed size_t).

JPeterMugaas commented 6 years ago

Well, anyway, I did test your proposed loop with the compiler and it doesn't give warnings. But I still don't like that procedure because of that concern I mentioned. I'm going to look at something else.

lvandeve commented 6 years ago

I got that warning too at some point, and for me making sure that the malloc gets something unsigned fixed it, it seemed as if the warning came from having int. Though you still seem to get it for (unsigned)length * sizeof(int). Maybe casting the entire argument to (unsigned) helps?

No need to have different integer types for different platforms with #ifdefs imho. We'll find a type that works for everyone :)

The length is always small, it's bounded to the max amount of huffman symbols, so 32 bits are more than enough.

No need to change the signature of the function, that may only cause warnings elsewhere.

JPeterMugaas commented 6 years ago

Please forgive any confusion on my part since I don't know the algorithm. But the reason I was thinking some things needed to be 8 bytes was that I know that an "int" is 8-bytes wide on many 64-bit systems (not 4-bytes like Windows).

lvandeve commented 6 years ago

8 bytes are definitely not needed in this algorithm, a cast to fix the warning is good here.

Does changing good_for_rle = (int)malloc((unsigned)length sizeof(int)); into good_for_rle = (int)malloc((unsigned)(length sizeof(int))); or good_for_rle = (int)malloc((size_t)(length sizeof(int)));

fix the warning in both win64 and win32 for you?

Thanks

JPeterMugaas commented 6 years ago

good_for_rle = (int)malloc((unsigned)(length sizeof(int)));

Gives warning for Win32 but not Win64.

D:/msys64/home/jpmugaas/exp/mingw-w64-zopfli/src/zopfli-zopfli-1.0.2/src/zopfli/deflate.c: In function 'OptimizeHuffmanForRle': D:/msys64/home/jpmugaas/exp/mingw-w64-zopfli/src/zopfli-zopfli-1.0.2/src/zopfli/deflate.c:452:16: warning: argument 1 value '4294967292' exceeds maximum object size 2147483647 [-Walloc-size-larger-than=] good_for_rle = (int)malloc((unsigned)(length sizeof(int)));


In file included from D:/msys64/home/jpmugaas/exp/mingw-w64-zopfli/src/zopfli-zopfli-1.0.2/src/zopfli/lz77.h:28:0,
                 from D:/msys64/home/jpmugaas/exp/mingw-w64-zopfli/src/zopfli-zopfli-1.0.2/src/zopfli/deflate.h:28,
                 from D:/msys64/home/jpmugaas/exp/mingw-w64-zopfli/src/zopfli-zopfli-1.0.2/src/zopfli/deflate.c:20:
D:/msys64/mingw32/i686-w64-mingw32/include/stdlib.h:503:17: note: in a call to allocation function 'malloc' declared here
   void *__cdecl malloc(size_t _Size);
                 ^~~~~~
===
good_for_rle = (int*)malloc((size_t)(length * sizeof(int)));

Gives warning in Win64 and Win32:

zopfli/src/zopfli/deflate.c:452:24: warning: argument 1 range [18446744065119617024, 18446744073709551612] exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
   good_for_rle = (int*)malloc((size_t)(length * sizeof(int)));
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from zopfli/src/zopfli/lz77.h:28,
                 from zopfli/src/zopfli/deflate.h:28,
                 from zopfli/src/zopfli/deflate.c:20:
D:/msys64/mingw64/x86_64-w64-mingw32/include/stdlib.h:503:17: note: in a call to allocation function 'malloc' declared here
   void *__cdecl malloc(size_t _Size);
                 ^~~~~~

Win32 warning:
zopfli/src/zopfli/deflate.c: In function 'OptimizeHuffmanForRle':
zopfli/src/zopfli/deflate.c:452:16: warning: argument 1 value '4294967292' exceeds maximum object size 2147483647 [-Walloc-size-larger-than=]
   good_for_rle = (int*)malloc((size_t)(length * sizeof(int)));
   ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from zopfli/src/zopfli/lz77.h:28:0,
                 from zopfli/src/zopfli/deflate.h:28,
                 from zopfli/src/zopfli/deflate.c:20:
D:/msys64/mingw32/i686-w64-mingw32/include/stdlib.h:503:17: note: in a call to allocation function 'malloc' declared here
   void *__cdecl malloc(size_t _Size);

At this point, I'm thinking that maybe the problem isn't with the unsigned parameter to malloc but what malloc may be returning which is a pointer.

Here's the definition for Microsoft's website:

void *malloc(  
   size_t size   
);  
jibsen commented 5 years ago

I got a ping from a comment on the related commit, which made me take another look at this. I realized OptimizeHuffmanForRle is only ever called with fixed values as length of either ZOPFLI_NUM_LL (288) or ZOPFLI_NUM_D (32). So we could use a fixed size array like in TryOptimizeHuffmanForRle to avoid the malloc in the first place.

I have a branch with the change here. Happy to make a PR if there is any interest, but it's only 3 lines changed.