ConorStokes / LZSSE

LZ77/LZSS designed for SSE based decompression
BSD 2-Clause "Simplified" License

Fixed compilation for Gcc 4.8.3 (MingW). #2

Closed TurtleSimos closed 8 years ago

TurtleSimos commented 8 years ago

Changed the include order so size_t is defined before the headers that need it. Stopped initializing pointers at their declaration: the initialization was skipped by the goto, giving an error.
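For readers unfamiliar with the goto issue, here is a minimal sketch of the pattern. It is not the LZSSE code: `copy_bytes`, its parameters and the `done` label are hypothetical names chosen for illustration. In C++ a jump may not bypass a declaration that has an initializer, but it may bypass a scalar declared without one, so splitting declaration and assignment is enough.

```cpp
#include <cstddef>

// Hypothetical helper, not part of LZSSE: copies inLength bytes and
// returns the number of bytes written, or 0 if the output is too small.
static std::size_t copy_bytes(const unsigned char* in, std::size_t inLength,
                              unsigned char* out, std::size_t outLength)
{
    std::size_t written = 0;

    if (inLength > outLength)
        goto done;

    // Writing "const unsigned char* inputEnd = in + inLength;" here would
    // make the goto above jump past an initialization, which is ill-formed
    // in C++. Declaring first and assigning afterwards avoids that.
    const unsigned char* inputEnd;
    inputEnd = in + inLength;

    while (in < inputEnd)
    {
        *out++ = *in++;
        ++written;
    }

done:
    return written;
}
```

Moving the declaration above the first goto would work equally well; which of the two the pull request uses in each place is not shown here.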

ConorStokes commented 8 years ago

Looks good, although for other GCC variants than MingW we'll need a different header for the intrinsics and __builtin_ctz instead of _BitScanForward.
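As a rough sketch of what that could look like, the wrapper below picks between the two bit-scan intrinsics at compile time. The name `lowest_set_bit` and the exact preprocessor test are assumptions for illustration, not what LZSSE ships; the SSE intrinsics header would need a similar switch, which is left out here.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && !defined(__clang__)
#include <intrin.h>   // declares _BitScanForward for Visual Studio
#endif

// Hypothetical wrapper: returns the index of the lowest set bit.
// The result is undefined for value == 0 with either intrinsic.
static inline unsigned int lowest_set_bit(std::uint32_t value)
{
#if defined(_MSC_VER) && !defined(__clang__)
    unsigned long index;
    _BitScanForward(&index, value);
    return static_cast<unsigned int>(index);
#else
    // GCC and Clang builtin: counts trailing zero bits.
    return static_cast<unsigned int>(__builtin_ctz(value));
#endif
}
```

For example, `lowest_set_bit(8u)` gives 3 on either path, since bit 3 is the lowest bit set in 8.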

ConorStokes commented 8 years ago

There is a version at https://github.com/powturbo/TurboBench/blob/master/LZSSE_/ that works on Linux GCC and MingW, but is broken on Visual Studio.

Interestingly, using that benchmark there is a bit of a speed drop off vs the Visual Studio version that will probably require some looking over the assembly listings.

TurtleSimos commented 8 years ago

I wanted to avoid introducing compiler platform detection, as it probably would be better isolating that in one new header file for all 3 compressors to share. Whether to use one, or duplicate is more a style choice I'd leave to the owner of the code.

I'm using MingW to build lzbench as I hit the same issues you mentioned with code not being compatible with Visual Studio. I wanted to compare my own compressor with as wide a range as possible - but I also see slower decompression under GCC than Clang or Visual Studio (whether normal Visual Studio or with the Clang/C2 frontend).

ConorStokes commented 8 years ago

If you want I can arrange a zip of my Visual Studio'd lzbench for you tomorrow at some stage.

Thanks for your changes :).

TurtleSimos commented 8 years ago

A zip of Visual Studio'd lzbench would be very handy, thanks. I created a platform header showing what I meant - you should have a pull request for lzsse2. The same approach would work for all of them - that version compiles under Visual Studio, Clang/C2, Clang/LLVM and GCC (MinGW or not).

ConorStokes commented 8 years ago

https://drive.google.com/file/d/0B9KN1m7I4sAFZXRYV3dlTU5qUUU/view?usp=sharing - Should have the zip, let me know when you've got it and I'll take it down.

TurtleSimos commented 8 years ago

Downloaded and built fine - thank you!

TurtleSimos commented 8 years ago

Just for interest, I verified the decompression speed difference.

Visual Studio:

Compressor             Compression   Decompression   Compressed size   Ratio
lzsse2 0.1 level 16    10 MB/s       2391 MB/s       56659916          54.04
niblz 0.1 level 7      0.49 MB/s     1249 MB/s       44419947          42.36
zstd v0.4.1 level 20   2.34 MB/s     459 MB/s        41880158          39.94
lzma 9.38 level 8      2.98 MB/s     65 MB/s         35890919          34.23

Gcc:

Compressor             Compression   Decompression   Compressed size   Ratio
lzsse2 0.1 level 16    10 MB/s       2150 MB/s       56659916          54.04
niblz 0.1 level 7      0.53 MB/s     1208 MB/s       44419947          42.36
zstd v0.5.1 level 20   3.81 MB/s     662 MB/s        39777107          37.93
lzma 9.38 level 8      2.97 MB/s     60 MB/s         35890919          34.23

The version of zstd is different, but the changes there are in the improved match finder for the compressor. Niblz is my own WIP. Lzsse2 certainly loses more under Gcc, but it is interesting how zstd favours gcc over Visual Studio.

ConorStokes commented 8 years ago

Those results match pretty much what I saw. I need to investigate the generated code, but I have a feeling Gcc's register allocation isn't performing as well.

Will be very interested to see niblz! That compression ratio vs decompression speed looks great.