Cyan4973 / FiniteStateEntropy

New generation entropy codecs : Finite State Entropy and Huff0
BSD 2-Clause "Simplified" License
1.33k stars 143 forks source link

worst case compression wrongly evaluated #31

Closed Cyan4973 closed 10 years ago

Cyan4973 commented 10 years ago

From Artem Drach : https://groups.google.com/forum/#!topic/lz4c/SnFzpmyzXKU

chadaustin commented 10 years ago

Does this bug mean that FSE is not ready for use in a production environment? @Cyan4973 Do you think that 512 bytes should always be enough for the encoded bit stream, and if not, what it should be?

Cyan4973 commented 10 years ago

It's a bit early to call FSE "production ready", although that's the plan, and we are slowly getting there. My intention is to move away from "beta status" by the end of the year.

As a sidenote, FSE is getting some good testing through its usage within Zhuff, with several million files correctly compressed/decompressed so far. (see : https://groups.google.com/forum/#!topic/lz4c/OmA_uSnaO3Q)

Do you think that 512 bytes should always be enough for the encoded bit stream, and if not, what it should be?

If you mean "enough for the header", yes that's what I initially thought. Unfortunately, Artem tests prove that there is some kind of expansion I've not correctly estimated. That's something I intent to sort out and correct within the month.

As a sidenote, the issue is only visible when compressing "very" large blocks (such as quite a few megabytes), which is not recommended : in most circumstances, it's much better to compress smaller blocks (such as, typically, 64KB), to ensure proper entropy local adaptation, hence better overall compression ratio.

chadaustin commented 10 years ago

Thank you, Yann, for your prompt response! This sounds great. Depending on how my compression experiments go this week, we may end up using FSE in our compressed mesh file format. Our largest blocks are 256 KiB. I'll let you know if we encounter any problems!

Cyan4973 commented 10 years ago

Latest beta solves this issue. https://github.com/Cyan4973/FiniteStateEntropy/commit/16fc2e303749311c9c6c3103233704dce180bd2c

Max inflation has been evaluated at 0.45%, plus header. FSE_compressBound() is a bit of an over-evaluation currently, but it needs to be simple and fast, and it is safer for this function to over-evaluate a bit, rather than under-evaluate as it did up to now; resulting in memory errors.

chadaustin commented 10 years ago

Thank you!

Cyan4973 commented 10 years ago

Committed to master