drolbr / Overpass-API

A database engine to query the OpenStreetMap data.
http://overpass-api.de
GNU Affero General Public License v3.0

LZ4 compression: smaller threshold value for switching to uncompressed blocks #636

Open mmd-osm opened 2 years ago

mmd-osm commented 2 years ago

LZ4_Deflate::compress currently switches to storing data as an uncompressed block if the compressed data amounts to roughly 2 times the input block size.

However, it would make more sense to lower this limit so that data is stored as an uncompressed block whenever the lz4-compressed output is larger than the input.

Rationale: compression should have a net positive effect, so we should skip it when it brings no real benefit.

https://github.com/mmd-osm/Overpass-API/blob/test759/src/template_db/lz4_wrapper.cc#L53-L58
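To make the proposed rule concrete, here is a minimal sketch of the decision only. It is not the code from lz4_wrapper.cc: the function name choose_size_header and its parameters are hypothetical, and treating a non-positive compressed size as a compressor failure is an assumption. The return value follows the block convention of a signed 32-bit size that is negative for uncompressed data.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper illustrating the proposed threshold; not the actual
// LZ4_Deflate::compress implementation.
//   input_size:      size of the raw block in bytes
//   compressed_size: size reported by the lz4 compressor
//                    (assumption: <= 0 means compression failed)
// Returns the signed 32-bit size header: positive for lz4 data,
// negative for data stored without compression.
inline int32_t choose_size_header(int32_t input_size, int32_t compressed_size)
{
  assert(input_size > 0);

  // Proposed rule: fall back to the raw block as soon as the compressed
  // output is larger than the input (instead of roughly twice as large).
  if (compressed_size <= 0 || compressed_size > input_size)
    return -input_size;

  return compressed_size;
}
```

With this rule, a 1000-byte block that lz4 turns into 1004 bytes would be stored raw with a header of -1000, while a 400-byte result would be kept with a header of 400.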

drolbr commented 2 years ago

Thank you for the hint. I'm not sure how this works. The negative size is intended to mark an uncompressed block?

mmd-osm commented 2 years ago

Yes, exactly. Each block starts with a 32-bit signed int value that defines the size of the subsequent (lz4 or uncompressed) data.

By definition, if this 32-bit value is negative, the original data was stored without compression. In that case, the value needs to be multiplied by -1 to determine the length of the uncompressed data.
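As an illustration of the header convention described above, a reader could interpret the leading value roughly as follows. This is a sketch under assumptions: BlockInfo and read_block_header are hypothetical names, and the header is assumed to be stored in host byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical result type: how the payload after the header is stored.
struct BlockInfo
{
  bool compressed;        // true: lz4 data, false: raw data
  uint32_t payload_size;  // number of payload bytes following the header
};

// Hypothetical helper: interprets the 32-bit signed size at the start of a block.
// Assumption: the value is stored in host byte order.
inline BlockInfo read_block_header(const unsigned char* block)
{
  assert(block != nullptr);

  int32_t size = 0;
  std::memcpy(&size, block, sizeof(size));

  if (size < 0)
  {
    // Negative value: data was stored without compression; negate to get the length.
    // The negation is done in 64 bits so that INT32_MIN cannot overflow.
    return { false, static_cast<uint32_t>(-static_cast<int64_t>(size)) };
  }

  // Non-negative value: length of the lz4-compressed data.
  return { true, static_cast<uint32_t>(size) };
}
```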