Esri / lerc

Limited Error Raster Compression
https://github.com/esri/lerc
Apache License 2.0

A new sample in LercTest #12

Closed andr7430 closed 8 years ago

andr7430 commented 8 years ago

It would be helpful to have a sample in LercTest that demonstrates encoding and decoding an image on disk. This would help users get a better grasp of MaxZError and its direct effect on file size and decoded image quality.

The existing 2 samples are great in that they give users an idea of how fast Lerc is, but they leave me wanting more of a real-life example.

jgravois commented 8 years ago

good idea @andr7430! would you be willing to submit a PR so we can discuss more?

@tmaurer3 I'm sure you have some thoughts on this. i'm looking forward to hearing them.

andr7430 commented 8 years ago

Of course. I am messing around with it now to see what I can do.

tmaurer3 commented 8 years ago

If you want the compressed Lerc byte array on disk, you simply write it there. You can use fwrite or ostream::write; see:

http://www.cplusplus.com/reference/cstdio/fwrite/

http://www.cplusplus.com/reference/ostream/ostream/write/

Then you read it back in and decode it again. The size on disk is the same as in memory: the size of the byte array.
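A minimal sketch of that round trip, assuming the encoded Lerc blob is already held in a std::vector<unsigned char>. The helper names writeBlob and readBlob are hypothetical and not part of the Lerc API; the Lerc encode and decode calls themselves are left out.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: write an already-encoded byte array to disk unchanged.
void writeBlob(const std::string& path, const std::vector<unsigned char>& blob)
{
  assert(!path.empty());
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  if (!out)
    throw std::runtime_error("cannot open " + path + " for writing");

  out.write(reinterpret_cast<const char*>(blob.data()),
            static_cast<std::streamsize>(blob.size()));
  if (!out)
    throw std::runtime_error("write failed for " + path);
}

// Hypothetical helper: read the whole file back into memory, ready to be
// handed to a decoder. Binary mode keeps the bytes exactly as written.
std::vector<unsigned char> readBlob(const std::string& path)
{
  assert(!path.empty());
  std::ifstream in(path, std::ios::binary);
  if (!in)
    throw std::runtime_error("cannot open " + path + " for reading");

  return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                    std::istreambuf_iterator<char>());
}
```

The file written this way contains exactly blob.size() bytes, so the size on disk matches the size of the byte array in memory.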

andr7430 commented 8 years ago

Hey @tmaurer3, so I am trying to encode a .tiff to a .lerc2 and then decode it back to a .tiff again. I am brand new to image compression, so I hope you don't mind the rudimentary questions:

If I want to encode a .tiff, how should I take care of the header information for the tiff? Should I try to encode it too, or just keep it as is?

tmaurer3 commented 8 years ago

Tiff is a container format. It can have many tiles to cover large areas. The tiles can be compressed in different formats, such as png, jpeg, LZW, or (theoretically) Lerc. So Lerc is an image compression format for a single image tile, like png or jpeg. Tiff, however, can contain tiling, pyramids, spatial reference, and lots of other information.

So you wouldn’t convert a tiff to Lerc. You can convert a png or jpeg to Lerc, but that is not really the purpose of Lerc. The main purpose of Lerc is to provide a compression format for data that you cannot easily compress into png or jpeg, because your data has a higher bit depth, or you need precise control over the compression error, etc.

If you want to convert, let’s say, png to Lerc, you need a png reader or decoder. Decode the png image to the raw image arrays. Then you can Lerc encode those using this Lerc API.

jgravois commented 8 years ago

@tmaurer3 forgive my naïveté in this area, but are there any ubiquitous uncompressed raster formats that would make sense to convert from? perhaps bitmap?

tmaurer3 commented 8 years ago

You don’t need to look for other formats. Very often you have an uncompressed image, such as a floating point elevation tile, and wonder “how should I compress this?” You cannot use jpeg, especially if you don’t want arbitrarily large errors per pixel. The next option is lossless compression (deflate, zip, LZW), which does not give much compression because you are still trying to compress all the noise contained in the data. That is why we developed Lerc.
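To make the idea of a bounded per-pixel error concrete, here is a simplified sketch of uniform quantization with a step of 2 * maxZError. This is an illustration only, not the Lerc implementation; the QuantizedTile struct and the quantize and dequantize functions are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for a quantized tile (not a Lerc type).
struct QuantizedTile
{
  double zMin;                   // smallest input value, used as the offset
  double step;                   // quantization step, 2 * maxZError
  std::vector<std::uint32_t> q;  // one non-negative integer per pixel
};

// Map each value to the nearest multiple of the step above zMin.
// Rounding to the nearest multiple keeps the error within step / 2.
QuantizedTile quantize(const std::vector<float>& z, double maxZError)
{
  assert(maxZError > 0 && !z.empty());
  QuantizedTile t;
  t.zMin = *std::min_element(z.begin(), z.end());
  t.step = 2 * maxZError;
  t.q.reserve(z.size());
  for (float v : z)
    t.q.push_back(static_cast<std::uint32_t>(std::llround((v - t.zMin) / t.step)));
  return t;
}

// Reconstruct approximate values; each differs from the input by at most
// maxZError, up to floating point rounding.
std::vector<double> dequantize(const QuantizedTile& t)
{
  std::vector<double> out;
  out.reserve(t.q.size());
  for (std::uint32_t qi : t.q)
    out.push_back(t.zMin + qi * t.step);
  return out;
}
```

In this sketch a larger maxZError gives smaller integers, which need fewer bits to store, while a small maxZError keeps more of the low-order detail (including noise) and compresses less.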

jgravois commented 8 years ago

“an uncompressed image, such as a floating point elevation tile.”

this is what @andr7430 and i are suggesting be included in the repo with a test case so that developers can compare an input file to an output .lerc file written to disk.

tmaurer3 commented 8 years ago

John,

Disk IO is old school. Today’s computations happen in the cloud. Some 2D raster tile containing elevation or any other scientific 2D raster data gets computed on one machine. It gets Lerc compressed on the fly (probably using a high precision setting), transmitted to another cluster node, decompressed again, and used there. Or it gets Lerc compressed using lower precision, transmitted to a client for display, then decompressed directly in the browser and displayed. There is no file or disk IO anywhere.

The test program’s first sample shows this. The test program also has code to read in a Lerc file from disk. You first get the file size, either through the OS or using file seek as shown here. You read in the file using its real size, because the file could be truncated, meaning you should not trust the size written in the header at this point. Once read in, you call the Lerc functions working in memory as shown. They do size checks and should fail gracefully if the file is not a valid Lerc file or got truncated.
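A minimal sketch of the "get the real size by seeking, then read" step, using the C stdio functions. The function name readFileBySize is hypothetical; this is not the code from the test program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: read a whole file using its real size on disk,
// determined by seeking to the end, rather than any size stored in a header.
std::vector<unsigned char> readFileBySize(const std::string& path)
{
  assert(!path.empty());
  std::FILE* fp = std::fopen(path.c_str(), "rb");
  if (!fp)
    throw std::runtime_error("cannot open " + path);

  std::vector<unsigned char> bytes;

  // Seek to the end and ask for the position: that is the file size.
  bool ok = std::fseek(fp, 0, SEEK_END) == 0;
  const long size = ok ? std::ftell(fp) : -1L;
  ok = ok && size >= 0 && std::fseek(fp, 0, SEEK_SET) == 0;

  if (ok)
  {
    bytes.resize(static_cast<std::size_t>(size));
    // Read exactly as many bytes as the file really has.
    ok = bytes.empty() ||
         std::fread(bytes.data(), 1, bytes.size(), fp) == bytes.size();
  }

  std::fclose(fp);
  if (!ok)
    throw std::runtime_error("failed to read " + path);
  return bytes;
}
```

The returned buffer and its size() can then be passed to the in-memory Lerc functions, which do their own validity checks on the contents.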