openpreserve / jpylyzer

JP2 (JPEG 2000 Part 1) validator and properties extractor. Jpylyzer was specifically created to check that a JP2 file really conforms to the format's specifications. Additionally, jpylyzer is able to extract a file's technical characteristics.
http://jpylyzer.openpreservation.org/

Feature request: pixel data analytics #53

Closed (boxerab closed this issue 10 years ago)

boxerab commented 10 years ago

1) PSNR for lossy compression with various compression settings
2) Lossless: compare the round-trip (compress-decompress) image with the original image, to verify losslessness

bitsgalore commented 10 years ago

You could already do all of the above with ImageMagick's compare tool:

http://www.imagemagick.org/script/compare.php

And check the different metrics here:

http://www.imagemagick.org/script/command-line-options.php#metric
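
For example, to get the PSNR between an original image and its decoded copy, something like this should work (file names here are hypothetical; null: discards the difference image that compare would otherwise write):

compare -metric PSNR original.tif decoded.tif null: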

I know ImageMagick's JPEG 2000 support used to be problematic because of its dependency on the buggy Jasper library, but they're now using OpenJPEG instead, so I expect this has improved by now (I haven't tried it, though).

In any case, this functionality requires decoding of the image data, which is completely out of scope for jpylyzer. So I don't see this happening.

boxerab commented 10 years ago

Thanks. As I understand it, jpylyzer is intended to help jp2k consumers validate vendor codecs. If so, how do consumers currently validate pixel data?

I understand this is out of scope for jpylyzer, but it would be useful to have a codec-agnostic tool that compares image quality between various codecs.

bitsgalore commented 10 years ago

The problem is that in order to do any analysis of pixel-level data you'll need to decompress the image data, and for this you need a codec. So there's simply no codec-agnostic way to do this (unless you developed a brand new codec specifically for this purpose, but that would be an enormous task).

So for things like PSNR and losslessness I would use existing tools like ImageMagick / GraphicsMagick, since this is a job these tools do well (although yes, there will be a codec dependency). This is something you can use on top of jpylyzer; see e.g. the link below for how we did this for a valuable collection of old TIFF images that we migrated to JP2:

http://wiki.opf-labs.org/pages/viewpage.action?pageId=36012209
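
As a rough sketch, such a workflow could combine the two tools along these lines (file names are hypothetical; jpylyzer writes its validation report as XML to stdout, and gm compare does the pixel-level check on the decoded copy):

jpylyzer image.jp2 > image-report.xml
gm compare -metric MAE source.tif fromjp2.tif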

However, some recent work by the British Library suggests that not all codecs decode JPEG 2000 images in exactly the same way, which is something that may influence results. See:

http://www.scape-project.eu/wp-content/uploads/2013/11/iPres2013_Palmer_JPEG2000Codecs.pdf

Note that their analysis only covers lossy compression; in our Metamorfoze work (which uses lossless compression) I've never encountered any of these issues. There we simply compare the pixel values before and after the migration, and count the number of pixels that aren't identical (this must be 0 for lossless compression, and that is exactly what we got in all cases).
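
With ImageMagick, for instance, that check could look something like this (hypothetical file names; the AE metric reports the absolute number of differing pixels, so a lossless round trip must report 0):

compare -metric AE source.tif fromjp2.tif null: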

boxerab commented 10 years ago

Thanks. Very interesting. I see how complex the situation is.

For lossless, I have the following concern: encoding with one codec and decoding with another may give a result that differs from the original. Have you had a chance to investigate this scenario?

I have been working a fair bit with the OpenJPEG code, and here is an example that is concerning. For wavelet compression, the standard mandates how a codec should deal with pixels at the image boundary: mirroring should be used, but OpenJPEG uses clamping, where pixels outside the boundary are clamped to the boundary values.

So, if one codec uses mirroring and another uses clamping, then I think one would get a lossy encode-decode round trip for certain images.
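
To illustrate the difference on a hypothetical four-sample row a b c d: symmetric (mirrored) extension reflects around the boundary sample without repeating it, giving ... c b | a b c d | c b ..., while clamping repeats the boundary value, giving ... a a | a b c d | d d .... Wavelet filter taps that reach past the boundary therefore see different neighbour values under the two schemes, so coefficients near the image edges (and hence the reconstructed pixels there) can differ.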

bitsgalore commented 10 years ago

Yes, we did this. For the Metamorfoze migration we used the Aware codec to compress the source TIFFs to lossless JP2. Then we converted those JP2s back to temporary TIFFs using Kakadu's kdu_expand tool. Finally we did a pixel-level comparison between each source TIFF and its corresponding temporary TIFF (the latter being the result of a full compress-decompress cycle). We did this with GraphicsMagick using the following command line (off the top of my head):

gm compare -metric MAE source.tif fromjp2.tif

Result if pixels are identical:

Image Difference (MeanAbsoluteError):
       Normalized    Absolute
      ============  ==========
  Red: 0.0000000000        0.0
Green: 0.0000000000        0.0
 Blue: 0.0000000000        0.0
Total: 0.0000000000        0.0
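
(For completeness, the decompression step that precedes this comparison would be something along the lines of the following; the file names are hypothetical:)

kdu_expand -i image.jp2 -o fromjp2.tif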

Metrics explained here:

http://www.imagemagick.org/script/command-line-options.php#metric

Having a look at this again, PAE might be a better metric. We originally used ImageMagick with AE, but then we switched to GraphicsMagick, which doesn't support that metric. Just do a few tests yourself to check what works best in your case.
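
For example (same hypothetical file names as above; PAE is the peak absolute error, which should likewise be 0 for a lossless round trip):

gm compare -metric PAE source.tif fromjp2.tif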

boxerab commented 10 years ago

Thanks. I will try this. Did you investigate the open source codecs, such as OpenJPEG and Jasper? It looks like Jasper is no longer being developed, while OpenJPEG is set to become a reference implementation of the standard and has seen a lot of activity lately. The big issue with OpenJPEG, of course, is that it is very slow.

bitsgalore commented 10 years ago

Not much experience with those codecs. Jasper isn't actively developed and has various issues, so I would really recommend staying away from it. The OpenJPEG situation seems to have improved a lot over the past few years; it's just that we haven't really used it for production work.

boxerab commented 10 years ago

Great, thanks so much for your help.

boxerab commented 10 years ago

By the way, I am actually developing a new JPEG 2000 codec. It will run on the GPU, and it's going to be very fast. It will also be open source. It should be in alpha by spring 2015.