cantaloupe-project / cantaloupe

High-performance dynamic image server in Java
https://cantaloupe-project.github.io/
Other
266 stars 107 forks source link

Java2dProcessor performance on tiled pyramidal (jpeg) TIFFs #296

Closed jbarth-ubhd closed 3 years ago

jbarth-ubhd commented 5 years ago

Dear reader, we have a tiled, pyramidal internally jpeg compressed tif of 9557×36619 px size.

When requesting region of x=2671 y=15179 w=6170 h=3471 at pct:31, cantaloupe is correctly getting the 50%-size image within the tiff, getting a 3085×1735 size image which has now to be scaled (bicubic) to 1913×1076px to match the »pct:31« above.

This takes 2.5 seconds according to log on an Intel Core i7-4790 CPU @ 3.60GHz. Imagemagick requires about 0.1 s for this (Q16, no hdri), and this snippet in java does take 0.21 s for downsampling:

    BufferedImage before=null;
    try { before=ImageIO.read(new File("test.bmp")); // GIF, PNG, JPEG, BMP, and WBMP
    } catch(IOException e) {
      System.out.println(e);
    }
    double fx=1913./3085;
    double fy=1076./1735;

    AffineTransform at = new AffineTransform();
    at.scale(fx, fy);
    AffineTransformOp scaleOp = new AffineTransformOp(at, AffineTransformOp.TYPE_BICUBIC);
    long start=System.nanoTime();
    BufferedImage after = scaleOp.filter(before, null);
    long stop=System.nanoTime();
    System.out.println((stop-start)*1e-9 + " s");
    try {
      System.out.println(after.getWidth());
      File outfile=new File("saved.png"); // JPEG, PNG, GIF, BMP and WBMP
      ImageIO.write(after, "png", outfile);
    } catch(IOException e) {
      System.out.println(e);
    }

What could be done to speed up Java2dProcessor?

ImageMagickProcessor takes about 97 s, because identify reads in the whole image, GraphicsMagickProcessor takes about 4 s... and JaiProcessor does use NearestNeighbour because of a Jai bug...

Kind regards, Jochen

adolski commented 5 years ago

Hi @jbarth-ubhd,

Can you share the TIFF file you are using?

jbarth-ubhd commented 5 years ago

vips-64.tif

generated with vips im_vips2tiff ~/Tafel_24.tif vips-64.tif:jpeg:80,tile:64x64,pyramid using vips-8.4.5-Wed Jan 18 12:36:44 UTC 2017

Link to our presentation: Piranesi1770 Trofeo... Tafel_24

I've chosen 64×64px wide tiles because

We recommend choosing TileWidth and TileLength such that the resulting tiles are about 4K to 32K bytes before compression. This seems to be a reasonable value for most applications and compression schemes

adolski commented 5 years ago

Thanks, I've got it and will try to take a look soon.

kaij commented 4 years ago

We see the same problem with pyramidal tiffs. Further analysis indicates, that this seems not directly related to the performance of scaling.

Using source JPG with TurboJpegProcessor gives the following output (bicubic scaling takes 273ms, overall processing <500ms):

DEBUG e.i.l.c.r.ImageRepresentation - Derivative cache not available; writing directly to the response
DEBUG e.i.l.c.p.Java2DUtil - scale(): scaled 2390x3466 image to 2000x2900 using a BiCubic filter in 273 msec
DEBUG e.i.l.c.r.ImageRepresentation - TurboJpegProcessor processed in 369 msec: 002_2000_000_0005.jpg_1:1_scale:2000,,bicubic_encode:jpg_JPEG_90_#FFFFFF_8

Using a pyramidal TIFF (converted from the jpeg) results in this (bicubic scaling takes >500ms, overall processing >1s):

DEBUG e.i.l.c.r.ImageRepresentation - Derivative cache not available; writing directly to the response
DEBUG e.i.l.c.p.c.TIFFImageReader - Acquiring region 0,0/2390x3466 from 2390x3466 image (256x256 tile size)
DEBUG e.i.l.c.p.Java2DUtil - scale(): scaled 2390x3466 image to 2000x2900 using a BiCubic filter in 552 msec
DEBUG e.i.l.c.p.c.JPEGImageWriter - ImageIO plugin preferences: com.sun.imageio.plugins.jpeg.JPEGImageWriter
DEBUG e.i.l.c.p.c.JPEGImageWriter - Using com.sun.imageio.plugins.jpeg.JPEGImageWriter
DEBUG e.i.l.c.p.c.JPEGImageWriter - Quality: 90; progressive: false
DEBUG e.i.l.c.r.ImageRepresentation - Java2dProcessor processed in 1013 msec: 002_2000_000_0005.tif_1:1_scale:2000,,bicubic_encode:jpg_JPEG_90_#FFFFFF_8

I then compared performance of the same pyramidal tiff to IIPImage Server (using bilinear "interpolated" scaling). Which results in overall processing of <500ms (Factor 2!)

It seems that without JAI, cantaloupe can't handle pyramidal tiffs anymore in an efficient way. I'm using a dockerized image based on OpenJDK-11 and Hotspot.

Any ideas on this?

adolski commented 4 years ago

I looked into this briefly a long time ago and my recollection is that it had something to do with the way the GeoSolutions TIFF Image I/O plugin creates BufferedImages, which ties into the idiosyncrasies of java.awt.image.BufferedImage, in particular its "types" and their suitability for Graphics2D processing. Basically the plugin tries to produce a BufferedImage with a ColorModel and SampleModel that are faithful to the source TIFF, but in many/most cases, this results in an image of "custom" type which Graphics2D is not optimized or able to process, requiring costly internal conversion.

JAI doesn't suffer from this problem because it uses a different processing engine.

A solution might involve either forking the plugin or moving to a different TIFF reading technique, but both of these will require time that I don't have currently.

kaij commented 4 years ago

Thanks @adolski for this very helpful insight! Now it makes sense to me that the scaling takes longer... I will have a look at it, but don't know if I have time for it either.

I hope pyramidal tiffs can soon be replaced with HTJ2K now that Grok supports it fully (see https://github.com/GrokImageCompression/grok/releases - I wonder whether I can maybe just use Cantaloupe's OpenJPEG integration with grk_decompress?)

kaij commented 4 years ago

Actually went ahead and tried the just released version 4.1 of Grok (custom compile, symlinked opj_compress, commented out the "quiet" flag in the OpenJpegProcessor.java in Cantaloupe).

Looks very promising on first impression - however I only compared on JPG sources for the moment (lossy HT "grk_compress -q 40 -M 32 -i 001_1974_000_0831.jpg -o 001_1974_000_0831.jp2"). Similar performance to TurboJpeg when using region display, a bit slower when displaying the full image. Decoding speed seems to depend heavily on compression parameters.

Even though it's writing out the intermediate bmp, it's still much faster than using pyramidal tiff (with scaling the problem described above). So maybe it's a longer-term option to integrate grok :-)

DEBUG e.i.l.c.p.OpenJpegProcessor - Invoking /usr/bin/opj_decompress -i /var/lib/cantaloupe/images/001_1974_000_0831.jp2 -o /tmp/OpenJpegProcessor-f6cd0d7b-96a1-4b57-be64-375addc4c133.bmp
DEBUG e.i.l.c.p.c.b.BMPImageReader - Using com.sun.imageio.plugins.bmp.BMPImageReader
DEBUG e.i.l.c.p.Java2DUtil - scale(): scaled 2735x3845 image to 1400x1968 using a BiCubic filter in 120 msec
DEBUG e.i.l.c.p.c.ImageWriterFacade - Writing with edu.illinois.library.cantaloupe.processor.codec.jpeg.TurboJPEGImageWriter
DEBUG e.i.l.c.r.ImageRepresentation - OpenJpegProcessor processed in 517 msec: 001_1974_000_0831.jp2_1:1_scalebypixels:1400,,bicubic_encode:jpg_JPEG_90_#FFFFFF_8_e95198c48292fa4c8fc0617e850c1f6d
adolski commented 4 years ago

@kaij I'm not very familiar with Grok, but if it works, then great.

Longer-term I would like to call directly into the OpenJPEG library rather than relying on opj_decompress, but if Grok is maintaining API compatibility with OpenJPEG, then there is a good chance that will work too.

jbarth-ubhd commented 3 years ago

Still the same problem (cantaloupe 4.1.5): We'll have an image, size 26315×19908 px, pyramidal tiff:

root@serv7:/srv/assets/assets/orig2/0/605000/605909# tiffinfo 1392039ec03494ade4b39060c02d64ecd1761951|grep -i width
  Image Width: 26315 Image Length: 19908
  Tile Width: 96 Tile Length: 96
  Image Width: 13157 Image Length: 9954
  Tile Width: 96 Tile Length: 96
  Image Width: 6578 Image Length: 4977
  Tile Width: 96 Tile Length: 96
  Image Width: 3289 Image Length: 2488
  Tile Width: 96 Tile Length: 96
  Image Width: 1644 Image Length: 1244
  Tile Width: 96 Tile Length: 96
  Image Width: 822 Image Length: 622
  Tile Width: 96 Tile Length: 96
  Image Width: 411 Image Length: 311
  Tile Width: 96 Tile Length: 96
  Image Width: 205 Image Length: 155
  Tile Width: 96 Tile Length: 96
  Image Width: 102 Image Length: 77
  Tile Width: 96 Tile Length: 96
  Image Width: 51 Image Length: 38
  Tile Width: 96 Tile Length: 96

Contains 1/1 … 1/512 image size. Now the requests (abbreviated + reformatted) from Firefox (Mirador viewer) with timings and image size fraction:

…/    0,    0,26315,19908/412,/… 2.0s 1/63.871359
…/    0,    0,24576,19908/768,/… 2.0s 1/32.000000
…/24576,    0, 1739,19908/ 55,/… 2.4s 1/31.618182
…/12288,    0,12288,12288/768,/… 2.7s 1/16.000000
…/    0,    0,12288,12288/768,/… 2.7s 1/16.000000
…/12288,12288,12288, 7620/768,/… 2.5s 1/16.000000
…/    0,12288,12288, 7620/768,/… 2.2s 1/16.000000
…/24576,    0, 1739,12288/109,/… 1.9s 1/15.954128
…/24576,12288, 1739, 7620/109,/… 1.7s 1/15.954128
…/12288, 6144, 6144, 6144/768,/… 2.2s 1/8.000000
…/ 6144, 6144, 6144, 6144/768,/… 2.3s 1/8.000000
…/12288,12288, 6144, 6144/768,/… 2.7s 1/8.000000
…/ 6144,12288, 6144, 6144/768,/… 2.5s 1/8.000000
…/12288,    0, 6144, 6144/768,/… 2.6s 1/8.000000
…/ 6144,    0, 6144, 6144/768,/… 2.5s 1/8.000000
… … …

Multiple seconds, even at 1/1.000000 zoom.

PS: changed to "wait" timings.

DiegoPino commented 3 years ago

@jbarth-ubhd can you share how much memory you have allocated for Cantaloupe? What caching is in place? Running this via jetty or in Tomcat? Which Java? Also version 4.1.5 is bit behind so I would compare at least with 4.1.7 if you are not ready for 5.0 (new configuration options/settings needed).

There can be many factors in place too, from a slow Filesystem to many other concurrent connections. I would love to test your source image with other processors in one our system to compare.

I'm not sure why (really not) but we had some TIFF speed issues (but not pyramidal ones) with Mirador v/s Open Sea dragon. It may be related to the tile sizes requested but since I have not explanation yet this is more like a comment without background.

jbarth-ubhd commented 3 years ago
# Ubuntu 18.04
/usr/lib/jvm/java-1.11.0-openjdk-amd64/bin/java -Dcantaloupe.config=/usr/local/diglit/iiif/cantaloupe/config/cantaloupe⸗
.properties -Xmx2g -jar /usr/local/cantaloupe/cantaloupe-4.1.8.war

Caching should not be the problem because requested zoom level exists in pyramidal tiff. Jetty. 4.1.8: the same timings.

Did enable "debug" logging:

… e.i.l.c.p.Java2DUtil [Java2DUtil.java:1004] scale(): scaled 1536x1536 image to 768x768 ⸗
using a BiCubic filter in 1228 msec
… e.i.l.c.p.Java2DUtil [Java2DUtil.java:1004] scale(): scaled 1536x1536 image to 768x768 ⸗
using a BiCubic filter in 1234 msec
… …

But 1536 / 768 = 2 — should be in pyramidal tiff.

jbarth-ubhd commented 3 years ago

What means this: … e.i.l.c.s.DelegateProxy [DelegateProxy.java:367] invokeUncached(): httpsource_resource_info() returned {"uri"=>"https://heidicon.ub.uni-heidelberg.de/eas/partitions/3/0/605000/605909/1392039ec03494ade4b39060c02d64ecd1761951/image/tiff/Rubens-tile94-deflate.tif"} for args: (none) in 2844 msec ?

PS: does this mean that returning the uri from DelegateProxy took 2844 msec?

jbarth-ubhd commented 3 years ago

Pyramidal tiff (for test purposes) is here: https://digi.ub.uni-heidelberg.de/diglitData/v/Rubens-tile94-deflate.tif

jbarth-ubhd commented 3 years ago

Oops... our delegate needs indeed multiple seconds.

jbarth-ubhd commented 3 years ago

besides the very slow response times of my delegate.rb, delegate_script.cache.enabled = true does not seem to work, the id is resolved multiple times. Is there something to consider in delegate.rb to make sure caching works?

PS: checked caching setting with endpoint.api.enabled = true.

PS2: why is caching removed from configuration in cantaloupe 5?

adolski commented 3 years ago

@jbarth-ubhd The delegate method invocation cache was added some time ago when the delegate methods relied on arguments instead of the context. Adding the context broke it and it has probably been broken throughout the 4.1 series. This was just something that fell off my radar.

In 5.0, the added richness of the context makes computing reliable cache keys difficult. I decided that I didn't want to support this feature anymore so I pulled it. If you need to employ caching, I recommend implementing your own cache.

jbarth-ubhd commented 3 years ago

Did implement some primitive caching, now timings are much better:

…/1139791:605909/    0,    0,26315,19908/412,/… 0.2s 1/63.871359 [200]
…/1139791:605909/    0,    0,24576,19908/768,/… 0.3s 1/32.000000 [200]
…/1139791:605909/24576,    0, 1739,19908/ 55,/… 0.2s 1/31.618182 [200]
…/1139791:605909/12288,    0,12288,12288/768,/… 0.6s 1/16.000000 [200]
…/1139791:605909/    0,    0,12288,12288/768,/… 0.7s 1/16.000000 [200]
…/1139791:605909/12288,12288,12288, 7620/768,/… 0.7s 1/16.000000 [200]
…/1139791:605909/    0,12288,12288, 7620/768,/… 0.5s 1/16.000000 [200]
…/1139791:605909/24576,    0, 1739,12288/109,/… 0.6s 1/15.954128 [200]
…/1139791:605909/24576,12288, 1739, 7620/109,/… 0.3s 1/15.954128 [200]
…/1139791:605909/ 6144, 6144, 6144, 6144/768,/… 0.7s 1/8.000000 [200]
…/1139791:605909/12288, 6144, 6144, 6144/768,/… 0.6s 1/8.000000 [200]
…/1139791:605909/ 6144,12288, 6144, 6144/768,/… 0.6s 1/8.000000 [200]
…/1139791:605909/12288,12288, 6144, 6144/768,/… 0.8s 1/8.000000 [200]
…/1139791:605909/    0, 6144, 6144, 6144/768,/… 0.6s 1/8.000000 [200]
…/1139791:605909/18432, 6144, 6144, 6144/768,/… 0.5s 1/8.000000 [200]
…/1139791:605909/    0,12288, 6144, 6144/768,/… 0.5s 1/8.000000 [200]
… … …
jbarth-ubhd commented 3 years ago

Cantaloupe still scales from 1536 to 768 (÷2), which is in pyramidal tiff(?)

But the requested size was 3072×3072... what has 1536 to do with this?

2021-04-14 09:21:46,713 INFO [qtp586617651-27] e.i.l.c.r.i.v.ImageResource [⸗
  AbstractResource.java:117] Handling GET /iiif/2/1139791:605909/0,3072,3072,3072⸗
  /768,/0/default.jpg
…
2021-04-14 09:21:47,939 DEBUG [qtp586617651-27] e.i.l.c.p.Java2DUtil [⸗
  Java2DUtil.java:1004] scale(): scaled 1536x1536 image to 768x768 using a ⸗
  BiCubic filter in 82 msec
…
2021-04-14 09:21:47,967 DEBUG [qtp586617651-27] e.i.l.c.r.ImageRepresentation [⸗
  ImageRepresentation.java:192] Java2dProcessor processed in 1197 msec: 1139791:⸗
  605909_1:1_cropbypixels:0,3072,3072,3072_scale:768,,bicubic_encode:⸗
  jpg_JPEG_80_interlace_#FFFFFF_8
2021-04-14 09:21:47,967 DEBUG [qtp586617651-27] e.i.l.c.r.HandlerServlet [⸗
  HandlerServlet.java:124] Responded to GET /iiif/2/1139791:605909/0,3072,3072,⸗
  3072/768,/0/default.jpg with HTTP 200 in 1255 msec
adolski commented 3 years ago

@jbarth-ubhd The dimensions of the individual pyramid levels should be logged in there somewhere, if you can set log.application.level = trace. If that doesn't show anything interesting, could you post the file so i can view it?

jbarth-ubhd commented 3 years ago

@adolski: https://github.com/cantaloupe-project/cantaloupe/issues/296#issuecomment-818571795

PS: instead of own cache implementation I'll use this tomorrow:

require 'lru_redux'
$cache=LruRedux::TTL::ThreadSafeCache.new(maxItems, maxTTL)
print $cache.getset('kkkey'){
  sleep 1 # something complicated
  1 # return value
}
jbarth-ubhd commented 3 years ago

@adolski: we'll change tile size from 96 to 256. Let's see …

jbarth-ubhd commented 3 years ago

How do I add additional ruby libraries?

Installed lru_redux with Gemfile + bundle install --path vendor/bundle.

Tried java -cp /usr/local/.../iiif/cantaloupe/config/vendor/bundle/ruby/2.5.0/gems/lru_redux-1.1.0/lib and java -Dorg.jruby.embed.class.path=/usr/local/.../iiif/cantaloupe/config/vendor/bundle/ruby/2.5.0/gems/lru_redux-1.1.0/lib - neither worked (not even searched for lru_redux.rb in this path, checked with strace -f open,openat,stat ...)

jbarth-ubhd commented 3 years ago

It seems that the slower requests come from from requests, where the "tile" of the requested region is not a integer number (!!!). Median 90 msec for "integer" requests, median 271 msec for non-"integer":

 62 msec; tile 128.63, 25.00;     …-1832] …cropbypixels:26112,12800,203,512_scale:203,,bicubic…
 64 msec; tile  12.00, 20.00;     …-1939] …cropbypixels:6144,10240,512,512_scale:512,,bicubic…
 64 msec; tile  12.00, 21.00;     …-1965] …cropbypixels:6144,10752,512,512_scale:512,,bicubic…
 64 msec; tile  44.00, 20.00;     …-1880] …cropbypixels:22528,10240,512,512_scale:512,,bicubic…
...
500 msec; tile  12.00,  5.00;     …-1880] …cropbypixels:24576,10240,1024,1024_scale:512,,bicubic…
501 msec; tile   0.00,  0.50; !!! …-1969] …cropbypixels:0,1024,1024,1024_scale:512,,bicubic…
502 msec; tile  10.50,  8.00; !!! …-1832] …cropbypixels:21504,16384,1024,1024_scale:512,,bicubic…
507 msec; tile   3.53,  1.25; !!! …-1831] …cropbypixels:24576,10240,1739,2048_scale:435,,bicubic…
516 msec; tile   1.00,  1.55; !!! …-1831] …cropbypixels:2048,2048,1024,662_scale:512,,bicubic…
518 msec; tile   0.50,  0.75; !!! …-1965] …cropbypixels:4096,6144,2048,2048_scale:512,,bicubic…
524 msec; tile   1.00,  0.50; !!! …-1936] …cropbypixels:2048,1024,1024,1024_scale:512,,bicubic…
527 msec; tile   0.33,  0.00; !!! …-1937] …cropbypixels:2048,0,1552,2048_scale:388,,bicubic…
529 msec; tile   1.00,  0.00;     …-1935] …cropbypixels:2048,0,1024,1024_scale:512,,bicubic…
537 msec; tile  10.50,  7.50; !!! …-1811] …cropbypixels:21504,15360,1024,1024_scale:512,,bicubic…
538 msec; tile   2.50,  4.00; !!! …-1811] …cropbypixels:5120,8192,1024,1024_scale:512,,bicubic…
541 msec; tile   0.50,  1.55; !!! …-1965] …cropbypixels:1024,2048,1024,662_scale:512,,bicubic…
546 msec; tile   0.75,  0.75; !!! …-1969] …cropbypixels:6144,6144,2048,2048_scale:512,,bicubic…
562 msec; tile  11.00,  8.00;     …-1935] …cropbypixels:22528,16384,1024,1024_scale:512,,bicubic…
565 msec; tile  12.00,  5.50; !!! …-1969] …cropbypixels:24576,11264,1024,1024_scale:512,,bicubic…
jbarth-ubhd commented 3 years ago

Let's take the last request on the list above:

2021-04-16 08:23:21,562 INFO [qtp586617651-1969] e.i.l.c.r.i.v.ImageResource [AbstractResource.java:117] Handling GET /⸗
iiif/2/1139791:708192/24576,11264,1024,1024/512,/0/default.jpg
...
2021-04-16 08:23:21,581 DEBUG [qtp586617651-1969] e.i.l.c.s.HttpSource [HTTPStreamFactory.java:84] newSeekableStream():⸗
 using 524288-byte chunks
2021-04-16 08:23:21,582 DEBUG [qtp586617651-1969] e.i.l.c.p.c.TIFFImageReader [AbstractIIOImageReader.java:93] Using it⸗
.geosolutions.imageioimpl.plugins.tiff.TIFFImageReader
2021-04-16 08:23:21,582 DEBUG [qtp586617651-1969] e.i.l.c.p.c.TIFFImageReader [AbstractIIOImageReader.java:106] Ignorin⸗
g metadata? true
2021-04-16 08:23:21,634 DEBUG [qtp586617651-1969] e.i.l.c.p.c.TIFFImageReader [AbstractIIOImageReader.java:506] Acquiri⸗
ng region 24576,11264/1024x1024 from 26315x19908 image (256x256 tile size)
2021-04-16 08:23:22,130 DEBUG [qtp586617651-1969] e.i.l.c.p.Java2DUtil [Java2DUtil.java:1004] scale(): scaled 1024x1024⸗
 image to 512x512 using a BiCubic filter in 233 msec

Why does cantaloupe scale? The image in the second tiff directory has the wanted scale, and the request perfectly matches the 256×256 tiles in this image.

jbarth-ubhd commented 3 years ago

I'll open the last question in a new case.