Closed peterlitvak closed 8 years ago
Thanks for the PR!
Your implementation opens and closes a new image reader for each read.
Have you tried to keep one stream open? I suspect it will be faster. Note that you will need to add synchronized
to the read methods.
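A minimal sketch of that suggestion, assuming the standard javax.imageio API. The class name SharedRegionReader and its readRegion method are hypothetical and are not taken from the PR; the point is only that one stream and one reader stay open, with reads serialized by synchronized.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper: keeps one stream and one reader open for all reads.
class SharedRegionReader implements Closeable {
    private final ImageInputStream stream;
    private final ImageReader reader;

    SharedRegionReader(File file) throws IOException {
        stream = ImageIO.createImageInputStream(file);
        if (stream == null) {
            throw new IOException("Cannot open " + file);
        }
        Iterator<ImageReader> readers = ImageIO.getImageReaders(stream);
        if (!readers.hasNext()) {
            stream.close();
            throw new IOException("No image reader found for " + file);
        }
        reader = readers.next();
        reader.setInput(stream);
    }

    // synchronized because ImageReader is stateful and not safe for concurrent use.
    synchronized BufferedImage readRegion(int x, int y, int width, int height)
            throws IOException {
        ImageReadParam param = reader.getDefaultReadParam();
        param.setSourceRegion(new Rectangle(x, y, width, height));
        return reader.read(0, param);
    }

    @Override
    public synchronized void close() throws IOException {
        reader.dispose();
        stream.close();
    }
}
```

With this shape, parallel callers queue on the single reader, so it only helps if opening the reader is the expensive part.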
Yes, I tried that, and I also tried a pool of stream objects guarded by a semaphore whose number of permits equals the parallelism level. All of them produced (relatively) the same result performance-wise. The problem, as I understand it, is that readers for compressed image types are all sequential and have to reset when the area to read is behind the current stream position. This is just my somewhat educated guess, though.
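For reference, the pooled variant described above could look roughly like the following. This is a sketch, not the code from the PR: ReaderPool, withReader and availableReaders are hypothetical names, and the pool is generic so it can hold whatever reader/stream pair is in use.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Hypothetical pool: one permit per pooled reader, so at most
// readers.size() tasks run at the same time.
class ReaderPool<R> {
    private final Queue<R> idle = new ConcurrentLinkedQueue<>();
    private final Semaphore permits;

    ReaderPool(List<R> readers) {
        idle.addAll(readers);
        permits = new Semaphore(readers.size());
    }

    <T> T withReader(Function<R, T> task) {
        permits.acquireUninterruptibly();
        // Holding a permit guarantees that at least one reader is idle.
        R reader = idle.poll();
        try {
            return task.apply(reader);
        } finally {
            // Return the reader before releasing the permit to keep that guarantee.
            idle.add(reader);
            permits.release();
        }
    }

    int availableReaders() {
        return permits.availablePermits();
    }
}
```

Each worker thread would call withReader with a function that reads its region; the pool bounds concurrency but does nothing about the sequential decoding cost.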
Yeah, after some testing with JPEG, decompression is what takes the most time.
As you said in the issue, one trick would be for the ScalablePyramidBuilder
to read as big a region as possible (still fitting in RAM) and compute the pyramid on that part.
However, it would work on square regions rather than stripes, to better match the current algorithm.
I will look into it and let you know how it goes.
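To make the square-region idea concrete, here is a rough sketch of how the region size could be derived from a memory budget. Everything here is an assumption for illustration: the SquareRegions class, the alignment of the side to a pyramid tile size, and the fixed bytes-per-pixel figure are not taken from ScalablePyramidBuilder.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for sizing and enumerating square read regions.
class SquareRegions {

    // Largest square side (in pixels) whose raw pixels fit in budgetBytes,
    // rounded down to a multiple of tileSize (assumed alignment).
    static int maxSide(long budgetBytes, int bytesPerPixel, int tileSize) {
        long side = (long) Math.floor(Math.sqrt((double) budgetBytes / bytesPerPixel));
        long aligned = (side / tileSize) * tileSize;
        if (aligned == 0) {
            throw new IllegalArgumentException("Budget too small for a single tile");
        }
        return (int) Math.min(aligned, Integer.MAX_VALUE);
    }

    // Covers a width x height image with squares of the given side,
    // row by row; regions on the right and bottom edges are clipped.
    static List<Rectangle> split(int width, int height, int side) {
        List<Rectangle> regions = new ArrayList<>();
        for (int y = 0; y < height; y += side) {
            for (int x = 0; x < width; x += side) {
                regions.add(new Rectangle(x, y,
                        Math.min(side, width - x), Math.min(side, height - y)));
            }
        }
        return regions;
    }
}
```

For example, a 256 MiB budget at 3 bytes per pixel and 256-pixel tiles gives a side of 9216 pixels, which splits a 21600x21600 image into 9 regions.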
I just created a new PR which includes your image reader + support for a cache: #6
When using the CLI, it now always uses the direct image reader. In addition, you can set a cachePercentage
option which defines how much of the input image can be cached. The default is 1, but setting it lower will save memory.
While writing this, I just realized that percentage is not the correct term. I should name it inputImageCacheRatio
or something.
Let me know how that works for you.
I'm actually now decompressing the image to a BMP and then running the code with a modified direct reader (which has a pool of CPU+1 copies of the IIS and corresponding readers). For my test image (21600x21600, ~1.5GB uncompressed) this completes in under a minute on an 8-core i7 with an SSD (both conversion and pyramid generation), and I can set the JVM RAM as low as 512MB. Since disk space is cheaper than RAM or CPU when it comes to cloud deployment, I'm thinking of settling on this approach for now.
Just to add that using the pooled direct reader on compressed images has (practically) no performance improvement, which is not surprising.
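One way to see why the uncompressed BMP behaves so differently: in an uncompressed BMP, the byte position of any pixel can be computed directly, so a region read is a seek rather than a sequential decode. The sketch below assumes a standard bottom-up, uncompressed 24-bit BMP; BmpLayout is a hypothetical helper, and the 54-byte data offset in the usage note is the common case for such files, not something stated in this thread (real files record the offset in their header).

```java
// Hypothetical helper showing the pixel layout of an uncompressed,
// bottom-up BMP. Rows are padded to a multiple of 4 bytes.
class BmpLayout {

    // Bytes occupied by one row of pixels, including padding.
    static long rowStride(int width, int bytesPerPixel) {
        return (((long) width * bytesPerPixel + 3) / 4) * 4;
    }

    // File offset of pixel (x, y), with y counted from the top of the image.
    // Bottom-up storage means the last image row comes first in the file.
    static long pixelOffset(long dataOffset, int width, int height,
                            int bytesPerPixel, int x, int y) {
        long rowsBefore = (long) height - 1 - y;
        return dataOffset + rowsBefore * rowStride(width, bytesPerPixel)
                + (long) x * bytesPerPixel;
    }
}
```

For a 21600x21600 image at 3 bytes per pixel, rowStride is 64800 bytes and the pixel data occupies 64800 * 21600 = 1,399,680,000 bytes, roughly in line with the uncompressed size quoted above.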
Glad to see you got something working for you. I will merge the other PR once it is ready because it can be useful for others. Thanks for your contribution.
…ess instead of ImageBuffer