biigle / core

Application core of BIIGLE
https://biigle.de
GNU General Public License v3.0

Support large images with tiling #101

Closed: mzur closed this issue 6 years ago

mzur commented 6 years ago

We want to support exploration and annotation of large images. These can be huge tissue slide scans or stitched-together mosaics of a transect. The usual way to do this is to extract tiles from the image at different zoom levels. Depending on the viewport and zoom level, only a subset of these tiles is loaded and displayed at any given time.
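
To illustrate why tiling helps, here is a minimal sketch of how a viewer could work out which tiles of a single zoom level intersect the viewport. The function name `visibleTiles`, the viewport object and the tile size of 256 px are assumptions made for this example; OpenLayers does this internally and in a more general way.

```javascript
// Hypothetical helper: list the tiles of one zoom level that intersect the
// viewport. All coordinates are pixels of the image at that zoom level with
// the origin in the top-left corner. The default tile size is an assumption.
function visibleTiles(viewport, levelWidth, levelHeight, tileSize = 256) {
  const cols = Math.ceil(levelWidth / tileSize);
  const rows = Math.ceil(levelHeight / tileSize);

  // Clamp to the tile grid so tiles outside the image are never requested.
  const firstCol = Math.max(0, Math.floor(viewport.x / tileSize));
  const firstRow = Math.max(0, Math.floor(viewport.y / tileSize));
  const lastCol = Math.min(cols - 1, Math.floor((viewport.x + viewport.width - 1) / tileSize));
  const lastRow = Math.min(rows - 1, Math.floor((viewport.y + viewport.height - 1) / tileSize));

  const tiles = [];
  for (let row = firstRow; row <= lastRow; row++) {
    for (let col = firstCol; col <= lastCol; col++) {
      tiles.push({ col: col, row: row });
    }
  }
  return tiles;
}

// A 1920x1080 viewport on a 43952x98748 px level needs only a few dozen tiles.
const needed = visibleTiles({ x: 1000, y: 2000, width: 1920, height: 1080 }, 43952, 98748);
console.log(needed.length + ' tiles intersect the viewport');
```

The number of tiles to load depends on the viewport size and not on the image size, which is the point of the approach.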

As we are using OpenLayers to display the images, this functionality is already built in. But we have to evaluate how to implement this in the server-side application. Do we want to extract the tiles from the image ourselves? Or do the users have to provide correctly tiled images? How do we distinguish between regular images and tiled images (flag, new db table, etc.)?

Think of a strategy to implement this and evaluate the amount of work we would have to invest.

Thoughts:

mzur commented 6 years ago

Here is what I found:

libvips looks really great for working with huge images and images in scientific formats. It can also replace GD (or the planned imagick, #70) while being more memory efficient! It performs far better than GD for extracting image patches. It may be hard to install on Solaris, though. Maybe on our new Linux machine? There are a libvips PHP extension and PHP bindings.

I took a 43952x98748 px tissue slide scan as TIFF to experiment with vips. It's trivial to generate tiles from the TIFF that can be displayed by OpenLayers. Command to generate tiles in Zoomify format:

vips dzsave source.tif target_dir --layout zoomify

Minimal example to display the tiles with OpenLayers:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <title>Zoomify</title>
    <link rel="stylesheet" href="https://openlayers.org/en/v4.1.0/css/ol.css" type="text/css">
    <script src="https://openlayers.org/en/v4.1.0/build/ol.js"></script>
  </head>
  <body>
    <div id="map" class="map" style="width: 100%; height: 400px;"></div>
    <script>
      var imgWidth = 43952;
      var imgHeight = 98748;

      var source = new ol.source.Zoomify({
        url: 'http://localhost:8000/target_dir/',
        size: [imgWidth, imgHeight],
        crossOrigin: 'anonymous'
      });
      var extent = [0, -imgHeight, imgWidth, 0];

      var map = new ol.Map({
        layers: [
          new ol.layer.Tile({
            source: source
          })
        ],
        target: 'map',
        view: new ol.View({
          // adjust zoom levels to those provided by the source
          resolutions: source.getTileGrid().getResolutions(),
          // constrain the center: center cannot be set outside this extent
          extent: extent
        })
      });
      map.getView().fit(extent);
    </script>
  </body>
</html>

The Zoomify format includes an ImageProperties.xml which looks like this:

<IMAGE_PROPERTIES WIDTH="43952" HEIGHT="98748" NUMTILES="88624" NUMIMAGES="1" VERSION="1.8" TILESIZE="256"/>
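
The NUMTILES value can be reproduced with a short calculation. This is a sketch under the assumption that each zoom level halves the resolution of the previous one and that the pyramid ends with a single tile; the function name `zoomifyPyramid` is made up for this example. For the scan above it yields the same count as the XML.

```javascript
// Sketch: count the tiles of a Zoomify pyramid.
// Assumption: every level doubles the area covered by one tile until the
// whole image fits into a single tile.
function zoomifyPyramid(width, height, tileSize = 256) {
  const levels = [];
  let covered = tileSize; // image pixels covered by one tile edge at this level

  while (width > covered || height > covered) {
    levels.push({
      cols: Math.ceil(width / covered),
      rows: Math.ceil(height / covered),
    });
    covered *= 2;
  }
  // The most zoomed-out level is always a single tile.
  levels.push({ cols: 1, rows: 1 });
  // Order the levels from most zoomed out to full resolution.
  levels.reverse();

  const total = levels.reduce(function (sum, level) {
    return sum + level.cols * level.rows;
  }, 0);

  return { levels: levels, total: total };
}

const pyramid = zoomifyPyramid(43952, 98748, 256);
console.log(pyramid.levels.length + ' zoom levels, ' + pyramid.total + ' tiles');
// prints "10 zoom levels, 88624 tiles"
```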

If an image file exceeds a certain size, Biigle can automatically generate Zoomify tiles for it. If the image file is requested through the /file API endpoint, it can deliver the XML instead of the image. The annotation tool then switches its rendering to "tile mode". Alternatively, we could do this for every image, no matter its size. This way we would have a single implementation that works all the time. But this inflates the required storage space, as every image is effectively duplicated.
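
On the client, the switch to "tile mode" could be as simple as looking at what the /file endpoint returned. This is only a sketch: the function name `renderModeFor` and the exact content types the endpoint would send are assumptions, nothing of this is implemented yet.

```javascript
// Hypothetical: decide how the annotation tool should render an image based
// on the Content-Type header of the /file response.
// Assumption: tiled images answer with XML, regular images with image data.
function renderModeFor(contentType) {
  // Drop parameters such as "; charset=UTF-8" before comparing.
  const type = contentType.split(';')[0].trim().toLowerCase();

  if (type === 'application/xml' || type === 'text/xml') {
    return 'tiles';
  }
  if (type.indexOf('image/') === 0) {
    return 'image';
  }
  throw new Error('Unexpected content type: ' + contentType);
}

console.log(renderModeFor('text/xml; charset=UTF-8')); // prints "tiles"
console.log(renderModeFor('image/jpeg')); // prints "image"
```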

Comments to the thoughts from above:

mzur commented 6 years ago

Some thoughts from talking with the people at Geomar:

mzur commented 6 years ago

Preliminary plan of action:

About 9 days of work (which currently takes about 4 weeks)

mzur commented 6 years ago

I'm now using the docker branch as base for this. I configured the worker container to have vips available so I don't have to install it on my machine. I can run the tests with:

docker run --rm -t -v $(pwd):/app --entrypoint="" -w="/app" biigle/worker-dev php -d memory_limit=1G vendor/bin/phpunit

This is quite nice, actually, as the tests run in the same environment in which the app will eventually run. Instead of the biigle/worker-dev image of my local Docker Compose build, we could use a biigle/worker production image later.

mzur commented 6 years ago

I've now implemented a more generic image caching solution that will work even for very large images. This is used during thumbnail and annotation patch generation. It took half a day longer than planned but I think it's worth it.

mzur commented 6 years ago

Instead of serving the ImageProperties.xml, I'm now storing the image dimensions in the attrs JSON attribute of the image if it is a tiled image. The /file endpoint will then serve JSON containing the dimensions and the UUID of the image.
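
A sketch of how the annotation tool could turn that JSON into the options for the OpenLayers Zoomify source and view. The field names `uuid`, `width` and `height`, the function name and the tile URL layout are assumptions for illustration and may differ from the final implementation.

```javascript
// Assumed response of the /file endpoint for a tiled image:
//   { "uuid": "...", "width": 43952, "height": 98748 }
// Field names and the "<base>/<uuid>/" URL layout are hypothetical.
function zoomifyOptionsFromFileInfo(fileInfo, tilesBaseUrl) {
  const uuid = fileInfo.uuid;
  const width = fileInfo.width;
  const height = fileInfo.height;

  if (!uuid || !(width > 0) || !(height > 0)) {
    throw new Error('Incomplete description of the tiled image');
  }

  // Remove trailing slashes so the URL never contains "//" before the UUID.
  const base = tilesBaseUrl.replace(/\/+$/, '');

  return {
    // Options that can be passed to ol.source.Zoomify.
    source: { url: base + '/' + uuid + '/', size: [width, height] },
    // Extent for ol.View. The y axis of OpenLayers points up, so the image
    // occupies negative y values.
    extent: [0, -height, width, 0],
  };
}

const options = zoomifyOptionsFromFileInfo(
  { uuid: 'abc', width: 43952, height: 98748 },
  'http://localhost:8000/tiles/'
);
console.log(options.source.url); // prints "http://localhost:8000/tiles/abc/"
```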