Open karlcz opened 9 years ago
Very good point.
I feel like this might be asking a bit much of a single format. If it is possible, though, it would set FLIF far apart from other formats: a tiled, progressive image format with an open standard and implementation is one of the things that isn't quite there yet. If you add >2D images and advanced metadata facilities, the format would cover pretty much every image format use case out there, and cover it well.
Well if we do metadata as in https://github.com/jonsneyers/FLIF/issues/17 then we can just as well add tiles by allowing the container to contain multiple bare FLIF files (and metadata that gives their offsets)...
Just concatenating multiple FLIF files won't scale too well when rendering low resolution representation of the whole image - to render a thumbnail of an image with 100x100 tiles, you'd have to fetch data from 10000 different places...
Probably better to have one overall FLIF file with limited resolution, and then tiles with more detail. It would be somewhat redundant, but much more efficient to render.
Somewhat redundant, yes, but nonetheless, it would make high resolution data much more usable and compact in-memory, potentially including partial downloads.
How about interlacing the tiles with the overall "thumbnail" image? That way you get no redundant data, but can still access zoomed areas as well as an overview image. You could perhaps take it even further and have multiple levels of tile interlacing for multiple zoom ranges.
This, to me, sounds like a feature request to optimize responsive FLIF image viewing. The first chunk of data gives a very low resolution version of the image, which could act as the base grid. The format doesn't need extra metadata. The resolution of the image can simply control how many pixels are in the base grid. Then, while viewing, include some extra flags so that the viewer can truncate the data by "tile". Maybe I am making a naive remark. I haven't read too deeply on how MANIAC works yet, but it seems plausible to me in a general sense. More important than anything, FLIF files need to remain lossless. FLIFs do seem to have the advantage of looking decent when they are not fully loaded, but that is perhaps the job of the viewer, not the file format.
Tiled storage is about storage IO optimization with very large images, e.g. gigapixel and more. Today, imaging scientists are already working with 2D images with dimensions upwards of 50,000 x 50,000 pixels and 3D images with dimensions upwards of 1,000 x 1,000 x 1,000. In both cases, these are often multi-channel and higher bit depth including 32-bit float or 16-bit integer.
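To put rough numbers on those sizes, here is a back-of-the-envelope sketch of the uncompressed footprint. The channel counts (4 channels for the 2D case, 1 for the 3D case) are assumptions chosen for illustration, not figures from this thread.

```python
def raw_size_bytes(shape, channels, bytes_per_sample):
    """Uncompressed size in bytes of an image with the given dimensions."""
    n = channels * bytes_per_sample
    for d in shape:
        n *= d
    return n

# 2D: 50,000 x 50,000 pixels, 4 channels (assumed), 32-bit float samples
size_2d = raw_size_bytes((50_000, 50_000), channels=4, bytes_per_sample=4)
# 3D: 1,000 x 1,000 x 1,000 voxels, 1 channel (assumed), 16-bit integer samples
size_3d = raw_size_bytes((1_000, 1_000, 1_000), channels=1, bytes_per_sample=2)

print(size_2d / 1e9, "GB")  # 40.0 GB
print(size_3d / 1e9, "GB")  # 2.0 GB
```

Even before compression is considered, a single image of this kind can exceed the memory of a typical workstation, which is why decoding the whole thing up front is not an option.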
This requires support in the storage container format, as loading and decoding the entire non-tiled image is already a failure before the application has an opportunity to solve the problem. There are at least two standard use cases:
The first overlaps with progressive decoding goals, while the second works at full resolution and outputs new data with the same (or similar) tiled structure. Both share that the working set size for retrieval, decoding, and presentation is limited and the image space is traversed in a streaming fashion.
Whether these are in scope for FLIF goals or not, I don't know. If the FLIF format doesn't address them, it probably won't have much impact on scientific or medical imaging domains, where a plethora of codecs and image formats are already evolving to address these issues. There are already many non-interoperable variants of TIFF and other container formats storing various forms of multi-dimensional, multi-channel, high precision, tiled and/or pyramidal image data with a variety of codecs.
Are meta-data and sub-files too much specificity? You could make the tiling implicit. A single mark in the header can say that the file is "tiled". The lowest resolution interlace encodes the whole image, the next interlace encodes as tiles with some specified sub-division of the image (8 rows/cols) and the next interlace increases the number of tiles again. This way, the number of tiles changes according to the interlace depth, subdividing greater resolution into smaller tiles -- the idea being that whether the image were zoomed in closely or zoomed out far, only a roughly equivalent number of tiles would have to be decoded.
The beginning of each interlace could implicitly include a list of offsets to the tiles that follow, so that a mega-sized, memory-mapped image would only need to read the list and then jump to the tile(s) desired for that interlace level.
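A minimal sketch of that lookup, to make the idea concrete. Nothing here is part of FLIF: the doubling rule in `tile_grid`, the row-major offset list, and both function names are assumptions made for illustration.

```python
def tile_grid(level, base_div=8):
    """Tiles per side at an interlace level.

    Assumed rule: level 0 is the whole image, level 1 has base_div tiles
    per side, and each further level doubles the subdivision.
    """
    return 1 if level == 0 else base_div * 2 ** (level - 1)

def tile_byte_range(offsets, level_end, row, col, grid):
    """Byte range (start, end) of one tile inside an interlace level.

    offsets   -- row-major list of tile start offsets for this level
    level_end -- offset just past the last tile of this level
    """
    i = row * grid + col
    start = offsets[i]
    end = offsets[i + 1] if i + 1 < len(offsets) else level_end
    return start, end

# Hypothetical 2x2 level whose four tiles start at these byte offsets.
offsets = [100, 180, 250, 400]
print(tile_byte_range(offsets, 520, row=1, col=0, grid=2))  # (250, 400)
```

With a memory-mapped file, a reader would only touch the offset list and the returned byte range, never the tiles outside the requested area.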
Anyway, that's very pie-in-the-sky thinking, I have no practical knowledge of the realities of such requirements.
I have started implementing a FLIF specific tool for tiling large images: t-flif. It's a one week old prototype, so be gentle on it. I intend to improve it based on interest levels.
bounty incentive: https://freedomsponsors.org/issue/805/tiled-storage
Sorry for the offtopic comment, but $2800 for implementing a FLIF feature? That doesn't seem right. That site doesn't enforce the payments in any way, does it?
@asmagedfon I was surprised as well, but looking at their profile, they work for a space agency, on a satellite project. Multiple users have asked about using FLIF as a way to store high-quality space pictures, so this seems important to them.
Hi @Asmageddon and @julianrichen, yes, our use case is unique: a petabyte-scale, wide-field astronomical survey using a ground-based telescope under construction in Chile (https://www.lsst.org/). It requires specific technology enhancements that likely wouldn't be addressed without financial incentive. In particular, we are sensitive to storage costs, so we need a lossless format with high compression. We also need compatibility with mobile devices (for general public interaction), so we need an image format that supports progressive decoding. Finally, we need an image format that supports animation, since our survey is a time series (i.e. we revisit the same patches of sky over and over). After reviewing various image formats, FLIF seems one of the most promising, and we're funding initial development to investigate further. Also, our bounties are more like subcontract development than philanthropic crowdfunding, so the dollar amounts are higher than many others because we research required skill sets, associated salary compensation levels, and estimated time for completion. As a final note, FreedomSponsors allows us to divide the bounty between multiple developers who contributed, which we'll determine by reviewing the pull requests, issue comment queue, and discussions with the project owner (e.g. @jonsneyers in this case). Hope that background helps!
@hrj Nice preliminary work with t-flif! Long-term, we may want to consider leveraging an existing tile viewer (like iipsrv / reference) to handle most of the interface/storage/web server complexities and focus our FLIF attention on supporting tile functionality within the image format itself (like TIFF and JPEG2000 currently do). (just a thought)
@benepo I was not aware of iipsrv. It makes sense to leverage existing tools like that. So, here are some thoughts after a re-take.
There are a couple of requirements that are related:
I think it makes sense to raise these requirements to a layer above the FLIF layer. That is, storing the data as a collection of FLIF images that are then aggregated together to form a larger whole. The advantages of this approach are:
Finally, when the new layer evolves and eventually becomes more stable, it can be rolled back into a future version of FLIF.
If this approach sounds good, I will start working on the details of the layer and corresponding library.
Hi @hrj, that approach definitely lowers the complexity and is similar to other tools, like Montage. However, does your approach require pre-generating all the tiles in advance? Since many areas of the night sky at various zoom levels will not be viewed by visitors, I was hoping to save storage costs by generating and caching the tiles on the fly.
> However, does your approach require pre-generating all the tiles in advance?
No. I am planning to support absent tiles, and updates to tiles on the fly. When a tile is absent, the library will return a dummy image or a null result. And when tiles are added / updated, internal caches (in the library) will be invalidated.
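The behaviour described above could look roughly like this. It is only a sketch of the idea, not the planned library: the `TileStore` class, its method names, and the `decode` callback are all hypothetical.

```python
class TileStore:
    """Sparse tile store: absent tiles yield None, updates invalidate the cache."""

    def __init__(self, decode):
        self._decode = decode   # hypothetical decoder: encoded bytes -> image
        self._encoded = {}      # (level, row, col) -> encoded tile bytes
        self._decoded = {}      # cache of decoded tiles

    def put(self, key, data):
        """Add or update a tile and drop any stale decoded copy."""
        self._encoded[key] = data
        self._decoded.pop(key, None)

    def get(self, key):
        """Return the decoded tile, or None if the tile was never stored."""
        if key not in self._encoded:
            return None
        if key not in self._decoded:
            self._decoded[key] = self._decode(self._encoded[key])
        return self._decoded[key]

# Stand-in decoder for the example; a real one would decode FLIF data.
store = TileStore(decode=lambda data: data.upper())
print(store.get((0, 3, 7)))   # None: tile not generated yet
store.put((0, 3, 7), b"abc")
print(store.get((0, 3, 7)))   # b'ABC'
store.put((0, 3, 7), b"xyz")  # update invalidates the cached copy
print(store.get((0, 3, 7)))   # b'XYZ'
```

A caller that prefers a dummy image over `None` can substitute one at the call site, which keeps the store itself agnostic of tile dimensions.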
Montage looks similar; although, to be clear, I don't plan to support projections. That can be left to a layer above the tiling layer. I believe iipsrv, for example, would handle that.
> Since many areas of the night sky at various zoom levels will not be viewed by visitors, I was hoping to save storage costs by generating and caching the tiles on the fly.
So you are planning to use the FLIF files as a cache to an ever larger store? I don't fully understand the need for two stores. But perhaps we can discuss that point offline, since it is a bit off-topic for FLIF.
> So you are planning to use the FLIF files as a cache to an ever larger store?
No, my hope is that native browser support or polyfill will be sufficient to load the FLIF images as the tile image format shown to users. Is that naive from a performance standpoint?
Regarding projections, how would a layer above the tiling layer know how to organize and display the tiles if the tiles themselves don't encode tile awareness in the structure? ref: https://github.com/ruven/iipsrv/issues/96
@benepo There was some confusion in my understanding of your requirements, but after re-reading them I understand them better.
> No, my hope is that native browser support or polyfill will be sufficient to load the FLIF images as the tile image format shown to users. Is that naive from a performance standpoint?
Ah, so you want to transport FLIF images directly to the browser! That will make things a lot more efficient on the storage and transport side. If no further processing of the raster data is required, we can serve truncated FLIFs directly, in which case transport should be on par with (or faster than) JPEG2000 tiles served via iipsrv. Whether the browser polyfill can decode things fast enough is something we will have to try out.
I found a Leaflet plugin for the BPG format, and that gives me confidence that I can build a quick demo using FLIF. We will then have a good idea of performance and storage costs.
> Regarding projections, how would a layer above the tiling layer know how to organize and display the tiles if the tiles themselves don't encode tile awareness in the structure? ref: ruven/iipsrv#96
I believe the storage layer is itself agnostic of the projection. It is just storing a 2 dimensional raster of pre-projected data. However, it might encode the projection information as metadata, for the benefit of layers above it.
The raster data can be served and decoded directly in the viewer without further processing, as long as the same projection is used. However, if you need to support a different projection than the one stored, then there are two choices:
I will try to build an interactive demo to check my understanding of things.
If your full-res tiles are independent from the low-res tiles, you will be building something very similar to MRF - https://github.com/nasa-gibs/mrf , but inside a single file (rather than multiple related files).
A TIFF container can also be used for the same purpose, and GDAL has defined a structure that is suitable for interactive browsing of very large images https://trac.osgeo.org/gdal/wiki/CloudOptimizedGeoTIFF . TIFF supports various compression methods for its tiles, so putting FLIF in there shouldn't be difficult.
A drawback I see with these two formats is that they don't support a progressive scheme with residuals, where decoding a higher-res tile would require its lower-res counterparts (FLIF-hub/FLIF#362). And you'd have to think of a custom extension if you wanted to add quality layers (e.g. such as those in JPEG2000 or those mentioned by @jonsneyers in the aforementioned issue).
Has anyone considered using the new HEIF container format for this purpose? It's getting some hype now because of Apple, and using it with a HEVC payload (like BPG) seems a bad idea to me because of the patent mess, but as a container it might be useful. Not sure if there would be any significant difference between HEIF and TIFF in that respect, but in any case, let's avoid reinventing stuff that already exists.
Also I'm not sure if the HEIF container itself is safe w.r.t. patents, so we might want to investigate that.
About the container formats, thanks for the references @mkadunc and @jonsneyers. I will look into them later.
I have built an interactive demo for tiled storage using FLIF. The idea I am pursuing is described below.
I studied iipsrv in more detail. They seem to store data in one of the following two configurations:
iipsrv decodes slices as requested on the fly, reencodes them as JPG and serves them to the client.
The first method needs more storage space, while the second method needs more CPU resources.
With FLIF, a hybrid approach is possible. Since FLIF allows truncation of files to generate low resolution slices, significant layers of the pyramid can be skipped from the storage layer. The client can make truncated queries (using HTTP range requests) and assemble the tiles on the fly. This reduces the pyramid overhead from 30% of base layer, to just 5% of the base layer.
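As a rough cross-check of those overhead figures, here is a sketch that counts pixels per stored layer relative to the base layer. It assumes each pyramid layer halves the resolution (so holds a quarter of the pixels), a pyramid of 5 layers above the base, and that the top layer is always kept. Compressed sizes will differ from pixel counts, so this only approximates the 30% and 5% quoted above.

```python
def stored_fraction(levels, step=1, keep_top=True):
    """Pixels stored above the base layer, as a fraction of the base layer.

    levels   -- number of pyramid layers above the base
    step     -- store every step-th layer (1 = full pyramid, 2 = alternate)
    keep_top -- always store the topmost layer
    """
    kept = set(range(step, levels + 1, step))
    if keep_top:
        kept.add(levels)
    # Layer k has (1/4)**k of the base layer's pixels.
    return sum(0.25 ** k for k in kept)

print(round(stored_fraction(5, step=1), 3))  # 0.333 (full pyramid)
print(round(stored_fraction(5, step=2), 3))  # 0.067 (alternate layers + top)
```

The full pyramid converges towards one third of the base layer, and skipping every other layer brings that down to well under a tenth, which is the same order of saving as the figures above.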
In the demo, I took the original image and padded it to an integral power of 2 (to make things easier; not a hard requirement). In this case, it became an image of 16,384 x 16,384 pixels. This image was then converted into tiles of 512 x 512 pixels each, and encoded as interlaced FLIFs. We will call this 32x32 tile layer the base layer.
Naively, the next layer would have 16x16 tiles (again of 512x512 pixels each). However, since FLIF allows truncation of the bitstream to yield a subsampled image, we can skip storing this layer, and generate it on the client side by combining 4 tiles from the base layer. The amount of data transferred with 4 truncated tiles is actually less than the size of a single base layer tile! I think this is because the final zoom-level (in FLIF terminology) of this image has maximum entropy (noise).
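The client-side assembly can be sketched as follows. This is not the demo's code: both function names are hypothetical, images are plain nested lists to keep the example dependency-free, and each truncated base tile is assumed to have already been decoded to a half-resolution image.

```python
def source_tiles(row, col):
    """Base-layer tiles covered by virtual tile (row, col) one layer up."""
    return [(2 * row + dr, 2 * col + dc) for dr in (0, 1) for dc in (0, 1)]

def assemble(quads):
    """Combine four half-resolution tiles into one tile.

    quads maps (dr, dc) -> 2D list of pixels, where (0, 0) is the top-left
    quadrant and (1, 1) the bottom-right.
    """
    top = [a + b for a, b in zip(quads[(0, 0)], quads[(0, 1)])]
    bottom = [a + b for a, b in zip(quads[(1, 0)], quads[(1, 1)])]
    return top + bottom

print(source_tiles(3, 5))  # [(6, 10), (6, 11), (7, 10), (7, 11)]

# Tiny 2x2 stand-ins for the 256x256 subsampled tiles.
quads = {(0, 0): [[1, 1], [1, 1]], (0, 1): [[2, 2], [2, 2]],
         (1, 0): [[3, 3], [3, 3]], (1, 1): [[4, 4], [4, 4]]}
for line in assemble(quads):
    print(line)
```

With 512 x 512 base tiles, each quadrant would be a 256 x 256 subsample, and the assembled result is again a 512 x 512 tile.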
Note that the truncation can be done by simply adding a range header to the HTTP request, and most HTTP servers support it off the shelf. No custom server is required, and no CPU overhead is incurred.
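For illustration, a truncated fetch using only the Python standard library might look like this. The URL and the byte count are placeholders, and how a client picks the truncation point for a given zoom level is not covered here.

```python
import urllib.request

def truncated_request(url, num_bytes):
    """Build a GET request for only the first num_bytes bytes of a file."""
    req = urllib.request.Request(url)
    # HTTP byte ranges are inclusive, hence the "- 1".
    req.add_header("Range", f"bytes=0-{num_bytes - 1}")
    return req

def fetch(req):
    """Send the request; status 206 means the server honoured the range."""
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.read()

# Placeholder URL and size, for illustration only.
req = truncated_request("http://example.invalid/tiles/12_7.flif", 4096)
print(req.get_header("Range"))  # bytes=0-4095
```

A server that ignores the header answers with status 200 and the full body, so a client should check the status before assuming it received a truncated file.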
We could continue building this virtual pyramid by truncating the base layer tiles further, but then the number of requests keeps growing. Hence, I have chosen to re-encode and store every alternate layer. The topmost layer is always stored, as it is the default zoom level and will be requested often.
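The trade-off between stored layers and request count can be sketched like this. The layer numbering (0 for the base layer, counting upwards to the top) and both function names are assumptions for illustration.

```python
def stored_levels(top):
    """Layers kept on disk: every alternate layer from the base, plus the top."""
    levels = set(range(0, top + 1, 2))
    levels.add(top)
    return sorted(levels)

def plan(level, top):
    """Return (source_layer, requests) needed to show one tile at `level`.

    The source is the nearest stored layer at or below `level`; each layer
    of difference multiplies the number of truncated tile requests by 4.
    """
    source = max(s for s in stored_levels(top) if s <= level)
    return source, 4 ** (level - source)

print(stored_levels(5))  # [0, 2, 4, 5]
print(plan(1, 5))        # (0, 4): four truncated base tiles
print(plan(2, 5))        # (2, 1): stored layer, one request
```

Storing every alternate layer caps the cost at four requests per displayed tile; skipping two layers in a row would raise that to sixteen.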
Pros :
iipsrv but slightly worse than JPEG2000 progressive tiles.
Cons:
(Edit, fixed typo: 16384 pixels, not 16364)
Nice work @hrj! Definitely moving in the right direction.
Also, thanks @jonsneyers for suggesting the HEIF container. I wonder if packaging the FLIF codec in the HEIF container would allow for native support on Safari/iOS? Since HEIF supports multiple images in a single file, animated FLIF might be possible. Also, its derived images feature with image grids might be worth investigating from a tiling perspective.
@hrd could you use the same approach but with a single file? Here you would use range requests to specify the bytes needed for a tile.
Also one other con, it's harder to cache range requests, but can be done.
@wassname Yes, it should be possible to use a single file, consisting of multiple FLIF files appended together, as long as there is an index of the contents maintained. The index can also be a part of the same file, stored at a known location.
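A minimal sketch of such a single-file layout, with the index at the start of the file. The layout itself (a little-endian tile count followed by offset/length pairs) is an assumption made up for this example and does not correspond to any existing container.

```python
import struct

def pack_tiles(blobs):
    """Concatenate encoded tiles behind an index of (offset, length) pairs.

    Assumed layout: uint32 tile count, then one (uint64 offset,
    uint64 length) pair per tile, then the tile data back to back.
    """
    header_size = 4 + 16 * len(blobs)
    out = struct.pack("<I", len(blobs))
    offset = header_size
    for blob in blobs:
        out += struct.pack("<QQ", offset, len(blob))
        offset += len(blob)
    return out + b"".join(blobs)

def read_index(data):
    """Parse the index; returns a list of (offset, length) tuples."""
    (count,) = struct.unpack_from("<I", data, 0)
    return [struct.unpack_from("<QQ", data, 4 + 16 * i) for i in range(count)]

def read_tile(data, i):
    """Extract tile i using the index."""
    offset, length = read_index(data)[i]
    return data[offset:offset + length]

# Stand-in byte strings; real entries would be complete FLIF files.
packed = pack_tiles([b"tile-zero", b"tile-one", b"tile-two"])
print(read_index(packed))    # [(52, 9), (61, 8), (69, 8)]
print(read_tile(packed, 1))  # b'tile-one'
```

Over HTTP, a client would fetch the index with one range request and then request `offset` to `offset + length - 1` for each tile it needs, or a shorter range for a truncated tile.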
In fact, the second half of my work would have been to design ways of storing the index + pyramidal tiles into one of the existing image containers mentioned in this thread, or if that's not possible, then defining a new container / directory structure.
However, I am going to be occupied with other work for the next few weeks. In the meantime, my current code is available in this repo if anyone would like to pick up the baton.
Also for responsive web design: The image may not only be resized (and the screen pixel density taken into account), but also cropped when it is a background image that should offer flexibility for the ratio. In that case only the visible parts of the image should be downloaded, everything outside should be deferred until it may be needed (or the browser idle).
Is there any progress on this feature?
There has barely been any activity on FLIF since 2017 :disappointed:
If you want to address medical or other scientific domains, progressive retrieval is not enough, as images become very large.
You need a tiled storage layout so that progressive retrieval of cropped regions is efficient. You need to be able to quickly fetch and decode a high resolution portion of the image, without requiring the client to decode and store the rest of the image outside the cropped region of interest.
If you generalize to 3D or ND as in #10, this is further complicated by the region of interest being 3D or ND...
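Selecting the tiles that intersect a region of interest works the same way in any number of dimensions. The sketch below assumes a regular tile grid aligned to the origin; the function name and the argument layout are hypothetical.

```python
from itertools import product

def tiles_for_region(start, size, tile):
    """Indices of all tiles intersecting an axis-aligned region.

    start, size, tile are per-axis tuples: region origin, region extent,
    and tile extent. Works for 2D, 3D, or ND alike.
    """
    ranges = [
        range(s // t, (s + n - 1) // t + 1)
        for s, n, t in zip(start, size, tile)
    ]
    return list(product(*ranges))

# 2D: a 100 x 10 crop starting at x=500 straddles two 512-pixel tiles.
print(tiles_for_region((500, 0), (100, 10), (512, 512)))  # [(0, 0), (1, 0)]

# 3D: a 64^3 crop inside a volume of 64^3 bricks, offset by half a brick.
print(len(tiles_for_region((32, 32, 32), (64, 64, 64), (64, 64, 64))))  # 8
```

Only the listed tiles need to be fetched and decoded, so the working set stays proportional to the region of interest rather than to the full image.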