Tiff is used for reading GeoTiff and - by extension - COGs. Those all require reading tiffs and me and friends think image/tiff is the place to nag for adding cool features to make our lives easier. Otherwise, we'd all be implementing our own decoders on top of preloaded images and stuff. Especially in the use-case of reading COGs, where concurrent, partial reads of a tiff are needed. This is mainly a discussion starter, but I think tiff::decoder::Image should be polished a bit more and then made public.

Summary

Make Image and other types public to allow for easily extending the Decoder. Also provide examples and implementation of an extended reader.

Motivation

GeoTiff landscape is currently quite fragmented and most implementations are stuck on being able to decode weird tiff/compression types. Those issues actually belong in this crate. I think this crate should either:

provide async/multithreaded support
have an extensible API to be able to implement said support.

then all geo-related functionality (and only that) can be put into georust/geotiff#7. Also libtiff has a multilayered api and based on this comment, I'd assume this library is trying to be somewhat analogous to libtiff. Therefore, exposing the API at multiple levels of abstraction should fit within this crate?

Currently remaining uglyness

The Image struct may need some polishing before being published as a pub struct. Below are some ramblinations on what could be improved before exposing the Image struct.

Currently, creating a Decoder from an Image is not hassle-free: byte_order is in the reader, bigtiff in the decoder (but that is also not a necessary field one the metadata is loaded) and so the implementation of ChunkReader uses all those. This info (byte_order and bigtiff) could be added to the Image struct.

Reading in the chunk offset/length fields in a large COG still takes quite some time, as shown here: This could be circumvented by having an enum like:

/// Enum for partially-loaded tags for chunk_offset and chunk_bytes
pub enum ChunkInfos {
 /// This tag has not been loaded yet, please read in the chunk you need
Uninitialized(Entry), // Entry field for ergonomic initialization
/// This tag has a minority of values read that are not necessarily close to each other
Sparse(Entry, HashMap<u32, u64>),
/// This tag has chunks read that form a sub-rectangle in the larger tiff
/// assumes a rectangle from topleft-botright, where x and y difference (or rather I and J according to [GeoTiff Spec](https://docs.ogc.org/is/19-008r4/19-008r4.html#_device_space_and_geotiff)) is calculated from TileAttributes
Rect{
  entry: Entry,
  topleft: u32,
  botright: u32,
  data: Vec<u64>
},
/// This tag is either entirely loaded, or has loaded enough data to be dense. `0` indicates a missing value.
Dense(Vec<u64>)
}

impl ChunkInfos {
fn get(chunk_index: u32) -> TiffResult<u64> {
  // logics
}

fn retrieve<R: Read + Seek>(chunk_index: u32, reader: R, byte_order: ByteOrder, bigtiff: bool) {//limits: Limits, 
  // more logics?
}

/// Not sure
fn retrieve_async<R: AsyncRead + AsyncSeek + Unpin>(//bla) {
  //more logics.await?
}
}

That would allow for partial reads of these tags. Reading and representation of these tags is directly embedded in this crate, so an extender could not implement this by themselves.

possibly that tiff::Image and image-rs::Image overlap in naming, callign for tiff::TiffImage.
I could probably think of more possible objections, but would need actual feedback first.
The name ChunkReader was co-developed with buzz from https://nurdspace.nl

image-rs / image-tiff

Expose Image and friends to allow for Multithreading implementations #246

Summary

Motivation

Currently remaining uglyness