Serialization-Deserialization support for `DynamicImage`

sunipkm commented 4 months ago

On a client-server type application where the server can attach to a multitude of sources of image data, a DynamicImage is the ideal structure to encapsulate the image data. However, since DynamicImage does not support serialization-deserialization, sending the data from one process to the other is not trivial and requires conversion to custom, intermediate formats.

I have attached an implementation:

use crate::{
    DynamicImage, ImageMetadata, Gray16Image, GrayAlpha16Image,
    GrayAlphaImage, GrayImage, Rgb16Image, Rgb32FImage, RgbImage,
    Rgba16Image, Rgba32FImage, RgbaImage,
};
use serde::{
    de::{self, Visitor},
    ser, Deserialize, Deserializer, Serialize,
};
use std::{
    fmt,
    io::{Read, Write},
};
use base64::{engine::general_purpose::STANDARD_NO_PAD, Engine};

#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
enum DynaColor {
    Luma8,
    LumaA8,
    Rgb8,
    Rgba8,
    Luma16,
    LumaA16,
    Rgb16,
    Rgba16,
    Rgb32F,
    Rgba32F,
}

struct SerialBuffer<'a> {
    data: &'a [u8],
    width: u32,
    height: u32,
    color: DynaColor,
    le: bool,
    meta: Option<&'a ImageMetadata>,
}

struct AllocSerialBuffer {
    data: Vec<u8>,
    width: u32,
    height: u32,
    color: DynaColor,
    le: bool,
    meta: Option<ImageMetadata>,
}

impl<'a> Serialize for SerialBuffer<'a> {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        use flate2::write::ZlibEncoder;
        use serde::ser::SerializeStruct;
        let mut state = serializer.serialize_struct("SerialBuffer", 6)?;
        let mut encoder = ZlibEncoder::new(Vec::new(), flate2::Compression::default());
        encoder
            .write_all(self.data)
            .map_err(|_| ser::Error::custom("Failed to compress data."))?;
        let compressed_data = encoder
            .finish()
            .map_err(|_| ser::Error::custom("Failed to compress data."))?;
        // Base64-encode the compressed bytes so that text formats can carry them.
        let compressed_data = STANDARD_NO_PAD.encode(compressed_data);
        state.serialize_field("data", &compressed_data)?;
        state.serialize_field("width", &self.width)?;
        state.serialize_field("height", &self.height)?;
        state.serialize_field("color", &self.color)?;
        state.serialize_field("le", &self.le)?;
        state.serialize_field("metadata", &self.meta)?;
        state.end()
    }
}

impl<'de> Deserialize<'de> for DynamicImage {
    fn deserialize<D>(deserializer: D) -> Result<DynamicImage, D::Error>
    where
        D: Deserializer<'de>,
    {
        enum Field {
            Data,
            Width,
            Height,
            Color,
            Le,
            Metadata,
        }

        impl<'de> Deserialize<'de> for Field {
            fn deserialize<D>(deserializer: D) -> Result<Field, D::Error>
            where
                D: Deserializer<'de>,
            {
                struct FieldVisitor;

                impl<'de> Visitor<'de> for FieldVisitor {
                    type Value = Field;

                    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                        formatter.write_str("`data`, `width`, `height`, `color`, `le`, or `metadata`")
                    }

                    fn visit_str<E>(self, value: &str) -> Result<Field, E>
                    where
                        E: de::Error,
                    {
                        match value {
                            "data" => Ok(Field::Data),
                            "width" => Ok(Field::Width),
                            "height" => Ok(Field::Height),
                            "color" => Ok(Field::Color),
                            "le" => Ok(Field::Le),
                            "metadata" => Ok(Field::Metadata),
                            _ => Err(de::Error::unknown_field(value, FIELDS)),
                        }
                    }
                }

                deserializer.deserialize_identifier(FieldVisitor)
            }
        }

        struct SerialBufferVisitor;

        impl<'de> Visitor<'de> for SerialBufferVisitor {
            type Value = AllocSerialBuffer;

            fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                formatter.write_str("struct SerialBuffer")
            }

            fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
            where
                A: de::SeqAccess<'de>,
            {
                let data: String = seq
                    .next_element()?
                    .ok_or_else(|| de::Error::invalid_length(0, &self))?;
                let width = seq
                    .next_element()?
                    .ok_or_else(|| de::Error::invalid_length(1, &self))?;
                let height = seq
                    .next_element()?
                    .ok_or_else(|| de::Error::invalid_length(2, &self))?;
                let color = seq
                    .next_element()?
                    .ok_or_else(|| de::Error::invalid_length(3, &self))?;
                let le = seq
                    .next_element()?
                    .ok_or_else(|| de::Error::invalid_length(4, &self))?;
                let meta = seq
                    .next_element()?
                    .ok_or_else(|| de::Error::invalid_length(5, &self))?;
                let data = STANDARD_NO_PAD.decode(data.as_bytes()).map_err(|e| {
                    de::Error::custom(format!("Failed to decode base64: {}", e))
                })?;
                Ok(AllocSerialBuffer {
                    data,
                    width,
                    height,
                    color,
                    le,
                    meta,
                })
            }

            fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
            where
                A: de::MapAccess<'de>,
            {
                let mut data: Option<String> = None;
                let mut width = None;
                let mut height = None;
                let mut color = None;
                let mut le = None;
                let mut meta = None;

                while let Some(key) = map.next_key()? {
                    match key {
                        Field::Data => {
                            if data.is_some() {
                                return Err(de::Error::duplicate_field("data"));
                            }
                            data = Some(map.next_value()?);
                        }
                        Field::Width => {
                            if width.is_some() {
                                return Err(de::Error::duplicate_field("width"));
                            }
                            width = Some(map.next_value()?);
                        }
                        Field::Height => {
                            if height.is_some() {
                                return Err(de::Error::duplicate_field("height"));
                            }
                            height = Some(map.next_value()?);
                        }
                        Field::Color => {
                            if color.is_some() {
                                return Err(de::Error::duplicate_field("color"));
                            }
                            color = Some(map.next_value()?);
                        }
                        Field::Le => {
                            if le.is_some() {
                                return Err(de::Error::duplicate_field("le"));
                            }
                            le = Some(map.next_value()?);
                        }
                        Field::Metadata => {
                            if meta.is_some() {
                                return Err(de::Error::duplicate_field("metadata"));
                            }
                            meta = Some(map.next_value()?);
                        }
                    }
                }

                let data = data.ok_or_else(|| de::Error::missing_field("data"))?;
                let data = STANDARD_NO_PAD.decode(data.as_bytes()).map_err(|e| {
                    de::Error::custom(format!("Failed to decode base64: {}", e))
                })?;
                let width = width.ok_or_else(|| de::Error::missing_field("width"))?;
                let height = height.ok_or_else(|| de::Error::missing_field("height"))?;
                let color = color.ok_or_else(|| de::Error::missing_field("color"))?;
                let le = le.ok_or_else(|| de::Error::missing_field("le"))?;
                let meta = meta.ok_or_else(|| de::Error::missing_field("metadata"))?;

                Ok(AllocSerialBuffer {
                    data,
                    width,
                    height,
                    color,
                    le,
                    meta,
                })
            }
        }

        const FIELDS: &[&str] = &["data", "width", "height", "color", "le", "metadata"];
        let res = deserializer.deserialize_struct("SerialBuffer", FIELDS, SerialBufferVisitor)?;

        use DynaColor::*;
        use DynamicImage::*;

        use flate2::read::ZlibDecoder;
        let mut decoder = ZlibDecoder::new(res.data.as_slice());
        let mut data = Vec::new();
        let dlen = decoder
            .read_to_end(&mut data)
            .map_err(|e| de::Error::custom(e.to_string()))?;
        let mut img = match res.color {
            Luma8 => {
                if dlen != res.width as usize * res.height as usize {
                    return Err(de::Error::custom("Data length does not match image size"));
                }
                let img = GrayImage::from_vec(res.width, res.height, data)
                    .ok_or(de::Error::custom("Could not convert"))?;
                ImageLuma8(img)
            }
            LumaA8 => {
                if dlen != res.width as usize * res.height as usize * 2 {
                    return Err(de::Error::custom("Data length does not match image size"));
                }
                let img = GrayAlphaImage::from_vec(res.width, res.height, data)
                    .ok_or(de::Error::custom("Could not convert"))?;
                ImageLumaA8(img)
            }
            Rgb8 => {
                if dlen != res.width as usize * res.height as usize * 3 {
                    return Err(de::Error::custom("Data length does not match image size"));
                }
                let img = RgbImage::from_raw(res.width, res.height, data)
                    .ok_or(de::Error::custom("Could not convert"))?;
                ImageRgb8(img)
            }
            Rgba8 => {
                if dlen != res.width as usize * res.height as usize * 4 {
                    return Err(de::Error::custom("Data length does not match image size"));
                }
                let img = RgbaImage::from_raw(res.width, res.height, data)
                    .ok_or(de::Error::custom("Could not convert"))?;
                ImageRgba8(img)
            }
            Luma16 => {
                if dlen != res.width as usize * res.height as usize * 2 {
                    return Err(de::Error::custom("Data length does not match image size"));
                }
                let data = bytemuck::cast_slice_mut::<u8, u16>(&mut data);
                if little_endian() != res.le {
                    data.iter_mut().for_each(|x| *x = x.swap_bytes());
                }
                let img = Gray16Image::from_vec(res.width, res.height, data.into())
                    .ok_or(de::Error::custom("Could not convert"))?;
                ImageLuma16(img)
            }
            LumaA16 => {
                if dlen != res.width as usize * res.height as usize * 4 {
                    return Err(de::Error::custom("Data length does not match image size"));
                }
                let data = bytemuck::cast_slice_mut::<u8, u16>(&mut data);
                if little_endian() != res.le {
                    data.iter_mut().for_each(|x| *x = x.swap_bytes());
                }
                let img = GrayAlpha16Image::from_vec(res.width, res.height, data.into())
                    .ok_or(de::Error::custom("Could not convert"))?;
                ImageLumaA16(img)
            }
            Rgb16 => {
                if dlen != res.width as usize * res.height as usize * 6 {
                    return Err(de::Error::custom("Data length does not match image size"));
                }
                let data = bytemuck::cast_slice_mut::<u8, u16>(&mut data);
                if little_endian() != res.le {
                    data.iter_mut().for_each(|x| *x = x.swap_bytes());
                }
                let img = Rgb16Image::from_raw(res.width, res.height, data.into())
                    .ok_or(de::Error::custom("Could not convert"))?;
                ImageRgb16(img)
            }
            Rgba16 => {
                if dlen != res.width as usize * res.height as usize * 8 {
                    return Err(de::Error::custom("Data length does not match image size"));
                }
                let data = bytemuck::cast_slice_mut::<u8, u16>(&mut data);
                if little_endian() != res.le {
                    data.iter_mut().for_each(|x| *x = x.swap_bytes());
                }
                let img = Rgba16Image::from_raw(res.width, res.height, data.into())
                    .ok_or(de::Error::custom("Could not convert"))?;
                ImageRgba16(img)
            }
            Rgb32F => {
                if dlen != res.width as usize * res.height as usize * 12 {
                    return Err(de::Error::custom("Data length does not match image size"));
                }
                let data = bytemuck::cast_slice_mut::<u8, u32>(&mut data);
                if little_endian() != res.le {
                    data.iter_mut().for_each(|x| *x = x.swap_bytes());
                }
                let data = bytemuck::cast_slice_mut::<u32, f32>(data);
                let img = Rgb32FImage::from_raw(res.width, res.height, data.into())
                    .ok_or(de::Error::custom("Could not convert"))?;
                ImageRgb32F(img)
            }
            Rgba32F => {
                if dlen != res.width as usize * res.height as usize * 16 {
                    return Err(de::Error::custom("Data length does not match image size"));
                }
                let data = bytemuck::cast_slice_mut::<u8, u32>(&mut data);
                if little_endian() != res.le {
                    data.iter_mut().for_each(|x| *x = x.swap_bytes());
                }
                let data = bytemuck::cast_slice_mut::<u32, f32>(data);
                let img = Rgba32FImage::from_raw(res.width, res.height, data.into())
                    .ok_or(de::Error::custom("Could not convert"))?;
                ImageRgba32F(img)
            }
        };
        if let Some(meta) = res.meta {
            img.set_metadata(meta);
        }
        Ok(img)
    }
}

impl From<&DynamicImage> for DynaColor {
    fn from(dynimage: &DynamicImage) -> Self {
        use DynamicImage::*;
        match dynimage {
            ImageLuma8(_) => DynaColor::Luma8,
            ImageLumaA8(_) => DynaColor::LumaA8,
            ImageRgb8(_) => DynaColor::Rgb8,
            ImageRgba8(_) => DynaColor::Rgba8,
            ImageLuma16(_) => DynaColor::Luma16,
            ImageLumaA16(_) => DynaColor::LumaA16,
            ImageRgb16(_) => DynaColor::Rgb16,
            ImageRgba16(_) => DynaColor::Rgba16,
            ImageRgb32F(_) => DynaColor::Rgb32F,
            ImageRgba32F(_) => DynaColor::Rgba32F,
        }
    }
}

impl<'a> From<&'a DynamicImage> for SerialBuffer<'a> {
    fn from(value: &'a DynamicImage) -> Self {
        let kind: DynaColor = value.into();
        let data = value.as_bytes();
        let meta = value.metadata();
        let width = value.width();
        let height = value.height();
        SerialBuffer {
            data,
            width,
            height,
            color: kind,
            le: little_endian(),
            meta,
        }
    }
}

impl Serialize for DynamicImage {
    /// Serialize the image to a buffer.
    /// The image data is interpreted as machine-endian, compressed
    /// using the Zlib algorithm, then encoded as a base64 string.
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        SerialBuffer::from(self).serialize(serializer)
    }
}

const fn little_endian() -> bool {
    u16::from_ne_bytes([1, 0]) == 1
}

fintelia commented 4 months ago

A substantial portion of this crate is devoted to encoding and decoding image file formats. I don't quite understand the benefit of inventing our own bespoke image format to serialize and deserialize DynamicImages when users can instead use one of the standard formats.

sunipkm commented 4 months ago

The encoding/decoding methods are not ubiquitous and require further setup of the encoders and decoders. Serialization and deserialization are not meant for transporting image data across different devices and architectures, but for ease of programming: for example, I want to capture image data, create a DynamicImage, serialize it, send it over IP to a client, unpack it into a DynamicImage, and continue processing it.

fintelia commented 4 months ago

I suppose I should link this related PR which stalled out, but was trying to add uncompressed serialization/deserialization support.

If we do add serialization, I think it is very likely that people will use it to send images between different devices or architectures. But that's not necessarily a problem if we design it well. And there is a lot to be said for the convenience argument of not having to manually go through the process of encoding and decoding, particularly if the DynamicImage is within a larger structure.

Perhaps the thing to do would be to lean on our existing codecs? Have the wire format actually look like `DynamicImage(Vec<u8>)`, with the blob of bytes being a PNG- or TIFF-encoded image. That would likely get us better compression ratios and faster performance than using flate2 directly. And if we used standard image formats, we wouldn't have to worry about versioning or mismatches between different `image` releases.

sunipkm commented 4 months ago

Using PNG/TIFF is fine as long as the data can be transferred without loss (including channel information etc). However, in my understanding, serialization-deserialization is opaque (correct me if I am wrong). As long as care is taken to account for multi-byte-endianness (which I do), I don't see why the system has to marry an image format.
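
The endianness handling mentioned above can also be done without reinterpreting the byte buffer. The following is a minimal, standard-library-only sketch, not the implementation attached to this issue; the helper names `decode_u16_samples` and `decode_f32_samples` are hypothetical. Each sample is decoded with `from_le_bytes` or `from_be_bytes` according to the sender's byte order, so the result does not depend on the decompressed `Vec<u8>` happening to be aligned for `u16` or `u32` (`bytemuck::cast_slice_mut` panics when the alignment or length does not fit).

```rust
/// Decode raw bytes into `u16` samples. `little_endian` is the byte order
/// the sender used (the `le` flag in the proposed format).
/// Returns `None` if the byte count is not a multiple of two.
fn decode_u16_samples(bytes: &[u8], little_endian: bool) -> Option<Vec<u16>> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(2)
            .map(|pair| {
                let raw = [pair[0], pair[1]];
                if little_endian {
                    u16::from_le_bytes(raw)
                } else {
                    u16::from_be_bytes(raw)
                }
            })
            .collect(),
    )
}

/// Same idea for `f32` samples, four bytes at a time.
/// Returns `None` if the byte count is not a multiple of four.
fn decode_f32_samples(bytes: &[u8], little_endian: bool) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|quad| {
                let raw = [quad[0], quad[1], quad[2], quad[3]];
                if little_endian {
                    f32::from_le_bytes(raw)
                } else {
                    f32::from_be_bytes(raw)
                }
            })
            .collect(),
    )
}
```

Because the byte order is applied while decoding, no separate `swap_bytes` pass is needed, and the same code path works on little-endian and big-endian hosts.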

fintelia commented 4 months ago

Any way of serializing image data is essentially by definition an image format. The only question is whether it is a bespoke format that we've created ourselves or something standardized. Part of the benefit of picking a standard format is that we don't have to design one ourselves. But there are also benefits in terms of making sure that the format isn't accidentally changed between versions of this library (people may serialize with one version and deserialize with the next) and the higher level of testing and optimization we've already done for our existing formats.

sunipkm commented 4 months ago

I see how we can benefit by picking a standard format. As long as PNG (or some other standard format) can accept all the variants of DynamicImage and transfer them in a completely lossless manner, then a standard format is definitely the way to go. But, I think, none of the encoders support F32 images (correct me if I am wrong). Hence the 'simple' serialization-deserialization format, where the data is compressed and encoded alongside ancillary information (width, height, color type, byte endianness).

fintelia commented 4 months ago

The TIFF format supports floating point. If the encoder doesn't currently allow it, it shouldn't be too much work to add.

sunipkm commented 4 months ago

Will take a crack at it.

sunipkm commented 4 months ago

Adding 32F support was straightforward, but the TIFF encoder does not support gray images with an alpha channel.

fintelia commented 4 months ago

The PNG encoder should support all the integer formats (and has generally been better optimized compared to the TIFF encoder). The magic bytes at the start of the file will indicate which format it is.
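
To illustrate the magic-byte point, here is a small standard-library-only sketch of how a deserializer could tell the two containers apart from the first few bytes of the blob. The names `WireFormat` and `sniff_format` are hypothetical and exist only for this example; in the `image` crate, `image::guess_format` performs this kind of detection for all supported formats.

```rust
/// The two containers discussed in this thread.
#[derive(Debug, PartialEq)]
enum WireFormat {
    Png,
    Tiff,
}

/// Inspect the leading bytes of an encoded blob.
/// Returns `None` when the signature is neither PNG nor TIFF.
fn sniff_format(bytes: &[u8]) -> Option<WireFormat> {
    // PNG files start with a fixed 8-byte signature.
    const PNG: &[u8] = &[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];
    // TIFF files start with a byte-order mark followed by the number 42,
    // stored in that byte order.
    const TIFF_LE: &[u8] = &[b'I', b'I', 0x2A, 0x00];
    const TIFF_BE: &[u8] = &[b'M', b'M', 0x00, 0x2A];

    if bytes.starts_with(PNG) {
        Some(WireFormat::Png)
    } else if bytes.starts_with(TIFF_LE) || bytes.starts_with(TIFF_BE) {
        Some(WireFormat::Tiff)
    } else {
        None
    }
}
```

Since the signature identifies the container, the serialized form would not need a separate field recording which codec was used.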

sunipkm commented 4 months ago

PNG for integer images, TIFF for floating point, then.
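
The rule agreed on here can be written down as a single exhaustive match. This is only a sketch of the decision, assuming a color-type enum like the `DynaColor` from the original proposal; `PixelKind`, `Container`, and `container_for` are hypothetical names, and the actual encoding calls are left out.

```rust
/// Stand-in for the pixel layouts that `DynamicImage` can hold.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PixelKind {
    Luma8,
    LumaA8,
    Rgb8,
    Rgba8,
    Luma16,
    LumaA16,
    Rgb16,
    Rgba16,
    Rgb32F,
    Rgba32F,
}

/// Container chosen for the serialized blob.
#[derive(Debug, PartialEq)]
enum Container {
    Png,
    Tiff,
}

/// PNG for every integer layout (including gray with alpha),
/// TIFF for the floating-point layouts.
fn container_for(kind: PixelKind) -> Container {
    match kind {
        PixelKind::Luma8
        | PixelKind::LumaA8
        | PixelKind::Rgb8
        | PixelKind::Rgba8
        | PixelKind::Luma16
        | PixelKind::LumaA16
        | PixelKind::Rgb16
        | PixelKind::Rgba16 => Container::Png,
        PixelKind::Rgb32F | PixelKind::Rgba32F => Container::Tiff,
    }
}
```

Listing every variant instead of using a wildcard arm means that adding a new pixel layout later produces a compile error until a container is chosen for it.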

image-rs / image

Serialization-Deserialization support for `DynamicImage` #2215