jaemk / cached

Rust cache structures and easy function memoization
MIT License

"Error deserializing cached value" when *serializing* with DiskCache #216

Closed by VorpalBlade 3 months ago

VorpalBlade commented 3 months ago

I'm attempting to use DiskCache in a command line program so that I don't have to repeat the slow decompression of data from the system package manager every time I run the command. Unfortunately, I run into some pretty strange errors when inserting into the disk cache:

2024-07-26T13:29:29.542867Z ERROR scan_fs: paketkoll_cache::from_archives: Cache set failed: Cache set failed: pkg=pkg-config cache_key=Debian:pkg-config:arm64:1.8.1-1:pkg-config#pkg-config:arm64

Caused by:
    0: Error deserializing cached value
    1: invalid type: string "48439ad7b7151d6bed0abea4d746332fa3891712f227dfbf720050479812c006", expected an array of length 32

Strangely, this doesn't happen consistently: the first few dozen inserts always seem to succeed on any given run, and the specific hex string varies. I don't understand why I'm getting a DiskCacheError::CacheDeserializationError when inserting into the cache. Shouldn't it be a serialization error instead?

The type I'm inserting is fairly complex. I'm using a DiskCache<CacheKey, Vec<FileEntryCache>> (without the macro decorators, since they don't work for my use case), where:

/// A file entry from the package database
#[derive(Debug, serde::Serialize, serde::Deserialize)]
pub struct FileEntryCache {
    /// Package this file belongs to
    pub path: PathBuf,
    pub properties: Properties,
    pub flags: FileFlags,
}

bitflags::bitflags! {
    /// Bitmask of flags for a file entry
    #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, serde::Serialize, serde::Deserialize)]
    pub struct FileFlags : u16 {
        /// This file is considered a configuration file by the package manager
        const CONFIG = 0b0000_0000_0000_0001;
        /// It is OK if this file is missing (currently only relevant for systemd-tmpfiles)
        const OK_IF_MISSING = 0b0000_0000_0000_0010;
    }
}

/// File properties from the package database(s)
#[derive(Debug, Clone, PartialEq, Eq, Hash, serde::Serialize, serde::Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum Properties {
    /// A regular file with just checksum info (as Debian gives us)
    RegularFileBasic(RegularFileBasic),
    /// A regular file with info that systemd-tmpfiles provides
    RegularFileSystemd(RegularFileSystemd),
    /// A regular file with all info (as Arch Linux has)
    RegularFile(RegularFile),
    Symlink(Symlink),
    Directory(Directory),
    Fifo(Fifo),
    DeviceNode(DeviceNode),
    /// This is some unknown thing that is not a file, symlink or directory
    /// (Currently generated in theory by the Arch Linux backend, but no actual
    /// packages have this from what I can tell.)
    Special,
    /// An entry that shouldn't exist (being actively removed).
    /// (Currently only systemd-tmpfiles.)
    Removed,
    /// If the package management system doesn't give us enough info,
    /// all we know is that it should exist.
    Unknown,
    /// We don't know what it is, just what permissions it should have.
    /// (Currently only systemd-tmpfiles.)
    Permissions(Permissions),
}

/// A regular file with just checksum info (as Debian gives us)
#[derive(Debug, Clone, PartialEq, Eq, Hash, serde::Serialize, serde::Deserialize)]
pub struct RegularFileBasic {
    pub size: Option<u64>,
    pub checksum: Checksum,
}

/// A regular file with all info (as Arch Linux has)
#[derive(Debug, Clone, PartialEq, Eq, Hash, serde::Serialize, serde::Deserialize)]
pub struct RegularFile {
    pub mode: Mode,
    pub owner: Uid,
    pub group: Gid,
    pub size: u64,
    pub mtime: SystemTime,
    pub checksum: Checksum,
}

// And so on

I don't have a minimal reproducer at this time. What is the next step for debugging this?

VorpalBlade commented 3 months ago

Oh, I think I found it (though some indication of which field or struct was being deserialised would have been useful): one of my structures had a serialize_with but no matching deserializer, because until now it had only been used to output JSON.
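To make the asymmetry concrete, here is a minimal std-only sketch of a one-sided custom writer. It does not use serde at all: `Encoded`, `encode_checksum` and `decode_checksum` are hypothetical names invented for illustration, and the 32-byte checksum field is an assumption based on the error message above, not the project's actual code.

```rust
// Std-only model of a one-sided `serialize_with`; no serde involved.
// All names here are hypothetical and exist only for illustration.

/// Stand-in for a serialized value: either a string or a byte sequence.
#[derive(Debug, PartialEq)]
pub enum Encoded {
    Str(String),
    Bytes(Vec<u8>),
}

/// Custom writer, playing the role of a `serialize_with` function:
/// emits the 32-byte checksum as a 64-character hex string.
pub fn encode_checksum(sum: &[u8; 32]) -> Encoded {
    Encoded::Str(sum.iter().map(|b| format!("{b:02x}")).collect())
}

/// Default-style reader, playing the role of a derived deserializer:
/// accepts only a sequence of exactly 32 bytes.
pub fn decode_checksum(value: &Encoded) -> Result<[u8; 32], String> {
    match value {
        Encoded::Bytes(bytes) => <[u8; 32]>::try_from(bytes.as_slice()).map_err(|_| {
            format!(
                "invalid length {}, expected an array of length 32",
                bytes.len()
            )
        }),
        Encoded::Str(s) => Err(format!(
            "invalid type: string {s:?}, expected an array of length 32"
        )),
    }
}
```

Feeding the output of `encode_checksum` into `decode_checksum` fails with a message of the same shape as the one in the log, while a plain 32-byte sequence round-trips. With only a writer customised, output-only formats such as JSON dumps never expose the mismatch.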

Still no clue why I was getting a deserialization error when serializing.
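
One possible explanation, offered as an unverified guess rather than a diagnosis: if the insert operation returns the previously stored value for the key, it has to decode the old entry, so a broken round trip surfaces as a deserialization error during a set. The toy store below models only that idea. `ToyDiskCache`, `encode` and `decode` are hypothetical, and nothing here is taken from the cached crate's implementation.

```rust
use std::collections::HashMap;

/// Hypothetical toy store; this is not the `cached` crate's `DiskCache`.
pub struct ToyDiskCache {
    /// Stored form: the text produced by `encode`.
    raw: HashMap<String, String>,
}

/// Writer: stores the number as hex text
/// (stand-in for a custom `serialize_with`).
fn encode(value: u32) -> String {
    format!("{value:x}")
}

/// Reader: parses decimal text (stand-in for a derived deserializer
/// that knows nothing about the custom writer).
fn decode(raw: &str) -> Result<u32, String> {
    raw.parse::<u32>()
        .map_err(|e| format!("Error deserializing cached value: {e}"))
}

impl ToyDiskCache {
    pub fn new() -> Self {
        ToyDiskCache { raw: HashMap::new() }
    }

    /// Inserts `value` and returns the previous value for `key`, if any.
    /// Returning the previous value forces a decode of the old entry,
    /// which is where the mismatch between writer and reader shows up.
    pub fn set(&mut self, key: &str, value: u32) -> Result<Option<u32>, String> {
        match self.raw.insert(key.to_string(), encode(value)) {
            Some(old) => decode(&old).map(Some),
            None => Ok(None),
        }
    }
}
```

In this model the first `set` for a key succeeds because there is no old entry to decode, and only a later `set` on the same key fails. On a cache that persists between runs, that would also make the failures depend on which keys were already stored, which may or may not be what happened here.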