MartinquaXD closed this 2 months ago
I think https://docs.rs/serde_json/1/serde_json/value/struct.RawValue.html is what you are looking for.
-pub struct SerializationCache(Arc<ArcSwap<DashMap<usize, Arc<str>>>>);
+pub struct SerializationCache(Arc<ArcSwap<DashMap<usize, Arc<serde_json::value::RawValue>>>>);
- pub fn get_cached_or_serialize(&self, dto: &Arc<impl Serialize>) -> Arc<str> {
+ pub fn get_cached_or_serialize(&self, dto: &Arc<impl Serialize>) -> Arc<serde_json::value::RawValue> {
- .or_insert_with(|| serde_json::to_string(&dto).unwrap().into())
+ .or_insert_with(|| serde_json::value::to_raw_value(&dto).unwrap().into())
impl<T: Serialize> Serialize for CachedSerialization<T> {
    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        self.cache
            .get_cached_or_serialize(&self.value)
            .serialize(serializer)
    }
}
That's exactly what I needed. Thanks a lot! 🙇
I have a use case where I serialize complex data structures a lot. Some sections of the struct to serialize are unique, but others are identical most of the time. Right now my program spends a lot of time serializing the same duplicate data over and over again. Is there a good way to serialize the identical data once and simply reuse the already serialized data to speed up subsequent serializations?

Since I make significant use of `serde_derive` (and would like to keep it that way), my idea was a cache that maps pointers of objects to their serialized strings, plus a wrapper around an `Arc`ed serializable struct that also implements `Serialize`, but in a cached manner. Whenever the wrapper gets serialized, it should first check the cache to see whether the contained struct has already been serialized and, if so, just pipe the cached value into the `Serializer`.

For reference, I was able to make something compile that has the API I would like, but it makes use of `serde_transcode`. My understanding is that `serde_transcode` would deserialize the cached string and pipe it into the `Serializer`. I have not benchmarked this approach yet, but if I'm not mistaken it probably even adds overhead, since serializing a struct is surely faster than deserializing an equivalent JSON string, right?

Is something like this possible with `serde`, and if so, what would be the best approach? Any suggestions or ideas are greatly appreciated. 🙏

Reference code for the idea
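```rust
use {
    arc_swap::ArcSwap,
    dashmap::DashMap,
    serde::Serialize,
    serde_json::Deserializer,
    std::sync::Arc,
};

/// Maps the address of an `Arc`ed value to its serialized JSON string.
#[derive(Default, Clone)]
pub struct SerializationCache(Arc<ArcSwap<DashMap<usize, Arc<str>>>>);

impl SerializationCache {
    pub fn get_cached_or_serialize(&self, dto: &Arc<impl Serialize>) -> Arc<str> {
        // NOTE: most of this body was lost in the original post. Keying by the
        // `Arc`'s pointer address is an assumption based on the description.
        let key = Arc::as_ptr(dto) as usize;
        let map = self.0.load();
        // Serializing an `Arc<T>` requires serde's `rc` feature.
        let serialized = map
            .entry(key)
            .or_insert_with(|| serde_json::to_string(&dto).unwrap().into())
            .value()
            .clone();
        serialized
    }
}

/// Wrapper whose `Serialize` impl consults the cache first.
pub struct CachedSerialization<T> {
    value: Arc<T>,
    cache: SerializationCache,
}

impl<T> CachedSerialization<T> {
    pub fn new(value: Arc<T>, cache: SerializationCache) -> Self {
        Self { value, cache }
    }
}

impl<T: Serialize> Serialize for CachedSerialization<T> {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        let serialized = self.cache.get_cached_or_serialize(&self.value);
        // This probably doesn't improve performance after all.
        // My understanding is that this first parses the cached string and then serializes it.
        //
        // What magic incantation do I have to put here to make serialization using the cached value optimal?
        let mut deserializer = Deserializer::from_reader((*serialized).as_bytes());
        serde_transcode::transcode(&mut deserializer, serializer)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[derive(Serialize)]
    struct Inner {
        a: String,
        b: usize,
    }

    #[derive(Serialize)]
    struct CachingOuter {
        values: Vec<CachedSerialization<Inner>>,
    }

    #[derive(Serialize)]
    struct Outer {
        values: Vec<Inner>,
    }

    #[test]
    fn cached_serialization() {
        let cache = SerializationCache::default();
        let vanilla = Outer {
            values: vec![Inner {
                a: "someValue".into(),
                b: 123,
            }],
        };
        let cached = CachingOuter {
            values: vec![CachedSerialization::new(
                Inner {
                    a: "someValue".into(),
                    b: 123,
                }
                .into(),
                cache.clone(),
            )],
        };
        let vanilla = serde_json::to_string(&vanilla).unwrap();
        let cached = serde_json::to_string(&cached).unwrap();
        assert_eq!(vanilla, cached);
    }
}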
```
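The pointer-keyed cache idea from the question can also be sketched with the standard library alone. This is an illustration under assumptions, not the code from this thread: `PointerCache` and `get_or_render` are hypothetical names, a `Mutex<HashMap>` stands in for `ArcSwap<DashMap>`, and a plain closure stands in for the actual serializer call.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical std-only cache keyed by the address an `Arc` points to.
#[derive(Default)]
struct PointerCache {
    entries: Mutex<HashMap<usize, Arc<str>>>,
}

impl PointerCache {
    /// Returns the cached string for `value`, calling `render` only on a miss.
    fn get_or_render<T>(&self, value: &Arc<T>, render: impl FnOnce(&T) -> String) -> Arc<str> {
        // The allocation's address serves as the cache key.
        let key = Arc::as_ptr(value) as usize;
        let mut entries = self.entries.lock().unwrap();
        let cached = entries
            .entry(key)
            .or_insert_with(|| render(&**value).into());
        // Cloning an `Arc<str>` only bumps a reference count.
        Arc::clone(cached)
    }
}

fn main() {
    let cache = PointerCache::default();
    let shared = Arc::new(vec![1, 2, 3]);
    let mut calls = 0;
    for _ in 0..3 {
        let text = cache.get_or_render(&shared, |v| {
            calls += 1;
            format!("{:?}", v)
        });
        assert_eq!(&*text, "[1, 2, 3]");
    }
    // The render closure ran only for the first lookup.
    assert_eq!(calls, 1);
}
```

One caveat with keying by address: once an `Arc` is dropped, its address can be reused by a later allocation, so a long-lived cache of this shape would need some way to evict stale entries.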