One relatively easy workaround for serialization is coercing to a slice:
#[derive(Serialize)]
struct S {
    // `<[_]>::serialize` coerces the array to a slice, whose Serialize
    // impl exists for any length (slices serialize as a sequence).
    #[serde(serialize_with = "<[_]>::serialize")]
    arr: [u8; 256],
}
Deserialization is still annoying I think.
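(On recent Rust, 1.48 or later, one deserialization counterpart is to round-trip through a Vec and convert. This is only a sketch: deserialize_array is a hypothetical helper, and it allocates, unlike the tuple-based approach later in the thread.)

use serde::{Deserialize, Deserializer};
use std::convert::TryInto;

fn deserialize_array<'de, D, const N: usize>(deserializer: D) -> Result<[u8; N], D::Error>
where
    D: Deserializer<'de>,
{
    // Deserialize into a Vec first (this allocates), then convert;
    // TryFrom<Vec<T>> for [T; N] hands the Vec back on a length mismatch.
    let vec = Vec::<u8>::deserialize(deserializer)?;
    vec.try_into().map_err(|v: Vec<u8>| {
        serde::de::Error::custom(format!("expected {} bytes, got {}", N, v.len()))
    })
}

#[derive(Deserialize)]
struct S {
    #[serde(deserialize_with = "deserialize_array")]
    arr: [u8; 256],
}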
Hey folks, this feature is important to me because I'd like to be able to serialize a 512-bit hash (so, 64 bytes), but since the serde impls only go up to [u8; 32], I cannot serialize a [u8; 64].
As workarounds I'm considering using [[u8; 32]; 2], GenericArray, or just lazily using a Box<[u8]>. I'm intrigued by the workaround shown above. @dtolnay, did you ever find a deserialization workaround?
Would it be okay to add impls up to 64? Or is there some reason that hasn't been done?
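(For comparison, the Box<[u8]> fallback mentioned above needs no helper at all. A minimal sketch, with Hash512 as a hypothetical name; the trade-off is that the length leaves the type and must be checked at runtime.)

use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
struct Hash512 {
    // Serializes as an ordinary sequence; any length round-trips,
    // so validate bytes.len() == 64 after deserializing if it matters.
    bytes: Box<[u8]>,
}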
In the meantime, perhaps we should add impls for the sizes that arrayvec provides?
impl<T> Array for [T; 40]
impl<T> Array for [T; 48]
impl<T> Array for [T; 50]
impl<T> Array for [T; 56]
impl<T> Array for [T; 64]
impl<T> Array for [T; 72]
impl<T> Array for [T; 96]
impl<T> Array for [T; 100]
impl<T> Array for [T; 128]
impl<T> Array for [T; 160]
impl<T> Array for [T; 192]
impl<T> Array for [T; 200]
impl<T> Array for [T; 224]
impl<T> Array for [T; 256]
impl<T> Array for [T; 384]
impl<T> Array for [T; 512]
impl<T> Array for [T; 768]
impl<T> Array for [T; 1024]
impl<T> Array for [T; 2048]
impl<T> Array for [T; 4096]
impl<T> Array for [T; 8192]
impl<T> Array for [T; 16384]
impl<T> Array for [T; 32768]
impl<T> Array for [T; 65536]
@clarcharr I would prefer to stick with what the standard library does, which is 0 to 32 (inclusive).
Here is a workaround for deserializing.
#[macro_use]
extern crate serde_derive;
extern crate serde;
extern crate serde_json;

use std::fmt;
use std::marker::PhantomData;
use serde::ser::{Serialize, Serializer, SerializeTuple};
use serde::de::{Deserialize, Deserializer, Visitor, SeqAccess, Error};

// A stand-in for the Serialize/Deserialize pair, implemented for the big
// array sizes by the macro below. `#[serde(with = "BigArray")]` resolves
// to `BigArray::serialize` / `BigArray::deserialize`.
trait BigArray<'de>: Sized {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
        where S: Serializer;
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
        where D: Deserializer<'de>;
}

macro_rules! big_array {
    ($($len:expr,)+) => {
        $(
            impl<'de, T> BigArray<'de> for [T; $len]
                where T: Default + Copy + Serialize + Deserialize<'de>
            {
                fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
                    where S: Serializer
                {
                    // Serialize as a tuple rather than a seq so that
                    // fixed-width formats like bincode skip the length prefix.
                    let mut seq = serializer.serialize_tuple(self.len())?;
                    for elem in &self[..] {
                        seq.serialize_element(elem)?;
                    }
                    seq.end()
                }

                fn deserialize<D>(deserializer: D) -> Result<[T; $len], D::Error>
                    where D: Deserializer<'de>
                {
                    struct ArrayVisitor<T> {
                        element: PhantomData<T>,
                    }

                    impl<'de, T> Visitor<'de> for ArrayVisitor<T>
                        where T: Default + Copy + Deserialize<'de>
                    {
                        type Value = [T; $len];

                        fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                            formatter.write_str(concat!("an array of length ", $len))
                        }

                        fn visit_seq<A>(self, mut seq: A) -> Result<[T; $len], A::Error>
                            where A: SeqAccess<'de>
                        {
                            // T: Default + Copy lets us build a fully
                            // initialized array up front, then overwrite it.
                            let mut arr = [T::default(); $len];
                            for i in 0..$len {
                                arr[i] = seq.next_element()?
                                    .ok_or_else(|| Error::invalid_length(i, &self))?;
                            }
                            Ok(arr)
                        }
                    }

                    let visitor = ArrayVisitor { element: PhantomData };
                    // A tuple of $len elements, matching serialize above.
                    deserializer.deserialize_tuple($len, visitor)
                }
            }
        )+
    }
}

big_array! {
    40, 48, 50, 56, 64, 72, 96, 100, 128, 160, 192, 200, 224, 256, 384, 512,
    768, 1024, 2048, 4096, 8192, 16384, 32768, 65536,
}

#[derive(Serialize, Deserialize)]
struct S {
    #[serde(with = "BigArray")]
    arr: [u8; 64],
}

fn main() {
    let s = S { arr: [1; 64] };
    let j = serde_json::to_string(&s).unwrap();
    println!("{}", j);
    serde_json::from_str::<S>(&j).unwrap();
}
As long as you're not working with primes:
#[derive(Serialize, Deserialize, Debug)]
struct MyStruct {
    data: [[u8; 32]; 16],
}

impl MyStruct {
    fn data(&self) -> &[u8; 512] {
        use std::mem::transmute;
        // Sound here: [[u8; 32]; 16] and [u8; 512] have identical size,
        // alignment 1, and no padding.
        unsafe { transmute(&self.data) }
    }
}
This is a pretty neat workaround for when you never expect a human to read the serialized form (e.g. bincode), since it nests the array. Added bonus: it also works for Debug, PartialEq, etc.
FWIW, I use this:
use serde::{Serialize, Serializer};

pub fn serialize_array<S, T>(array: &[T], serializer: S) -> Result<S::Ok, S::Error>
    where S: Serializer, T: Serialize
{
    // The array coerces to a slice, which serializes as a sequence.
    array.serialize(serializer)
}

#[macro_export]
macro_rules! serde_array {
    ($m:ident, $n:expr) => {
        // One module per size, usable as `#[serde(with = "...")]`.
        pub mod $m {
            use std::{ptr, mem};
            use serde::{Deserialize, Deserializer, de};
            pub use $crate::serialize_array as serialize;
            use super::*;

            pub fn deserialize<'de, D, T>(deserializer: D) -> Result<[T; $n], D::Error>
                where D: Deserializer<'de>, T: Deserialize<'de> + 'de
            {
                // Deserialize into a Vec first, then move the elements into
                // an uninitialized array. (mem::uninitialized is deprecated
                // on modern Rust; MaybeUninit is the replacement.)
                let slice: Vec<T> = Deserialize::deserialize(deserializer)?;
                if slice.len() != $n {
                    return Err(de::Error::custom("input slice has wrong length"));
                }
                unsafe {
                    let mut result: [T; $n] = mem::uninitialized();
                    for (src, dst) in slice.into_iter().zip(&mut result[..]) {
                        ptr::write(dst, src);
                    }
                    Ok(result)
                }
            }
        }
    };
}

serde_array!(a64, 64);
serde_array!(a120, 120);
serde_array!(a128, 128);
serde_array!(a384, 384);
And then:
#[derive(Serialize, Deserialize)]
struct Foo {
    #[serde(with = "a128")]
    bar: [f32; 128],
}
I do not plan to implement the workaround from heapsize_derive. I would prefer to see something like https://github.com/serde-rs/serde/issues/631#issuecomment-322677033 provided in a crate.
@dtolnay do I have your permission to publish this in a crate? You will be credited as co-author.
Yes go for it! Thanks.
Thanks! Published: https://github.com/est31/serde-big-array | https://crates.io/crates/serde-big-array
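(For reference, usage with recent versions of the crate looks like this; a sketch based on the crate's README, so check its docs for the current API.)

use serde::{Serialize, Deserialize};
use serde_big_array::BigArray;

#[derive(Serialize, Deserialize)]
struct S {
    // The crate's BigArray trait plugs into serde's `with` attribute.
    #[serde(with = "BigArray")]
    arr: [u8; 64],
}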
@dtolnay what do you think, does moving it into the serde-rs org make sense?
+1 on moving this into serde-rs.
The ability to serialize/deserialize arrays longer than 32 elements should be a core feature. I'd use it for sure.
@dtolnay I do think we should consider changing the derive macro to support it instead. I'd rather have it work out of the box if possible.
I posted a request for implementation of a slightly different approach: https://github.com/dtolnay/request-for-implementation/issues/17.
To all the people in this thread hoping that const generics will resolve this: when trying to port serde to const generics, I came across the problem that Serialize and Deserialize are implemented for arrays of size 0 on all types, without requiring Serialize or Deserialize on the element type itself. See commit 6388019ad4840a1b5c515ffc353e6a4f2df3adc3, which introduced it. This is a major hurdle, as serde is of course no longer in the business of making breaking changes. So we'll have to wait for a language improvement that allows something like impl<T, const N: usize> Serialize for [T; N] where N > 0, or for specialization, before it can be fixed in serde proper.
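A minimal illustration of the overlap, using a hypothetical trait in place of Serialize (rustc rejects this pair of impls with E0119, conflicting implementations):

trait MySerialize {}

// What serde has shipped since the commit above: a length-0 impl
// for every element type, with no bound on T.
impl<T> MySerialize for [T; 0] {}

// A blanket const-generics impl overlaps with the one above at N = 0,
// so this does not compile:
impl<T: MySerialize, const N: usize> MySerialize for [T; N] {}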
This prolongs the lifetime of the serde-big-array crate until such a fix lands in the stable language, which could be well into the next decade. I'm also currently researching whether serde-big-array can at least avoid requiring you to specify array sizes: https://github.com/est31/serde-big-array/issues/3.
Servo does this to support big arrays in one of their proc macros:
https://github.com/servo/heapsize/blob/44e86d6d48a09c9cbc30a122bc8725b188d017b2/derive/lib.rs#L36-L41
Let's do the same but only if the size of the array exceeds our biggest builtin impl.
Thanks @nox.