cloudwego / sonic-rs

A fast Rust JSON library based on SIMD.
https://crates.io/crates/sonic-rs
Apache License 2.0
487 stars 31 forks

deserializing map key as type &[u8] shows different behaviors compared to serde_json #99

Closed zhongxinghong closed 3 months ago

zhongxinghong commented 3 months ago

basic info

deps:

# Cargo.toml

[dependencies]
serde = "1.0.204"
serde_json = "1.0.120"
sonic-rs = "0.3.8"

unit-test cases:

// tests/sonic_test.rs

#[derive(PartialEq, Debug)]
struct OneKV {
    key: Vec<u8>,
    value: i64,
}

impl<'de> serde::de::Deserialize<'de> for OneKV {
    fn deserialize<D>(d: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        struct Visitor {}

        impl<'de> serde::de::Visitor<'de> for Visitor {
            type Value = OneKV;

            fn expecting(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
                f.write_str("a map")
            }

            fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
            where
                A: serde::de::MapAccess<'de>,
            {
                let key = map.next_key::<&[u8]>()?.unwrap().to_owned();
                let value = map.next_value::<i64>()?;
                Ok(OneKV { key, value })
            }
        }
        d.deserialize_map(Visitor {})
    }
}

#[test]
fn valid_utf8() {
    let s = r#"{"ab": 1}"#;
    let kv1: OneKV = serde_json::from_str(s).unwrap();
    dbg!(&kv1);
    let kv2: OneKV = sonic_rs::from_str(s).unwrap();
    dbg!(&kv2);
    assert_eq!(kv1, kv2);
}

#[test]
fn invalid_utf8() {
    // {"U+10FFFF+1": 1}
    let b: &[u8] = &[b'{', b'"', 0xF4, 0x90, 0x80, 0x80, b'"', b':', b'1', b'}'];
    let kv1: OneKV = serde_json::from_slice(b).unwrap();
    dbg!(&kv1);
    let kv2: OneKV = sonic_rs::from_slice(b).unwrap();
    dbg!(&kv2);
    assert_eq!(kv1, kv2);
}
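For context on the `invalid_utf8` case: the key bytes `F4 90 80 80` have the shape of a 4-byte UTF-8 sequence, but their payload bits decode to 0x110000, one past the largest Unicode scalar value (U+10FFFF), so the key is not a valid Rust `str`. The sketch below uses only the standard library to show this; the helper names `raw_code_point` and `is_valid_utf8` are hypothetical and are not part of serde, serde_json, or sonic-rs.

```rust
/// Hypothetical helper: extracts the payload bits of a 4-byte UTF-8-shaped
/// sequence (3 + 6 + 6 + 6 bits) without checking that the result is valid.
fn raw_code_point(bytes: [u8; 4]) -> u32 {
    ((bytes[0] as u32 & 0x07) << 18)
        | ((bytes[1] as u32 & 0x3F) << 12)
        | ((bytes[2] as u32 & 0x3F) << 6)
        | (bytes[3] as u32 & 0x3F)
}

/// Hypothetical helper: true when `bytes` is well-formed UTF-8.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

fn main() {
    // Key bytes from the `invalid_utf8` test.
    let key = [0xF4, 0x90, 0x80, 0x80];
    // Encoding of U+10FFFF, the largest valid scalar value, for comparison.
    let max = [0xF4, 0x8F, 0xBF, 0xBF];

    println!("key decodes to {:#X}", raw_code_point(key));
    println!("max decodes to {:#X}", raw_code_point(max));
    println!("key is valid UTF-8: {}", is_valid_utf8(&key));
    println!("max is valid UTF-8: {}", is_valid_utf8(&max));
}
```

Because the key cannot be represented as a `str`, it can only reach the visitor as raw bytes, which is why the test requests the key as `&[u8]`.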

output:

$ RUSTFLAGS='-C target-cpu=native' cargo test --package rust_test --test sonic_test -- --nocapture --test-threads=1    
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.20s
     Running tests/sonic_test.rs (target/debug/deps/sonic_test-59f2a44356095237)

running 2 tests
test invalid_utf8 ... [tests/sonic_test.rs:49:5] &kv1 = OneKV {
    key: [
        244,
        144,
        128,
        128,
    ],
    value: 1,
}
thread 'invalid_utf8' panicked at tests/sonic_test.rs:50:46:
called `Result::unwrap()` on an `Err` value: Invalid UTF-8 characters in json at line 1 column 2

        {"����":1}
        ..^.......

note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
FAILED
test valid_utf8 ... [tests/sonic_test.rs:38:5] &kv1 = OneKV {
    key: [
        97,
        98,
    ],
    value: 1,
}
thread 'valid_utf8' panicked at tests/sonic_test.rs:39:44:
called `Result::unwrap()` on an `Err` value: Invalid JSON value at line 1 column 2

        {"ab": 1}
        ..^......

FAILED

failures:

failures:
    invalid_utf8
    valid_utf8

test result: FAILED. 0 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

error: test failed, to rerun pass `-p rust_test --test sonic_test`

liuq19 commented 3 months ago

Thanks very much, I will investigate it.

liuq19 commented 3 months ago

The problem is fixed; please update to 0.3.9.