jamesmunns / postcard

A no_std + serde compatible message library for Rust
Apache License 2.0
943 stars 91 forks source link

Serializing `NaiveDatetime` on AVR crashes #99

Open rihardsk opened 1 year ago

rihardsk commented 1 year ago

Whenever i try to serialize a chrono::NaiveDateTime or any struct that contains it internally, the program resets, if it's being run on an AVR microcontroller (ATmega328p specifically). It can be as simple as:

let entry = NaiveDateTime::MIN;
// This is where it crashes and the board resets and restarts running from the beginning
let output: postcard::Result<Vec<u8, 64>> = to_vec(&entry);

Serializing other stuff seems to work fine, e.g., this example code from the docs works as expected:

#[derive(Serialize, Deserialize, Debug, Eq, PartialEq)]
struct RefStruct<'a> {
    bytes: &'a [u8],
    str_s: &'a str,
}

let message = "hElLo";
let bytes = [0x01, 0x10, 0x02, 0x20];
let output: postcard::Result<Vec<u8, 11>> = to_vec(&RefStruct {
    bytes: &bytes,
    str_s: message,
});

Also, the same code runs without issue on x64, which makes me suspect a miscompilation on AVR but verifying that seems to be beyond my ability currently (tried looking at the AVR disassembly but there's simply too much going on there).

I'm building the code using the nightly-2023-03-24 toolchain.

Minimal reproducible example

main.rs

#![no_std]
#![no_main]

use chrono::NaiveDateTime;
use serde::{Deserialize, Serialize};

use heapless::Vec;
use panic_halt as _;
use postcard::to_vec;

#[derive(Serialize, Deserialize, Debug, Eq, PartialEq)]
struct RefStruct<'a> {
    bytes: &'a [u8],
    str_s: &'a str,
}

#[arduino_hal::entry]
fn main() -> ! {
    let dp = arduino_hal::Peripherals::take().unwrap();
    let pins = arduino_hal::pins!(dp);

    let mut serial = arduino_hal::default_serial!(dp, pins, 9600);

    ufmt::uwriteln!(&mut serial, "Serial set up, entering loop").unwrap();

    arduino_hal::delay_ms(1000);
    loop {
        let entry = NaiveDateTime::MIN;

        ufmt::uwriteln!(&mut serial, "Converting").unwrap();
        arduino_hal::delay_ms(1000);

        // For some reason this serialization code works ...
        //
        // let message = "hElLo";
        // let bytes = [0x01, 0x10, 0x02, 0x20];
        // let output: postcard::Result<Vec<u8, 11>> = to_vec(&RefStruct {
        //     bytes: &bytes,
        //     str_s: message,
        // });

        // ... but this doesn't
        //
        // NOTE: Not sure how much memory this takes but 64 bytes should be
        // plenty for serialization, i guess
        let output: postcard::Result<Vec<u8, 64>> = to_vec(&entry);

        ufmt::uwriteln!(&mut serial, "Writing").unwrap();
        arduino_hal::delay_ms(1000);
        match output {
            Ok(output) => {
                for byte in output {
                    serial.write_byte(byte);
                }
                serial.flush();
            }
            Err(_e) => ufmt::uwriteln!(&mut serial, "Serialization error").unwrap(),
        }

        arduino_hal::delay_ms(5000);
    }
}

Cargo.toml

[package]
name = "templog-arduino"
version = "0.1.0"
edition = "2021"
license = "MIT OR Apache-2.0"

[[bin]]
name = "templog-arduino"
test = false
bench = false

[dependencies]
panic-halt = "0.2.0"
ufmt = "0.1.0"
embedded-hal = "0.2.3"
serde = { version = "1.0.147", features = ["derive"], default-features = false }
chrono = { version = "0.4.23", features = ["serde"], default-features = false }
heapless = { git = "https://github.com/japaric/heapless" }

[dependencies.postcard]
# this refers to postcard rev 62c05473f76ead781aaa9d4a3e4057b86e543747
# with an updated version of `heapless` dependency
path = "../../postcard"

[dependencies.arduino-hal]
git = "https://github.com/rahix/avr-hal"
rev = "3b8f39fa2ec5e3359c7bedc33d982e75e8cc3700"
features = ["arduino-uno"]

# This doesn't seem to have any effect, don't know why
# [patch.crates-io]
# heapless = { git = 'https://github.com/japaric/heapless' }

# Configure the build for minimal size - AVRs have very little program memory
[profile.dev]
panic = "abort"
lto = true
opt-level = "s"

[profile.release]
panic = "abort"
codegen-units = 1
debug = true
lto = true
opt-level = "s"

Other things to note

I'm using postcard rev 62c05473f76ead781aaa9d4a3e4057b86e543747 with the following diff applied to it to work around https://github.com/jamesmunns/postcard/issues/82

diff --git a/Cargo.toml b/Cargo.toml
index 6498e55..d16e0dd 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -29,7 +29,8 @@ version = "0.2.24"
 optional = true

 [dependencies.heapless]
-version = "0.7.0"
+# version = "0.7.0"
+git = 'https://github.com/japaric/heapless'
 default-features = false
 features = ["serde"]
 optional = true

heapless is at rev 644653bf3b831c6bb4963be2de24804acf5e5001.

jamesmunns commented 1 year ago

I think this code: https://github.com/jamesmunns/postcard/blob/62c05473f76ead781aaa9d4a3e4057b86e543747/src/ser/serializer.rs#L321-L388 (and support for the collect_str method) were added due to issues with chrono (see #32).

If I had to guess, I would suspect an AVR miscompilation of dyn Trait items, which is used by the core::fmt::Write implementation?

rihardsk commented 1 year ago

What do you mean by

dyn Trait items, which is used by the core::fmt::Write implementation

I didn't notice any dyn Trait items when looking around.

I tried to probe this thing some more, but the only thing I discovered, is that NaiveDateTimes core::fmt::Display implementation seems to be working ok on its own. So that makes it somewhat more likely that something in postcard is causing this.

But I'm not sure how to proceed from here. Should I try to somehow inline collect_str implementation and pepper it with ufmt::writeln! calls to find out where exactly it breaks? Or could this be enough info as is, to open an issue on https://github.com/rust-lang/rust and see what they say about this?

jamesmunns commented 1 year ago

As far as I am aware, there are quite a few parts under the hood of the fmt::Write and write!() machinery that may end up using dyn Traits.

An interesting test might be something like:

use core::fmt::Write;

let mut buf: heapless::String<256> = heapless::String::new();
let data: NaiveDateTime = ...; // your test data here
write!(&mut buf, "{}", &data).unwrap();

And see if that works. This would use the same core::fmt::Write machinery I am using in postcard, without actually using postcard.

If that works on other targets (desktop, cortex-m), and NOT on AVR, I'd be pretty convinced it was something wrong with the AVR code generation. If that DOES work on AVR, then I'd certainly be confused!

rihardsk commented 1 year ago

When I said that

the only thing I discovered, is that NaiveDateTimes core::fmt::Display implementation seems to be working ok on its own

I meant that I already tried something similar. This is what i did:

        struct SerialWriter<'a, T> {
            serial: &'a mut T,
        }
        impl<'a, T> Write for SerialWriter<'a, T> where T: uWrite {
            fn write_str(&mut self, s: &str) -> core::result::Result<(), core::fmt::Error> {
                ufmt::uwrite!(&mut self.serial, "{}", s).map_err(|_| core::fmt::Error::default())?;
                Ok(())
            }
        }

        let mut serial_writer = SerialWriter { serial: &mut serial };

        let entry = NaiveDateTime::MIN;
        write!(&mut serial_writer, "{}", entry).unwrap();

        // This outputs:
        // -262144-01-01 00:00:00
        // over the serial connection

this works on AVR.

I tried out your code as well. It produces the same output when printed out with ufmt::uwriteln!(&mut serial, "{}", buf.as_str()).unwrap(). So that probably means that there's something else going on.

Btw, I searched for "AVR miscompilation" in the Rust repo and found this https://github.com/rust-lang/rust/issues/74743 which does indicate that the AVR backend has issues with dyn Trait stuff. But I guess the observations above indicate that this might not be relevant (unless postcard also uses some dyn Trait stuff internally). I didn't find anything else that seems relevant, though.

jamesmunns commented 1 year ago

ufmt explicitly doesn't use dynamic dispatch: https://docs.rs/ufmt/latest/ufmt/

No dynamic dispatch in generated code

It might be worth trying without using ufmt at all to see if you get the same issues. Since core::fmt::Write/writeln!() DOES use dyn dispatch, and ufmt::Write/ufmt::writeln!() DOESN'T, this is a pretty significant difference.

rihardsk commented 1 year ago

Here's how I invoked your code. It doesn't crash.

        use core::fmt::Write;
        let mut buf: heapless::String<256> = heapless::String::new();
        let data: NaiveDateTime = NaiveDateTime::MIN; // your test data here
        // This uses core::fmt::Write
        write!(&mut buf, "{}", &data).unwrap();

        // This uses ufmt::Write. Added for convenience to check the results
        ufmt::uwriteln!(&mut serial, "{}", buf.as_str()).unwrap();

AFAIK I'm exercising core::fmt::Write on the 5th line. The last line uses ufmt::Write but it's only for convenience and doesn't affect what's happening before that, commenting it out has no effect other than me not seeing the value of buf (still no crash without it). Is this what you're asking, or am I misunderstanding something?

Patryk27 commented 1 year ago

btw, the newest nightly (since https://github.com/rust-lang/rust/pull/114048) brought lots of fixes for the AVR backend - it might be worthwhile to recheck on it 🙂

rihardsk commented 1 year ago

Thanks, I'll have a look as soon as I get a chance.