knurling-rs / defmt

Efficient, deferred formatting for logging on embedded systems
https://defmt.ferrous-systems.com/
Apache License 2.0
812 stars 74 forks source link

Add an impl Format for PanicInfo #856

Closed dflemstr closed 2 months ago

dflemstr commented 3 months ago

This impl is useful for panic handlers that use defmt to report panics, for example, panic_probe with the defmt feature.

The default core::fmt implementation for PanicInfo pulls in number formatting, padding support, dyn Debug/Display dependencies, etc. which can cause a lot of unnecessary code to be pulled in just to format PanicInfo.

jonathanpallant commented 3 months ago

The deprecation notice says:

this never returns anything useful

Do you have some example output of a panic handler using this formatter, where it did in fact return something useful?

Going forward, I wonder how we design a handler that works on both <= 1.80 and >= 1.81

jonathanpallant commented 3 months ago

I also asked this on the wg-embedded Zulip channel.

dflemstr commented 3 months ago

I tried to mirror the core code for the stable channel in this case: https://github.com/rust-lang/rust/blob/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/panic/panic_info.rs#L160-L178

From what I can see, the deprecation note was added in the nightly channel when payload() started to literally not return anything useful, ie the NoPayload struct: https://github.com/rust-lang/rust/blob/c3774be7411722d3695de2ab1da9a358d0d5c82c/library/core/src/panic/panic_info.rs#L107-L113

m-ou-se commented 3 months ago

when payload() started to literally not return anything useful

A core::panic::PanicHandler from a panic_handler has never returned anything useful, that behavior not a new thing. Trying to downcast a panic payload from a (no_std) panic_handler has simply always failed.

jonathanpallant commented 3 months ago

@m-ou-se is there a recommendation for how people should render a PanicInfo object without using the Debug impl? In a way that works across Rust versions?

m-ou-se commented 3 months ago

The only stable way today is to Display the PanicInfo itself. That will print the standard "panicked at {location}:\n{message}" message. That will still work in the future.

The only thing we're adding in the future is adding a .message() to get the {message} part separately. (We're not removing any functionality.)

dflemstr commented 3 months ago

To me the root reason for making this change is to remove a dependency on core::fmt for padding and number formatting, which increases binary size significantly for some embedded applications, e.g. I have size constraints to build a firmware that fits in 30 kiB, and number formatting/padding takes at least 3 kiB, which ends up being 10% of the binary.

@m-ou-se any ideas how one could make it possible to implement efficient defmt encoding of PanicInfo given these constraints? Since it is definitely not possible to add a defmt feature to core, I suppose some sort of public API for this would be a requirement?

m-ou-se commented 3 months ago

Yes, that's exactly why I was working on adding core::panic::PanicInfo::message(). ^^

That method will be stable in 1.81, most likely.

dflemstr commented 3 months ago

Haha, well that makes sense! Great! I suppose one has to just patiently wait then...

I guess the reason why I didn't make the connection immediately is that PanicMessage still encapsulates a core::fmt::Arguments which might still pull in the things I mentioned, but do you have any data on whether dead code elimination works better in this case? (Assuming that there are no panics in the binary that need to format integers of course)

Regarding this PR, I guess I should remove the downcast code, but should I also add a nightly feature flag that prints self.message()?

m-ou-se commented 3 months ago

I guess the reason why I didn't make the connection immediately is that PanicMessage still encapsulates a core::fmt::Arguments which might still pull in the things I mentioned

I'm confused, you want to be able to format the panic message without pulling in the code required to format that panic message?

That fmt::Arguments only pulls in whatever is used by the panic message. If that panic message is just a static string literal, it will not pull in any padding/formatting code etc.

But yes, if that panic message contains a formatted number or something (e.g. panic!("{}", my_i32)), that fmt::Arguments will contain a function pointer to <i32 as Display>::fmt.

jonathanpallant commented 3 months ago

Yes, because this is defmt, a deferred formatting mechanism where strings are interned into debug sections and replaced with small integers in the code. What then goes over the logging transport is the numeric ID of the interned string, and a simple byte-serialisation of any fundamental types used as arguments. Where the argument is not a fundamental type, we implement the defmt::Format trait, which explains how to print a type using strings (interned) and fundamental types (copied byte wise).

Thus the whole point of this crate is to bypass the expensive core::fmt machinery.

I guess if the panic message is panic!("x = {x}") we probably can't avoid it, but if the message is panic!("Oh no, not again") then we'd like to.

m-ou-se commented 3 months ago

I guess if the panic message is panic!("x = {x}") we probably can't avoid it, but if the message is panic!("Oh no, not again") then we'd like to.

Yeah, that's my point. Display'ing PanicMessage is fine then. fmt::Arguments for a static string doesn't pull in any formatting code.

For the panic!("x = {x}") case, that can be addressed by https://github.com/rust-lang/rust/pull/104525 (once I find some time to finish it ^^').

dflemstr commented 3 months ago

Hey, great pointers on the nuances between how static msgs are handled vs when they need integer interpolation.

To throw another idea on the stack: could defmt provide an implementation of core::fmt::Write to intercept the integer formatting? Then when you call Display::fmt on an Arguments object that uses integers in its format string, defmt could intercept it at that point to move over the integer formatting to the host too?

Or would that not be worth it, since we wouldn't get a monomorphized version of whatever struct impls core::fmt::Write, so all integer formatting calls would map to the same defmt symbol?

EDIT: I'll start on a separate PR to try to intercept core::fmt::Write

EDIT 2: This probably won't work since all the formatting is entirely encapsulated, ie the only way to format Arguments is to eventually end up at core::fmt::write (the function with a lowercase w), which doesn't allow for intercepting the placeholders and substitute a different formatting approach

dflemstr commented 2 months ago

@m-ou-se do you have any leads for how one would best offload the number formatting from core::fmt here? I'm happy to give this some more thoughts too but would appreciate a decent starting point for what might or might not be possible to land in the core lib, or what might or might not be possible to do with current stable/public APIs.

What we'd want to avoid is to have the number formatting code on the actual device; instead defmt will send the data to a host machine (e.g. in the case of a u16 it would literally send two bytes in binary form to the host machine) and perform the formatting there.

As mentioned above, I see that core::fmt::write takes a Write impl but the Write trait only has write_str, write_char, and write_fmt of course. That means we'd either need to find a way to intercept how specific types are formatted via some trait (with a different trait other than Debug or Display), or else get access to the parts of the core::fmt::Arguments struct where one could discriminate based on the type of each argument part.

We have the core implementation details, where core::fmt::rt::Placeholder and core::fmt::rt::Argument might have the data we'd be looking for, but of course those aren't available in public APIs.

I'd imagine wanting to be able to write a function at some point that does something like:

fn defmt_write_args(f: defmt::Formatter, args: core::fmt::Arguments) {
  for part in args.parts() {
    match part {
      Part::LitStr(s) => defmt::write!("{=str}", s),
      Part::Placeholder(p) => {
        if let Some(v) = p.value().downcast_ref::<u8>() {
          // TODO: if we imagine that `p` has some type similar to `core::fmt::rt::Placeholder`, then
          // params like p.position, p.align, p.precision, p.width could also be sent along to the host here
          // to be used to influence formatting.
          defmt::write!("{=u8}", v);
        } else if let Some(v) = p.value().downcast_ref::<u16>() {
          defmt::write!("{=u16}", v);
        }
        //...
      }
    }
  }
}
dflemstr commented 2 months ago

@Urhengulas thanks for the PR review! As per the above conversation I consider this PR to not be ready to merge since we are waiting to learn more from libcore/waiting for a stable release of PanicInfo::message.

I'm happy to make the changes that you requested once we have some clarity on the above points.

Urhengulas commented 2 months ago

@Urhengulas thanks for the PR review! As per the above conversation I consider this PR to not be ready to merge since we are waiting to learn more from libcore/waiting for a stable release of PanicInfo::message.

I'm happy to make the changes that you requested once we have some clarity on the above points.

Ah okay, I understood that the current version is not ideal but better than nothing. And then once .message gets stabilized we can improve it.

I will convert it to a draft PR. Please un-draft as soon as it is ready for merging again.

dflemstr commented 2 months ago

@Urhengulas ah sure, we can do it that way, to add .message() support at a later point.

For now I completely removed the .payload() code because as mentioned above the downcast will never succeed anyways.

I didn't find a way to update the snapshot tests because it isn't possible to construct a PanicInfo without actually panicking, and I didn't find a reliable way to use catch_unwind or similar. Please advise if you can think of another way of constructing PanicInfo.

As I think this is as good as things can get right now without waiting for the libcore changes, I will again unmark the PR as draft, as in I'm happy to get further review feedback :smile:

Urhengulas commented 2 months ago

@dflemstr said:

I didn't find a way to update the snapshot tests because it isn't possible to construct a PanicInfo without actually panicking, and I didn't find a reliable way to use catch_unwind or similar. Please advise if you can think of another way of constructing PanicInfo.

One (slightly involved) way I see, is to add a new snapshot test binary (probablu firmware/qemu/src/bin/panic_info.rs) and enable the panic_internals feature for it. Then you can use the PanicInfo::internal_constructor (https://doc.rust-lang.org/1.80.0/src/core/panic/panic_info.rs.html#42) to create a panic info and print it.

To add a new snapshot test, just create the binary, add it to the NIGHTLY_SNAPSHOT_TESTS https://github.com/knurling-rs/defmt/blob/5d467c2ef0b8e3d0c6894d5c0a4e643fc073a011/xtask/src/snapshot.rs#L30 and add the required feature (panic_internals) to https://github.com/knurling-rs/defmt/blob/5d467c2ef0b8e3d0c6894d5c0a4e643fc073a011/xtask/src/snapshot.rs#L79-L83

And then you can run it and create the output with cargo xtask test-snapshot panic_info --overwrite.

dflemstr commented 2 months ago

@Urhengulas thanks for the pointers! I had gone down the route of trying to enable the compiler-internal feature flag, but even when I tried to disable all the safeguards:

#![allow(internal_features)]
#![feature(allow_internal_unstable, panic_internals)]

#[allow_internal_unstable(panic_internals)]
fn test_panic_info() {
  //...
}

...it still wouldn't compile. However it was helpful that you explained how to add a separate snapshot test, which made it possible for now to have a separate test binary to let the test panic exactly once and succeed on panic.

dflemstr commented 2 months ago

For the record, I don't know why the semver check is failing, and I think it is unrelated to this PR.