Closed fitzgen closed 5 years ago
You could theoretically install a panic hook. Which is what I have done in one of my wasm projects in order to be able to find out what the panic was about.
My experience (although it is more -emscripten like) is that changing panic_fmt
to something that doesn't touches it's args and building with LTO will remove most of the Display
/Debug
machinery and data from segments
However, there was still problems: sometimes this strings wasn't removed despite the fact panic_fmt
was almost empty and "Arguments" parameter was dead.
As I can remember, I managed to solve this issue by running additional LLVM passes: deadargelim and some globals cleaning pass, over the final .ll
file.
As far as can tell that wasn't emscripten fault.
You could theoretically install a panic hook. Which is what I have done in one of my wasm projects in order to be able to find out what the panic was about.
Can one change the panic hook without resorting to unstable features?
@fitzgen yes, the panic hook API is stable: https://doc.rust-lang.org/stable/std/panic/fn.set_hook.html
Generally I don't object to an option that allows you to turn off all panic related things but there should be an option to keep the panic machinery, especially as we can't emit any source maps on the unknown target yet. I know this is a bit hard given that cargo can't recompile std :/.
Generally I don't object to an option that allows you to turn off all panic related things but there should be an option to keep the panic machinery, especially as we can't emit any source maps on the unknown target yet. I know this is a bit hard given that cargo can't recompile std :/.
Crate authors could optionally set the panic hook based on a cargo feature.
If setting a custom panic hook really does get rid of all this bloat, then we should document how to do this somewhere and then close this issue.
Oh, and devtools show the disassembly, and if you have the "name" section (-g
or debug = true
), also show function names, which should help for debugging.
@fitzgen I'm aware. They don't show the file name and line though.
If setting a custom panic hook really does get rid of all this bloat, then we should document how to do this somewhere and then close this issue.
There are two mechanisms:
panic_fmt
lang item. This does allow you to get rid of most of the panic formatting machinery (that's why it is used in an embedded framework as well) . It requires #[no_std]
and right now it also requires nightly.std::panic::set_hook
function. This one doesn't really optimize out anything. I guess because the hook needs to changed during runtime :).To make the first option work on stable, @japaric has written RFC 2070, but it still has to be implemented (tracking issue).
We can't really use this as we are no #[no_std]
application. Maybe we could make the compiler allow later #[panic_implementation]
invocations to override earlier ones somehow with a smart trick^TM. In order for this to work, LTO still needs to be able to optimize out the now dead code. Maybe this is not really possible. No idea. However, if xargo gets integrated into cargo, we can finally compile the entirety of libstd with panic=abort
which would be the best resolution to this entire question.
However, if xargo gets integrated into cargo, we can finally compile the entirety of libstd with panic=abort which would be the best resolution to this entire question.
I feel like having xargo
integrated into cargo
would solve a few of our other problems and points of contention too.
In the meantime I guess just going #[no_std]
is going to be the best bet in getting the tiniest code footprint? Well, that or manually sniping parts of the .wasm
file.
@fitzgen I'm still working on #12 but once it lands if that does work we should add it there as a resource. Even a whole section on reducing binary size would be good.
However, there was still problems: sometimes this strings wasn't removed despite the fact panic_fmt was almost empty and "Arguments" parameter was dead.
It seems that it is also issue for wasm32-unknown-unknown target. I've created an issue https://github.com/rust-lang/rust/issues/47526.
I'm so glad so many people are having these issues. I've been working on an RFC for a while (since December) on making a fmtless panic, as it is crazy that the core of the language (panic) requires one of the most bloated parts of core. I didn't really work on the RFC seriously as I thought I was the only one with this issue and it wouldn't go anywhere.
I'll start to prioritize it.
Currently the model is to have an opt-in fmt-less panic routine, perhaps just provides source file and line number (which for embedded targets is glorious debugging as is).
To me this is a pretty broad and general issue that's not going to have one solution but rather a lot of different pieces. There's a whole slew of reasons that panicking infrastructure / formatting infrastructure are difficult to remove today. I don't think it's worthwhile to blanket shoot for "remove everything at all costs" because panics are actually quite useful when debugging and such. I feel like it's always going to be true that if you're optimizing for code size (like you are on wasm) you'll be writing Rust differently than you would if you weren't optimizing for code size. Most of the idioms of Rust (not just panics) are not optimized for code size, and that's a debt you need to pay down when optimizing for that.
For me I think a clear first step towards making this issue less painful is to start looking and reorganizing the standard library where possible. There's plenty of locations in the standard library that contain panics when they shouldn't. In some cases LLVM just couldn't optimize it away or in others the code just needs to be restructured to remove the panic entirely. For example the formatting infrastructure should contain zero panics, but I doubt that's the case for today. Additionally I haven't figured out how to push on a Vec
in wasm without having a branch with a panic, that'd be a great thing to fix!
Overall I feel like we shouldn't address the symptom here (panic bloat in a binary) but rather the cause (branches towards having a panic). That'll be a long-lasting solution and benefit basically all targets that Rust has.
I slightly disagree with @alexcrichton here. While I always agree that we should minimize panics where they are not needed (there's no reason to not do this).
Further I agree that there should really be no way of removing panics. They are fundamental to the language and should be kept. However there needs to be a way of implementing a panic that just does something like core::intrinsics::abort()
and there should be almost no panic cost (this is up to the optimizer to delete arguments that are unused, which it does fine with fat LTO).
However fundamentally the internals of Rust have panics which are formatted (that are required and cannot be removed). This is great, for normal use it's nice to see that you indexed X bytes out of bounds or something. However a single use of core::fmt
is extremely expensive, even in the simplest case (see examples below, 1485 bytes increase in code size for a single panic with fat LTO and optimizations). Unless you can remove all panics, for each uniquely formatted type (str, padding, number, float, hex, etc) used in a panic there is going to be a tremendous amount of code introduced. For a moderate sized codebase that I have, this core::fmt
.code
section size increase is near 20 KiB, leading to me not being able to use the msg
format at all in panic_fmt
. I simply cannot touch it without going over code size requirements.
There has to be a solution beyond just reducing panic use, and I think it has to be generic. Not a case-by-case remove some panics here and some panics there.
I think there needs to be an optional panic routine that only takes static strings. This would be a nightmare for backwards compatibility as every panic would need to have a non-formatted alternative. However I think the solution is simpler. Just repr the actual format string. This is ugly, but if you are really that critical of code size then you have to make some sacrifices.
For example with a model like this for a bounds check panic you would get something like:
fn panic_static(msg: &'static str, file &'static str, line: u32, column: u32) -> ! {}
For something like a panic_bounds_check
this msg
string would be literally: index out of bounds: the len is {} but the index is {}
. It's not as useful as having numbers, however it tells you what the issue is (out of bounds), requires no changes of internal Rust code or existing panics, and can easily be deduped by the compiler to only exist once in .data
. Making multiple uses of this panic negligible in .data
size, and have no cost in .code
size as they won't bring in format routines.
Might not be the ideal panic message, but keep in mind without something like this I cannot get any message at all. I have to go entirely off of file and line (which honestly is pretty great for embedded work).
TL;DR: Unless you can avoid all non-&'static str
formats (that's never going to happen), you're going to pay at least about 2 KiB in core::fmt
code cost, and up to 30-40 KiB if you hit some more exotic format types. The cost is not due to the number of panics, but the number of unique panics. Once you've paniced with a numeric format once, it doesn't really cost anything to panic like that again (maybe 20-50 bytes for more .data
and 10-20 bytes for the actual branch for the panic).
For example this code (built with rustc -C lto -O -Z thinlto=off -g main.rs
):
#![no_std]
#![feature(lang_items, start, core_intrinsics)]
#[lang = "panic_fmt"]
#[no_mangle]
pub extern fn rust_begin_panic(msg: core::fmt::Arguments,
_file: &'static str,
_line: u32,
_column: u32) -> ! {
unsafe {
core::ptr::read_volatile(&msg);
core::intrinsics::abort();
}
}
#[lang = "eh_personality"]
#[no_mangle]
pub extern fn rust_eh_personality() {}
#[start]
#[no_mangle]
#[allow(non_snake_case)]
pub fn mainCRTStartup(_argc: isize, _argv: *const *const u8) -> isize
{
panic!("WOO");
}
Has the code distribution:
However by changing this code to have a potential panic due to a bounds check this code massively changes:
#![no_std]
#![feature(lang_items, start, core_intrinsics)]
#[lang = "panic_fmt"]
#[no_mangle]
pub extern fn rust_begin_panic(msg: core::fmt::Arguments,
_file: &'static str,
_line: u32,
_column: u32) -> ! {
unsafe {
core::ptr::read_volatile(&msg);
core::intrinsics::abort();
}
}
#[lang = "eh_personality"]
#[no_mangle]
pub extern fn rust_eh_personality() {}
#[start]
#[no_mangle]
#[allow(non_snake_case)]
pub fn mainCRTStartup(_argc: isize, _argv: *const *const u8) -> isize
{
let x = [1, 2, 3, 4];
x[_argc as usize]
}
Just by panicing with a bounds check added 1485 bytes of code (not data).
-B
I don't think it's worthwhile to blanket shoot for "remove everything at all costs" because panics are actually quite useful when debugging and such.
I think that one should be able to create .wasm
binaries completely stripped of panic and formatting infra, but also have a cargo feature that enables it for debugging and development.
Looks like this should help quite a bit too: https://github.com/rust-lang/rust/issues/54981#issuecomment-443369450 (silent aborts, unfortunately also requires Xargo but it's a good step)
@fitzgen i think we want to track this issue differently- i'm going to close (if i'm wrong tho please reopen and apologies!)
@ashleygwilliams, could you let us know if/when this is being tracked in a different issue?
I am subscribed to this thread because I’m interested in whatever progress is being made on this, but maybe there is another way to stay in the loop?
It makes perfect sense that in a debug build, you should get the full panicking infrastructure, but in a release build, you should be able to enable a feature that replaces all panics with a free call to abort().
Has such a feature been made available since the last update on this issue thread?
Has such a feature been made available since the last update on this issue thread?
Think this is what you were looking for https://github.com/rust-lang/rust/issues/54981#issuecomment-899917784
Yes, building std with panic_immediate_abort
is a solution that now solves this issue.
Even though panicking just translates into a
trap
without any diagnostic messages, we still include tons ofbeing_panic_fmt
etc type code. This has a huge code size footprint: ~75% of my code size afterwasm-gc
!I had to write
wasm-snip
pretty much just for removing panicking and formatting infrastructure. But that is just a stop-gap, not a solution: it is fragile and manual.