Stabilize `intrinsics::abort` in `core`

japaric commented 6 years ago

I would like to see intrinsics::abort stabilized in some form as it's useful in some cases to have panicking behavior (and alloc_error) = abort (minimizes code size and the abort exception / signal can be handled elsewhere or in a upper abstraction layer).

As intrinsics::abort lowers to a trap instruction on most architectures (*) I propose we add a safe wrapper around it in core::arch::$ARCH for the architectures where it does.

(*) On most but not on all. For example, On MSP430 it lowers to a call to the abort function which results in a linker error unless that symbol is defined somewhere in the dependency graph.

(Related: There's std::process::abort which claims to be a safe wrapper around intrinsics::abort but if you at its implementation it is, in fact, not just a wrapper around intrinsics::abort but something completely different.)

@rust-lang/libs what would be needed to move this forward? An full RFC, or just a PR + FCP?

cc @Amanieu @glandium

steveklabnik commented 6 years ago

I've also found it strange that this doesn't exist in core; big 👍 from me.

glandium commented 6 years ago

I'm all for it, but I'm wary about core::arch. AIUI, there is, purposefully, no platform-specific code in libcore at the moment. A specific module for that seems like opening the door to more.

japaric commented 6 years ago

@glandium core::arch already exists on stable. There are architecture specific APIs in it for SIMD instructions.

Amanieu commented 6 years ago

I'm not a big fan of the way intrinsics::abort is used at the moment: it just crashes the program with an obscure error message (Illegal instruction). IMO the proper way to handle this is to use something like #[abort_implementation] to allow customizing the behavior of aborts per platform.

Then again, this is a lot of work for something that is very rarely used (compared to normal panics). It might not be worth the trouble.

alexcrichton commented 6 years ago

Sorry for the delay in getting to this @japaric, but FWIW I'd personally be ok with a PR to add this to an unstable (eventually to-be-stable) location in libstd and then later have a stabilization PR. This doesn't seem like it's RFC-grade to me

japaric commented 6 years ago

@Amanieu this proposal is about stabilizing a way to generate a trap / abort instruction. Your comment seems to be about a user customizable process::abort that's available in core.

@alexcrichton sounds good! The main question here is where in core to put this. IMO, it makes sense to put this in coresimd / core::arch::$ARCH and name it trap, instead of abort, as this is a stable wrapper for the llvm.trap intrinsic that only produces a trap instruction for some architectures (e.g. it doesn't do that for MSP430). Could I get yours and @gnzlbg's opinions on that location?

gnzlbg commented 6 years ago

@japaric I'd put it in core::intrinsics first since the llvm intrinsic it forwards to is "portable", as in, it can be called on all architectures, although it might not do what you expect on all of them. We can just point people here to the llvm docs.

I propose we add a safe wrapper around it in core::arch::$ARCH for the architectures where it does.

So suppose we would put it in core::intrinsics and stabilize it. Would we still need to add those safe wrappers to core::arch or could one be built in a stable third-party crate ? Even if it can be built in a third party crate, it might still make sense to expose a safe wrapper to it somewhere in core, but we don't have to solve that problem right now, and maybe with some experience about how to write that in a crate and use it, we can put a better solution in core later.

japaric commented 6 years ago

I'd put it in core::intrinsics first since the llvm intrinsic it forwards to is "portable"

It's already in intrinsics: core::intrinsics::abort but there's no precedent for stabilizing any API in there ... Hold on, the core::intrinsics module is unstable but e.g. core::intrinsics::transmute and a few other intrinsics are stable -- that's weird: core::intrinsics::transmute(..) is stable but the more idiomatic use core::intrinsics; intrinsics::transmute(..) requires a feature gate.

Yeah, I'm not sure stuff in core::intrinsics is supposed to stabilized. core::intrinsics::transmute may only be stable as a byproduct of stabilizing it as a re-export in core::mem.

I propose we add a safe wrapper around it

This was a typo, i think. I probably meant a stable wrapper around the llvm.trap intrinsic. Whether it's safe or not is not too important (though I don't see a reason why abort should be unsafe); the important part is that it becomes a stable API.

So suppose we would put it in core::intrinsics and stabilize it.

If that's OK then, yeah, we don't need anything else in core::arch, but like I said above I don't think we are supposed to tell people to use core::intrinsics on stable. My proposal is assuming that core::intrinsics is not the right place to stabilize this API in.

japaric commented 6 years ago

And as a data point: for these architectures llvm.trap lowers to an instruction: ARM{,BE}, AArch64, i686, mips{,64,64el}, powerpc{,64,64le}, nvptx64, sparc64, x86_64 and wasm32.

For MSP430 and RISCV32 llvm.trap lowers to a call to the abort function, which is not defined in core or compiler_builtins so it can cause a linker error.

AFAIK, there's no precedent for a stable API in core that can produce linker errors when compiled for some architectures. This is way I'd prefer not to make a stable llvm.trap available to all targets.

gnzlbg commented 6 years ago

Do clang and GCC export these somehow? I've used a __builtin_trap intrinsic in C and C++, but in some platforms that actually lowered to 1 / 0...

We could re-export it in core::arch for all architectures in which it does something meaningful too. But @alexcrichton maybe the libs team should discuss what's the plan for core::intrinsics, whether its for unstable stuff only, or whether parts of it could be stabilized. If not, where could we put this?

cc @eddyb @sunfishcode does Cretonne have a trap intrinsic? If so, does it do the same thing as llvm.trap? Also, where would you put this?

SimonSapin commented 6 years ago

Libs team hat on: I’d rather not stabilize the intrinsics module. I view it as an implementation detail that (ideally) could be removed entirely.

In the past, when stabilizing an intrinsic, we’ve added a re-export or wrapper in another (stable) module.

That core::intrinsics::transmute(..) works on stable is a bug: https://github.com/rust-lang/rust/issues/15702. We clarified in https://github.com/rust-lang/rfcs/pull/2405 that the stability promise does not extend to features that are accidentally available on the stable channel.

nagisa commented 6 years ago

For MSP430 and RISCV32 llvm.trap lowers to a call to the abort function, which is not defined in core or compiler_builtins so it can cause a linker error.

First, lowering into abort() function is just a "default" that generic lowering code in LLVM performs for targets that do not implement lowering of their own. At least RISCV has a dedicated trap instruction in its ISA, so that’s just a shortcoming of the backend that can readily be fixed.

Additionally, there is a function attribute called trap-func-name that when specified will cause all @llvm.trap()s in that function to lower into that specific function call instead, making it possible for us to easily make an override for targets that for some reason do not implement lowering themselves.

Libs team hat on: I’d rather not stabilize the intrinsics module. I view it as an implementation detail that (ideally) could be removed entirely.

Think of intrinsics of a name that could potentially be given up for functions like this (and implementation details becoming private/moving elsewhere). For unreachable specifically, we ended up deciding to create a std::hint module, despite my foresight that it will want to live alongside abort/trap. Since trap does not fit into hint, we can put this function somewhere else, sure. Except there are few or no places where it would fit in thematically, so we might end up creating yet another module that will likely contain just this function.

japaric commented 6 years ago

First, lowering into abort() function is just a "default" that generic lowering code in LLVM performs for targets that do not implement lowering of their own. (..)

Are you suggesting that we "polyfill" llvm.trap in rustc for targets that don't have a dedicated trap instruction? What should such function do when targeting MSP430? It seems to (but I'm no expert) that 16-bit and 8-bit architecture use all their opcode space so there's no undefined opcode that we could use here.

Also, with this approach we would have to take care not only of MSP430 but also of every new architecture that we add in the future and that has this problem (possibly, AVR).

alexcrichton commented 6 years ago

I don't personally have a lot of thoughts about where this should go or what to name it, but I agree with @SimonSapin that we should avoid the intrinsics module

eddyb commented 6 years ago

does ~~Cretonne~~ Cranelift have a trap intrinsic? If so, does it do the same thing as llvm.trap? Also, where would you put this?

It has trap as an instruction, including conditional forms.

As for where to place this, definitely not intrinsics. I would also maybe call it trap instead of abort to distinguish it from libc abort.

cr1901 commented 6 years ago

Re: msp430 and llvm.trap: @awygle and I spent an hour or so discussing how we could implement llvm.trap for msp430 on IRC. We couldn't agree on reasonable semantics/implementation, so currently there are no plans to implement llvm.trap." There are multiple reasonable semantics, but their implementation is ugly (short of deferring to a function provided by msp430-rt that the user implements on a per-application basis).

Therefore I'd prefer that we'd not have a stable trap API on msp430; I would prefer something like core::arch::$ARCH::trap.

eddyb commented 6 years ago

Oh, using core::arch is definitely one way to go about doing this that I'd be fine with.

japaric commented 5 years ago

@gnzlbg given the last comments, do you think we could move forward with core::arch::$ARCH::trap?

gnzlbg commented 5 years ago

do you think we could move forward with core::arch::$ARCH::trap?

Yes, this looks fine to me. Will you send a PR for adding it to the architecture-specific submodules in stdsimd?

Does anyone feel strongly about this intrinsic needing an RFC to stabilize or would an FCP be enough to stabilize it for some of the archs at least?

parched commented 5 years ago

Perhaps if it's going in core::arch::$ARCH::trap it should have an architecture specific name e.g. UDF for arm.

gnzlbg commented 5 years ago

Since I expect this functionality (independent of the name) to be provided at least for the current architectures that stdsimd, I'd expect us to use assert_instr here to guarantee a specific instruction sequence that we reliably generate (and document).

So I checked out the instructions generated by llvm.trap for the following architectures supported by stdsimd (https://gcc.godbolt.org/z/PG2zmQ):

x86 and x86_64: ud2
aarch64: brk #0x1
arm: mov pc, lr @ trap or nothing? (i tried here for many cpus and mattrs combinations, and this often generates nothing here)
ppc32, ppc64, and ppc64le: trap
mips32, mips64, mips64el, mips64r6: break
nvptx: trap
wasm32: unreachable

And things are IMO looking good. All architectures that we support (one exception) have a native trap instruction sequence that llvm.trap generates reliably.

The only exception appears to be arm, but maybe I just did not pass the right command flags to llvm? (I only tried -march=arm -mcpu={cpu name here} for many cpus)

parched commented 5 years ago

@gnzlbg on arm it generates an undefined instruction which your disassembler may shown like .inst and won't be there if you filter out directives.

gnzlbg commented 5 years ago

@parched indeed, on arm the assembly contains an .inst 0xe7ffdefe directive.

japaric commented 5 years ago

Yes, this looks fine to me. Will you send a PR for adding it to the architecture-specific submodules in stdsimd?

Done in rust-lang-nursery/stdsimd#571

gnzlbg commented 5 years ago

Heads up: I've merged @japaric 's implementation of the platform specific "abort" intrinsics to stdsimd here: https://github.com/rust-lang-nursery/stdsimd/pull/571

Note that this does not mean that these are in the path towards stabilization. This issue is about a portable way to abort, which had some unresolved questions that made the discussion derail a bit. These questions are:

How portable would such an intrinsic be? all targets? or only some that support it?
If it should be available on all targets, what to do in platforms that don't have a way to trap? e.g. loop {} ?
where in libcore does this intrinsic belong to?

I think these questions are worth resolving, and once we have consensus on these, I can submit a PR implementing a portable/semi-portable way to trap.

In the meantime, what has been merged into nightly is intrinsics for platform-specific trap instructions. These should be usable from libcore soon, and allow emulating portable ways to trap in external crates. The implementation provides was good, and exposing these for experimentation purposes had achieved enough consensus for them to be usable in nightly.

It is still unclear at this point whether we only want to stabilize a portable way to trap, or whether we want to stabilize both, but I think it is worth holding the stabilization of the platform-specific trap instructions (which would need an RFC) until the unresolved questions about the portable trap intrinsic are resolved.

Afterwards, I would welcome a PR to stdsimd with an implementation (or might submit one myself).

At that point, we will probably have two ways to trap in nightly, a platform-specific way, and a portable/semi-portable way, and will have hopefully gathered enough experience with both that the next step would be to write a small RFC about whatever solution "feels right".

Amanieu commented 5 years ago

IMO we should just always call the abort C function instead of messing around with arch-specific instructions. In terms of code size this is practically identical on most platforms (except x86 where you save 4 bytes), and you have the advantage of having a proper abort error message instead of a confusing "Illegal Instruction" error. In environments without std it is trivial to provide an implementation of this function.

SimonSapin commented 5 years ago

In environments without std it is trivial to provide an implementation of this function.

That’s what this issue is about. I don’t know how to provide that implementation without a libc and without messing with architecture-specific instructions though, could you give an example?

Amanieu commented 5 years ago

I meant that we should just leave this symbol undefined and let the user define it (somewhat like panic_handler).

SimonSapin commented 5 years ago

Could you give an example of how you expect a "typical" embedded application to define this symbol? Especially the divergence part. Do we want to recommend loop {}?

Do you mean that defining this symbol would be mandatory for applications that do not link to std? I don’t think we can do that, since https://blog.rust-lang.org/2018/10/25/Rust-1.30.0.html#no_std-applications

Amanieu commented 5 years ago

An alternative would be to do something like this:

#[unwind(aborts)]
pub fn abort() -> ! {
    panic!("abort")
}

This reuses the panic mechanism to print an abort message, and forces an abort during unwinding.

SimonSapin commented 5 years ago

Today, "panic=abort" means that std’s panic handler calls libpanic_abort::__rust_start_panic instead of libpanic_unwind::__rust_start_panic. The former calls libc::abort or intrinsics::abort depending on the platform.

When std is not linked none of the above is relevant and applications need to defined their own panic handler. This is where intrinsics::abort is typically used, after the message-printing is already taken care of (or deliberately skipped).

So adding "per-function panic=abort" doesn’t help with the fact that no_std applications need a way to diverge in their panic handler.

SimonSapin commented 4 years ago

@jhpratt How is this related to abort?

elichai commented 4 years ago

Any updates here? I have a use for this in a no-std no libc library that gets called from C, right now we panic but that will unwind through C and cause UB.

Amanieu commented 4 years ago

Why would panic be UB? If you're using #[no_std] then you're already required to define a panic handler with #[panic_handler] which can call the C abort function.

Lokathor commented 4 years ago

i think they mean that unwinding as a result of panic would be the UB.

elichai commented 4 years ago

I added an emphasis on library :) A user might use whatever panic handler/strategy they want and/or even use regular std binary.

rust-lang / rfcs

Stabilize `intrinsics::abort` in `core` #2512