JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.73k stars 5.48k forks source link

An intrinisic/function/macro like assume #52851

Open gbaraldi opened 10 months ago

gbaraldi commented 10 months ago

One thing that is sometimes useful when writing very optimized code is to tell the compiler that x thing may not happen (basically declaring the behaviour of the function undefined if it doesn't meet the condition), which allows for extra optimizations to happen. We have a version of this in @assume_effects, but as the discussion in https://github.com/JuliaLang/julia/pull/52828 showed, there are some places where we can't get around it.

For a syntax proposal I imagine something similar to the new C++23 assume might be interesting, which for us probably looks like

function foo(x)
    @assume x > 0
    sqrt(x)
end

Which would elide the check in foo. This made me think of the effect preconditions discussion that @Keno proposed a while ago, which would have effects conditional on something, and I guess @assume would be the manual override.

For reference these are the assumes that LLVM has https://llvm.org/docs/LangRef.html#int-assume and https://llvm.org/docs/LangRef.html#assume-opbundles

adienes commented 10 months ago

ref https://github.com/JuliaLang/julia/issues/51729

giordano commented 10 months ago

@nsajko has a package with this functionality: https://gitlab.com/nsajko/UnsafeAssume.jl

nsajko commented 10 months ago

All of C++, Rust and LLVM provide an "unreachable" operation, which can be used for implementing assume, but is simpler to specify:

Once unreachable is available, assume may be implemented like this:

function assume(c::Bool)
  c || unreachable()
  nothing
end

And unreachable itself is basically just implemented as a Core.Intrinsics.llvmcall("unreachable", Union{}, Tuple{}) in the UnsafeAssume.jl package.

Implementing unreachable using the llvmcall currently has downsides, however, as the llvmcall is a black box for Julia. A problem for UnsafeAssume.jl currently is that the dead code elimination only happens at the LLVM stage (see #27547), which means, e.g., that Julia can't prove :nothrow even if the llvmcall causes all relevant exception-related instructions to be eliminated at the LLVM-stage.

But I don't want to stop emitting the unreachable LLVM instruction, even when Julia starts doing more DCE on its own, because there will always be some code that LLVM will be able to eliminate even though Julia was not able to. So I wonder, wouldn't it be easier for the Julia implementation to just recognize the semantics of Core.Intrinsics.llvmcall("unreachable", Union{}, Tuple{}), than to introduce a whole new intrinsic?

That seems like it would provide all the benefits of an unreachable-like Julia-level intrinsic, while not requiring any code changes for Julia users, or maintaining more intrinsics for the Julia implementors.

nsajko commented 10 months ago

Note that Julia already infers the type of Core.Intrinsics.llvmcall("unreachable", Union{}, Tuple{}) as Union{} (i.e., unreachable), so I guess this issue would mostly just reduce to #27547, once Julia is able to see through the llvmcall black box.

PallHaraldsson commented 10 months ago

Rust has: pub const unsafe fn unreachable_unchecked() -> !

Since "unsafe" (UB), maybe name it @unsafe_assume rather than just @assume? Would it be as good to have an assert "check" that you could put outside of loops, that does an actual check not typical assert, thus no UB? We might need both and for the possibility of UB for the other option to be disablable globally with a CLI option, i.e. turn it into an actual check, and allowing that check to be moved outside of loops.

I think we need a module Unsafe (also Aliasing module?) and then we might get away with the @assume name, by having it there, not exported from Base. And @inbounds could go there too (for 2.0)...

Possibly no need to add anything to Julia since there's UnsafeAssume.jl, just document its existence in Julia docs? Then simply close this issue?

nsajko commented 10 months ago

We might need both and for the possibility of UB for the other option to be disablable globally with a CLI option, i.e. turn it into an actual check

Good idea, just like we already have --check-bounds=yes. I'll try adding something like a preference chosen at compile-time to UnsafeAssume.jl.

nsajko commented 10 months ago

Another issue that currently exists with using UnsafeAssume is that the llvmcall interferes with inlining heuristics, because of the "inlining cost" assigned for a llvmcall. If Julia could recognize that specific llvmcall (llvmcall("unreachable", Union{}, Tuple{})), it would make sense for that statement to get an inlining cost of zero.

To sum up, it seems to me something like this may be a good idea (although I don't know much about the Julia compiler or LLVM TBH):

  1. The Julia implementation should recognize Core.Intrinsics.llvmcall("unreachable", Union{}, Tuple{}) in Julia code.

  2. The llvmcall should be deleted from the user's code at an early stage of compilation, to prevent it from interfering with inlining and other optimizations.

  3. The location of the deleted llvmcall should be marked as ::Union{}/unreachable, to facilitate Julia-level DCE (xref. #27547).

  4. Just before passing LLVM IR to LLVM for generating executable code, put an unreachable LLVM instruction at the location where the llvmcall was initially

vtjnash commented 10 months ago

ccall(llvmcall) is probably more accurate there (llvmcall is usually less reliable, since the llvm textual IR format is unstable), but in any case, this is unsafe (undefined behavior), so probably not that much of a great idea to have

nsajko commented 10 months ago

this is unsafe (undefined behavior)

What do you mean? Is that not the very goal of this issue? Are you just referring to the stability of LLVM IR or to something else?

Seelengrab commented 9 months ago

I think this has the potential to be as insidious in creating bugs as unchecked @inbounds is, if not more so, since the result is likely "just" completely wrong and not lead to a fatal segfault. Debugging problems arising from misusing this is going to be extremely difficult.

nsajko commented 9 months ago

Well, obviously. The goal with something like assume or unreachable is letting the programmer introduce undefined behavior.

My package is already available in any case.

KristofferC commented 9 months ago

Wouldn't it be possible to have a (debug) mode where the assumptions are checked?

adienes commented 9 months ago

I still do not understand the difference between this proposal and what @assert (is intended to be), except for reversing the opt-in / opt-out default

nsajko commented 6 months ago

xref #39340 #53705

nsajko commented 6 months ago

As already pointed out, @assert could function like assume if configured (perhaps via Preferences.jl) to assume truth instead of checking and throwing. Another thing that just occurred to me is that typeassert/:: could, in that scenario, get the same treatment, where the user could choose via Preferences.jl to make all type assertions unchecked assumptions.