
Floating point intrinsics are not IPO :consistent #49353

Open · Keno opened this issue 1 year ago

Keno commented 1 year ago

Right now we consider all non-fastmath floating point intrinsics (other than muladd, which LLVM may implement as either an fma or a separate multiply and add) to be :consistent. However, as discussed in https://discourse.llvm.org/t/semantics-of-nan/66729/1, LLVM has underdefined semantics for NaN propagation. In particular, if the inputs to these intrinsics are NaN, LLVM is allowed to return any NaN input or a canonicalized NaN, but makes no semantic guarantee about which. As a result, floating point arithmetic implemented using LLVM semantics is not :consistent (even if the underlying hardware would be), and we cannot mark it as such.
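As a hedged illustration of the underdefined behavior (the exact result below is hardware- and optimizer-dependent, which is precisely the problem):

a = reinterpret(Float64, 0x7ff8000000000001)  # quiet NaN, payload 1
b = reinterpret(Float64, 0x7ff8000000000002)  # quiet NaN, payload 2
# IEEE 754 and LLVM only guarantee that `a + b` is *some* NaN; whether it
# carries a's payload, b's payload, or a canonical payload is unspecified:
bitstring(a + b)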

Unfortunately, this would likely lose us a significant fraction of the constant-propagation power that the effect system has brought us. There are a couple of things we can try to recover this:

  1. We have a provision in the effect system for non-IPO effects. These were added with the intention of being used for fastmath floating point intrinsics but apply here too. Currently they are unused, but we should be able to use them to do constant propagation in the optimizer.

  2. We could consider doing some sort of precondition inference to be able to determine that these functions are in fact consistent as long as the inputs are not NaN (which we can check before constant propagation). I don't quite know how to represent this, but it seems doable.

  3. We could consider explicitly normalizing all NaNs after every arithmetic operation (see the sketch after this list). Some CPUs (e.g. RISC-V) have these semantics anyway, so for them it would be free. For others it would potentially introduce additional instructions, but I would expect canonicalization to be reasonably optimizable, since you can generally push it to just before any memory store or other escape. Nevertheless, the potential performance implications are a bit scary.

  4. We could consider changing the definition of === on floating point numbers to implicitly canonicalize all NaNs. This would let us avoid doing the canonicalization explicitly, but it would break the invariant that isbits types are compared using exact memory comparison, which I think is too much of a trade-off to be feasible, though I wanted to list it as an option.
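A minimal sketch of what option 3's canonicalization could look like at the Julia level; the canonical bit pattern and the explicit rewrite are illustrative assumptions, not the actual implementation:

# Hypothetical: map every NaN to one fixed canonical bit pattern.
canonicalize(x::Float64) = isnan(x) ? reinterpret(Float64, 0x7ff8000000000000) : x
# Conceptually, the compiler would then rewrite each arithmetic operation, e.g.
add_canon(a::Float64, b::Float64) = canonicalize(a + b)

With this rewrite, add_canon returns the same bits for egal inputs regardless of which NaN the hardware propagates, which is exactly the property :consistent needs.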

vtjnash commented 1 year ago

Could we make them :consistent_or_nan, which has the semantics of :consistent, but if inference sees NaN (either an argument or a return value), it widens it?

mikmoore commented 1 year ago

Can someone elaborate more on the true harm of inaccurately marking these :consistent? At a superficial level, returning an undefined NaN does not seem intrinsically harmful in isolation. It's also not out of line with the current lean of our yet-unfinalized NaN policy.

For example, @noinline outline_plus(x,y) = x + y seems like it should be :consistent so long as the @noinline is respected and no constant propagation occurs (which might be guaranteed by the outlining?).

I'm imagining the issue would arise in situations like

const customNaN = reinterpret(Float64, -1)  # all-ones bit pattern: a NaN, but !== NaN
f(x) = customNaN === customNaN + x

The example function f(x) should return true, except possibly when isnan(x) && x !== customNaN. Whether it returns true for these other NaNs is a compilation and hardware detail that depends on which value is returned from customNaN + x. The linked discussion also mentions architectures that aggressively canonicalize NaNs, which might further complicate this question.

(I will ignore any surprise changes to sign bits in this paragraph, though the hardware could presumably mess with those too.) This particular function could evaluate to either customNaN === customNaN or customNaN === x on NaN-propagating hardware, depending on the specific compilation and hardware propagation behavior. Since compilation is context-dependent, we cannot know which comparison would be chosen even if we fixed the hardware. On NaN-canonicalizing hardware this would always evaluate to customNaN === CANONICAL_NAN (or CANONICAL_NAN === CANONICAL_NAN if customNaN is canonicalized on load).
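Concretely, when isnan(x), the addition may return either operand's bits on NaN-propagating hardware, so f effectively collapses to one of two functions (a sketch, ignoring sign bits as above):

# If the hardware/compiler propagates the left operand:
f_left(x) = customNaN === customNaN  # always true
# If it propagates the right operand:
f_right(x) = customNaN === x         # true only when x has customNaN's exact bits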

Is this an example where the :consistent label would lead to issues?

LilithHafner commented 1 year ago

Could we fix this by redocumenting :consistent to mean that, for egal inputs, the manner of termination will always be the same, and that if a consistent function returns, it is acceptable for the compiler to replace any value returned by the function with any other value returned by the function on the same inputs?

For example (hopefully nobody ever does this),

Base.@assume_effects :consistent f() = rand()

would be valid, and f() would have the semantics that it could return an arbitrary Float64 between 0 and 1 but must still increment the global RNG state.
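To make the consequence of that redefinition concrete, the compiler would then be free to merge the two calls below (an illustrative sketch, not current behavior):

Base.@assume_effects :consistent f() = rand()
g() = f() == f()  # under the proposed semantics the compiler could legally
                  # fold this to `true`, though two rand() calls usually differ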

Keno commented 1 year ago

Could we make them :consistent_or_nan, which has the semantics of :consistent, but if inference sees NaN (either an argument or a return value), it widens it?

No, there could be interior NaNs that make the return value non-deterministic.

Is this an example where the :consistent label would lead to issues?

Yes, pretty much.

Could we fix this by redocumenting :consistent to mean that for egal inputs the manner of termination will always be the same and if a consistent function returns then it is acceptable for the compiler to replace any value returned by the function with any other value returned by the function on the same inputs?

No, we can't do this for IPO :consistent, but this is pretty close to the non-IPO consistent (option 1).

topolarity commented 1 year ago

Can someone elaborate more on the true harm of inaccurately marking these :consistent? At a superficial level, returning an undefined NaN does not seem intrinsically harmful in isolation. It's also not out of line with the current lean of our yet-unfinalized NaN policy (https://github.com/JuliaLang/julia/issues/48523).

As far as I understand, the main problem arises when we try to do concrete eval in inference.

Roughly speaking, inference asks the question "What are all the possible outputs of this function?". The use case for concrete eval is that if the inputs are all compile-time-known and the function is fully deterministic, then this question can be answered quite easily: There is only one possible output and it's the one you get when you just run the function.
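For instance, a toy sketch of concrete eval (whether the fold actually happens is up to the compiler's heuristics):

sq(x) = x * x
caller() = sq(3.0)  # the input is a compile-time constant and sq is fully
                    # deterministic, so inference may fold the call to 9.0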

The problem for IPO :consistency here is that the LLVM optimizer can end up effectively adding non-determinism to the function, e.g. if that function observes whether the output of 0 / 0 is signed or not (which may not be preserved by LLVM optimizations and is also not specified by IEEE 754).

In that case, we told inference that a function would only evaluate to a certain result, but when the program runs, it turns out we were wrong and it evaluates to something else. If inference made important assumptions based on that (e.g. pruning dead code branches), we hit UB.
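As a hedged sketch of that failure mode (hypothetical; whether the sign bit of 0.0 / 0.0 is set is exactly the underdefined part):

bad() = signbit(0.0 / 0.0) ? error("negative NaN") : 1
# If concrete eval observes signbit(0.0 / 0.0) == false, inference may prune
# the error branch as dead. If the compiled code later produces a NaN with
# the sign bit set, execution reaches a branch inference proved unreachable,
# i.e. undefined behavior.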

(@Keno please correct me if I got any details wrong)

Keno commented 1 year ago

(@Keno please correct me if I got any details wrong)

That is correct. For posterity, here is the discussion from Slack this morning: https://gist.github.com/Keno/ec6e87cd0abffc7d776e1bdf38e0a74c