JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.9k stars 5.49k forks source link

Mutation of `Ref` embedded in Expr treated as effect-free #52843

Closed topolarity closed 10 months ago

topolarity commented 10 months ago

MWE in 1.10:

julia> macro foo()
          return :($ref1[] = 1; nothing)
       end
@foo (macro with 1 method)

julia> ref1 = Ref{Int}()
Base.RefValue{Int64}(140632197190944)

julia> bar() = (@foo; nothing)
bar (generic function with 1 method)

julia> bar()

julia> ref1
Base.RefValue{Int64}(140632197190944)

julia> @foo

julia> ref1
Base.RefValue{Int64}(1)

Correct output in 1.9:

julia> macro foo()
          return :($ref1[] = 1; nothing)
       end
@foo (macro with 1 method)

julia> ref1 = Ref{Int}()
Base.RefValue{Int64}(140223960686096)

julia> bar() = (@foo; nothing)
bar (generic function with 1 method)

julia> bar()

julia> ref1
Base.RefValue{Int64}(1)

julia> @foo

julia> ref1
Base.RefValue{Int64}(1)

Appears to be another variant of https://github.com/JuliaLang/julia/issues/52531

gbaraldi commented 10 months ago

It's quite likely we are missing a QuoteNode here

julia> @code_lowered bar()
CodeInfo(
1 ─     Base.setindex!(Base.RefValue{Int64}(0), 1)
│       Main.nothing
└──     return Main.nothing
)
vtjnash commented 10 months ago

QuoteNode is not expected in IR for self-quoting values

gbaraldi commented 10 months ago

Then this is purely an effects bug. I, and I think @vchuravy talked a bit about this in https://github.com/JuliaLang/julia/pull/52536. But basically it doesn't make sense for a caller to inherit INACCESSIBLEMEM_OR_ARGMEMONLY from a callee, if you know the arguments are mutable then we have to taint this as ALWAYS_FALSE unless EA can prove otherwise. https://github.com/JuliaLang/julia/pull/52596 tried doing this, but I think the bug is in the effects.

Specifically I think https://github.com/JuliaLang/julia/blob/0c46852901c63b33f0603b0afd58ff1da687d760/base/compiler/typeinfer.jl#L471-L475 should have an else branch where the effect is tainted. And the tfuncs should do something like this as well. What I noticed is that we never refine INACCESSIBLEMEM_OR_ARGMEMONLY to ALWAYS_FALSE, the caller then inherits it (which is weird) and if the caller has no mutable arguments we refine to ALWAYS_TRUE, which then allows CONSISTENT_IF_INACCESSIBLEMEMONLY to be refined. This makes a function that calls something that is ?c,?m to be c,m

gbaraldi commented 10 months ago

While talking with @topolarity we were thinking that our current memory effects are a bit weird, which makes analyzing them even weirder. I think they were initially sort of inspired by LLVMs semantics, and maybe we should take that inspiration again.

Currentyl we have inacessiblemem and friends to note what scope we are accessing that memory, while effect free/consistency try to figure out what kind of access we do (read/write). I believe this confluence isn't so good, and something more well defined, similar to what LLVM has, where you have an explicit scope, and an explicit kind, are probably more interesting, i.e. reads global, but writes to argmem, or combinations of this.

memory(...) This attribute specifies the possible memory effects of the call-site or function. It allows specifying the possible access kinds (none, read, write, or readwrite) for the possible memory location kinds (argmem, inaccessiblemem, as well as a default). It is best understood by example: memory(none): Does not access any memory. memory(read): May read (but not write) any memory. memory(write): May write (but not read) any memory. memory(readwrite): May read or write any memory. memory(argmem: read): May only read argument memory. memory(argmem: read, inaccessiblemem: write): May only read argument memory and only write inaccessible memory. memory(read, argmem: readwrite): May read any memory (default mode) and additionally write argument memory. memory(readwrite, argmem: none): May access any memory apart from argument memory. The supported memory location kinds are: argmem: This refers to accesses that are based on pointer arguments to the function. inaccessiblemem: This refers to accesses to memory which is not accessible by the current module (before return from the function – an allocator function may return newly accessible memory while only accessing inaccessible memory itself). Inaccessible memory is often used to model control dependencies of intrinsics. The default access kind (specified without a location prefix) applies to all locations that haven’t been specified explicitly, including those that don’t currently have a dedicated location kind (e.g. accesses to globals or captured pointers).

@Keno and @aviatesk might be interested

Keno commented 10 months ago

We have more precise memory information in the escape analysis cache. The effects are just an over approximation of this information that people added because it was quick to check.

Keno commented 10 months ago

This diff fixes the issue, but the resulting precision loss breaks a lot of tests. We'll probably have to do a compensating dataflow analysis to recover the flag in the common cases.

diff --git a/base/compiler/abstractinterpretation.jl b/base/compiler/abstractinterpretation.jl
index b7f5d636e2..98480008ad 100644
--- a/base/compiler/abstractinterpretation.jl
+++ b/base/compiler/abstractinterpretation.jl
@@ -2737,10 +2737,6 @@ function abstract_eval_statement(interp::AbstractInterpreter, @nospecialize(e),
                 effects = Effects(effects; noub=ALWAYS_TRUE)
             end
         end
-        if effects.inaccessiblememonly == INACCESSIBLEMEM_OR_ARGMEMONLY
-            # TODO: Could check if all the arguments are our direct arguments also
-            effects = Effects(effects; inaccessiblememonly=ALWAYS_FALSE)
-        end
         e = e::Expr
         @assert !isa(rt, TypeVar) "unhandled TypeVar"
         rt = maybe_singleton_const(rt)
@@ -2759,6 +2755,10 @@ function abstract_eval_statement(interp::AbstractInterpreter, @nospecialize(e),
     override = decode_statement_effects_override(sv)
     effects = override_effects(effects, override)
     set_curr_ssaflag!(sv, flags_for_effects(effects), IR_FLAGS_EFFECTS)
+    if effects.inaccessiblememonly == INACCESSIBLEMEM_OR_ARGMEMONLY
+        # TODO: Could check if all the arguments are our direct arguments also
+        effects = Effects(effects; inaccessiblememonly=ALWAYS_FALSE)
+    end
     merge_effects!(interp, sv, effects)

     return RTEffects(rt, exct, effects)
topolarity commented 10 months ago

The way that inaccessiblememonly is handled there looks correct to me though. I don't see why we need to relax it - it's just predicated on us having not discovered any access to mutable (non-argument-derived) global memory in the current function or any callees.

Would it not work to walk the IR stmts and taint inaccessiblememonly upon encountering any embedded mutable object?

aviatesk commented 10 months ago

@topolarity Right. This is another instance of #52531, and the easiest fix would be:

diff --git a/base/compiler/abstractinterpretation.jl b/base/compiler/abstractinterpretation.jl
index 3701fa8cf9..c5cb4698ca 100644
--- a/base/compiler/abstractinterpretation.jl
+++ b/base/compiler/abstractinterpretation.jl
@@ -2335,8 +2335,8 @@ function abstract_eval_special_value(interp::AbstractInterpreter, @nospecialize(
     elseif isa(e, GlobalRef)
         return abstract_eval_globalref(interp, e, sv)
     end
-
-    return RTEffects(Const(e), Union{}, EFFECTS_TOTAL)
+    return RTEffects(Const(e), Union{}, Effects(EFFECTS_TOTAL;
+        inaccessiblememonly = is_mutation_free_argtype(typeof(e)) ? ALWAYS_TRUE : ALWAYS_FALSE))
 end

 function abstract_eval_value_expr(interp::AbstractInterpreter, e::Expr, sv::AbsIntState)
aviatesk commented 10 months ago

The deviation from LLVM's :inaccessiblememonly might be making the situation more complex. The primary reason I added this effect bit was to simply enhance the precision of effect analysis, not to emulate LLVM's memory model at the Julia level. Gaining an LLVM-like memory model in Julia would be fantastic, but this should probably be done with escape analysis rather than within the existing framework of effect analysis as @Keno pointed out. The real value of our effect analysis is that it is performed along with the abstract interpretation, enabling some compiler techniques like concrete evaluation at cheap cost.

vchuravy commented 10 months ago

@vtjnash

QuoteNode is not expected in IR for self-quoting values

What is the set of self-quoting values? My fix in #52596 was based on the belief that Core.Box is not self-quoting. #52856 seems to fix the issue broadly.

@aviatesk we should probably revert #52596, except for the tests.

aviatesk commented 10 months ago

Yeah, I agree. Will make a PR.

vtjnash commented 10 months ago

All values are self-quoting except for names(Core.IR)

topolarity commented 10 months ago

All values are self-quoting except for names(Core.IR)

Does that mean we can update these functions to match? https://github.com/JuliaLang/julia/blob/9c78420b2705636fe5f3ff111950294a2addc515/base/compiler/utilities.jl#L69-L76