JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Concatenation of string constants isn't concretely evaluated, but is inferred as a `Const` #54921

Open Seelengrab opened 3 weeks ago

Seelengrab commented 3 weeks ago

Consider this silly little function:

f() = "foo" * "bar"

which concatenates two string constants, "foo" and "bar", into one bigger constant "foobar". The compiler agrees, and according to Cthulhu, infers that this is a constant:

julia> f() = "foo" * "bar"
f (generic function with 1 method)

julia> @descend f()
f() @ Main REPL[4]:1
1 f()::Core.Const("foobar") = "foo" * "bar"::String
[...]

However, even though it's inferred as a constant (and thus "foobar" is already allocated somewhere!), the compiler doesn't actually concretely evaluate this:

julia> @code_llvm f()
; Function Signature: f()
;  @ REPL[4]:1 within `f`
define nonnull ptr @julia_f_4243() #0 {
top:
  %0 = call nonnull ptr @"j_*_4246"(ptr nonnull @"jl_global#4247.jit", ptr nonnull @"jl_global#4248.jit")
  ret ptr %0
}

julia> @code_native dump_module=false f()
    .text
; ┌ @ REPL[4]:1 within `f`
    push    rbp
    mov rbp, rsp
    mov rax, qword ptr [r13 + 16]
    movabs  rdi, 139273527936000
    lea rsi, [rdi + 56]
    mov rax, qword ptr [rax + 16]
    mov rax, qword ptr [rax]
    movabs  rax, offset "*"
    call    rax
    pop rbp
    ret
; └
; ┌ @ REPL[4]:1 within `<invalid>`
    nop dword ptr [rax + rax]
; └

Instead, it prefers to call `*` at runtime.

Effects look wonderful too:

julia> Base.infer_effects(f)
(+c,+e,+n,+t,+s,+m,+u)

So why doesn't the compiler fold this away, given that the result was inferred as Core.Const and was presumably allocated during inference?

I was able to reproduce this on 1.11 & master.
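
For anyone who wants to check this without Cthulhu, here is a minimal sketch that uses only Base reflection. The name `concat_literals` is made up for illustration (it stands in for `f` above). Whether the allocation count comes out nonzero depends on the Julia version; the LLVM output above suggests it will on 1.11 and master.

```julia
# Hypothetical stand-in for `f` from the report above.
concat_literals() = "foo" * "bar"

# Effects inferred for the zero-argument call.
effects = Base.infer_effects(concat_literals, ())

# Return type after widening; this is the plain type, not the
# `Core.Const` lattice element that Cthulhu displays.
rettype = only(Base.return_types(concat_literals, ()))

# Warm up first so compilation is not measured, then count the
# bytes allocated by a single call.
concat_literals()
bytes = @allocated concat_literals()

println("effects:     ", effects)
println("return type: ", rettype)
println("allocated:   ", bytes, " bytes per call")
```

If the call had been folded into a plain return of a preallocated constant, I'd expect the last line to report zero bytes.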


I've labelled this with compiler:optimizer, but I'm not sure this fits there exactly. Feel free to relabel as appropriate.

gbaraldi commented 3 weeks ago

Cthulhu disagrees with code_typed, and

Base.infer_effects(*, (String, String))
(+c,+e,+n,+t,!s,!m,+u,+o)

Seelengrab commented 3 weeks ago

Which version are you seeing that on? On 1.11-alpha2 I see this:

image

Which makes sense, since the arguments are Const and thus are inaccessible. It's just weird that this doesn't end up as a simple return of a constant on the LLVM level, since the concatenation was already computed for the Core.Const("foobar") :thinking:

Keno commented 3 weeks ago

Strings are explicitly excluded from being able to be inlined constants:

https://github.com/JuliaLang/julia/blob/5654e6043823717e085239f6509413410106e902/base/compiler/utilities.jl#L88-L89

This could possibly be changed.
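
To illustrate the kind of check involved, here is a toy model. It is not the code at the link above: `toy_is_inlineable` and `TOY_MAX_INLINE_CONST_SIZE` are hypothetical names, the threshold is arbitrary, and the real logic handles many more cases. It only shows the shape of a size-based rule that rejects strings outright.

```julia
# Toy model only: NOT the code in base/compiler/utilities.jl.
# The names and the threshold value are made up for illustration.
const TOY_MAX_INLINE_CONST_SIZE = 256

function toy_is_inlineable(@nospecialize(x))
    # A String's size is not determined by its type, so reject it
    # outright, no matter how short it is.
    x isa String && return false
    # Plain-bits values have a size known from the type alone.
    isbits(x) && return sizeof(x) <= TOY_MAX_INLINE_CONST_SIZE
    # Everything else is conservatively rejected in this sketch.
    return false
end

println(toy_is_inlineable(42))        # small isbits value: accepted
println(toy_is_inlineable("foobar"))  # String: rejected regardless of length
```

Under a rule of this shape, a six-byte string is treated the same as a multi-megabyte one.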

Seelengrab commented 3 weeks ago

That's good to know! The comment above that ("No definite size") indicates that this is primarily due to size concerns, but in practice, aren't most string constants that would be const-eval eligible going to be small?

Thinking further, in a static compilation setting it's common to pool global string constants/literals together for deduplication purposes. I'm not sure how feasible that is with our String, which IIRC guarantees storing a NUL at the end. Maybe that's a possible avenue for optimizations?
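
To make the pooling idea concrete, here is a rough runtime sketch. A static compiler would do this at build time rather than with a `Dict`, so this only demonstrates the deduplication itself. `intern!` and `pool` are hypothetical names, not an existing Base API.

```julia
# Hypothetical helper: return the pooled copy of `s`, storing `s`
# itself the first time a string with this content is seen.
intern!(pool::Dict{String,String}, s::String) = get!(pool, s, s)

pool = Dict{String,String}()

a = intern!(pool, "foo" * "bar")  # concatenated at runtime, then pooled
b = intern!(pool, "foobar")       # same content: the pooled copy comes back

println(length(pool))              # number of distinct strings stored
println(pointer(a) == pointer(b))  # whether both names share one buffer
```

Pooling whole strings like this doesn't conflict with a trailing NUL. Sharing a buffer between a string and a proper prefix of it would, since the prefix's terminator would fall in the middle of the longer string.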