JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.64k stars 5.48k forks source link

Strange code generation with `unsafe_trunc` and dynamic dispatch #54264

Closed kimikage closed 5 months ago

kimikage commented 6 months ago

This is a bug that was originally found in ColorTypes.jl, the 32-bit color constructors.(cf. https://github.com/JuliaGraphics/ColorTypes.jl/issues/288) However, this does not appear to be a problem specific to ColorTypes.jl or FixedPointNumbers.jl. Also, this problem is indirectly caused by workaround for ARM's unsafe_trunc, but it also occurs on x64.

The following MWE reproduces the problem with Julia v1.6.7. It appears that the same type of problem occurs with Julia v1.10.2, although I have not succeeded in creating the MWE. (cf: https://github.com/JuliaLang/julia/issues/54264#issuecomment-2079031720)

ColorTypes v0.11.5 CI test log with julia v1.10.2 (aarch64 with QEMU) ```julia Julia Version 1.10.2 Commit bd47eca2c8a (2024-03-01 10:14 UTC) Build Info: Official https://julialang.org/ release Platform Info: OS: Linux (aarch64-linux-gnu) CPU: 4 × AMD EPYC 7763 64-Core Processor WORD_SIZE: 64 LIBM: libopenlibm LLVM: libLLVM-15.0.7 (ORCJIT, generic) Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores) Updating registry at `~/.julia/registries/General.toml` Installed FixedPointNumbers ─ v0.8.4 Updating `/home/runner/work/ColorTypes.jl/ColorTypes.jl/Project.toml` [53c48c17] + FixedPointNumbers v0.8.4 Updating `/home/runner/work/ColorTypes.jl/ColorTypes.jl/Manifest.toml` [53c48c17] + FixedPointNumbers v0.8.4 [56f22d72] + Artifacts [8f399da3] + Libdl [37e2e46d] + LinearAlgebra [9a3f8284] + Random [ea8e919c] + SHA v0.7.0 [9e88b42a] + Serialization [2f01184e] + SparseArrays v1.10.0 [10745b16] + Statistics v1.10.0 [e[66](https://github.com/JuliaGraphics/ColorTypes.jl/actions/runs/8832377192/job/24249571455#step:7:67)e0078] + CompilerSupportLibraries_jll v1.1.0+0 [4536629a] + OpenBLAS_jll v0.3.23+4 [bea87d4a] + SuiteSparse_jll v7.2.1+1 [8e850b90] + libblastrampoline_jll v5.8.0+1 Precompiling project... ✓ CompilerSupportLibraries_jll ✓ Statistics ✓ FixedPointNumbers ✓ ColorTypes 4 dependencies successfully precompiled in 51 seconds. 2 already precompiled. Testing ColorTypes Status `/tmp/jl_SZxydE/Project.toml` [3da002f7] ColorTypes v0.11.5 `/home/runner/work/ColorTypes.jl/ColorTypes.jl` [e30172f5] Documenter v1.4.0 [8dfed614] Test Status `/tmp/jl_SZxydE/Manifest.toml` [a4c015fc] ANSIColoredPrinters v0.0.1 [1520ce14] AbstractTrees v0.4.5 [944b1d66] CodecZlib v0.7.4 [3da002f7] ColorTypes v0.11.5 `/home/runner/work/ColorTypes.jl/ColorTypes.jl` [ffbed154] DocStringExtensions v0.9.3 [e30172f5] Documenter v1.4.0 [53c48c17] FixedPointNumbers v0.8.4 [d7ba0133] Git v1.3.1 [b5f81e59] IOCapture v0.2.4 [692b3bcd] JLLWrappers v1.5.0 [682c06a0] JSON v0.21.4 [0e77f7df] LazilyInitializedFields v1.2.2 [d0879d2d] MarkdownAST v0.1.2 [69de0a69] Parsers v2.8.1 [aea7be01] PrecompileTools v1.2.1 [21216c6a] Preferences v1.4.3 [2792f1a3] RegistryInstances v0.1.0 [3bb[67](https://github.com/JuliaGraphics/ColorTypes.jl/actions/runs/8832377192/job/24249571455#step:7:68)fe8] TranscodingStreams v0.10.7 [2e619515] Expat_jll v2.5.0+0 [f8c6e375] Git_jll v2.44.0+2 [94ce4f54] Libiconv_jll v1.17.0+0 [458c3c95] OpenSSL_jll v3.0.13+1 [0dad84c5] ArgTools v1.1.1 [56f22d72] Artifacts [2a0f44e3] Base64 [ade2ca70] Dates [f43a241f] Downloads v1.6.0 [7b1f6079] FileWatching [b77e0a4c] InteractiveUtils [b27032c2] LibCURL v0.6.4 [76f85450] LibGit2 [8f399da3] Libdl [37e2e46d] LinearAlgebra [56ddb016] Logging [d6f4376e] Markdown [a63ad114] Mmap [ca575930] NetworkOptions v1.2.0 [44cfe95a] Pkg v1.10.0 [de0858da] Printf [3fa0cd96] REPL [9a3f8284] Random [ea8e919c] SHA v0.7.0 [9e88b42a] Serialization [6462fe0b] Sockets [2f01184e] SparseArrays v1.10.0 [10745b16] Statistics v1.10.0 [fa267f1f] TOML v1.0.3 [a4e569a6] Tar v1.10.0 [8dfed614] Test [cf7118a7] UUIDs [4ec0a83e] Unicode [e66e0078] CompilerSupportLibraries_jll v1.1.0+0 [deac9b47] LibCURL_jll v8.4.0+0 [e37daf67] LibGit2_jll v1.6.4+0 [29816b5a] LibSSH2_jll v1.11.0+1 [c8ffd9c3] MbedTLS_jll v2.28.2+1 [14a3606d] MozillaCACerts_jll v2023.1.10 [4536629a] OpenBLAS_jll v0.3.23+4 [efcefdf7] PCRE2_jll v10.42.0+1 [bea87d4a] SuiteSparse_jll v7.2.1+1 [83775a58] Zlib_jll v1.2.13+1 [8e850b90] libblastrampoline_jll v5.8.0+1 [8e850ede] nghttp2_jll v1.52.0+1 [3f19e933] p7zip_jll v17.4.0+2 Precompiling project... ✓ CompilerSupportLibraries_jll ✓ LazilyInitializedFields ✓ ANSIColoredPrinters ✓ TranscodingStreams ✓ DocStringExtensions ✓ IOCapture ✓ Preferences ✓ AbstractTrees ✓ PrecompileTools ✓ RegistryInstances ✓ Statistics ✓ TranscodingStreams → TestExt ✓ JLLWrappers ✓ MarkdownAST ✓ CodecZlib ✓ OpenSSL_jll ✓ Libiconv_jll ✓ Expat_jll ✓ Git_jll ✓ FixedPointNumbers ✓ Git ✓ ColorTypes ✓ Parsers ✓ JSON ✓ Documenter 25 dependencies successfully precompiled in 192 seconds. 5 already precompiled. Testing Running tests... Skipping Base.active_repl Skipping Base.active_repl_backend ┌ Warning: Unable to determine HTML(edit_link = ...) from remote HEAD branch, defaulting to "master". │ Calling `git remote` failed with an exception. Set JULIA_DEBUG=Documenter to see the error. │ Unless this is due to a configuration error, the relevant variable should be set explicitly. └ @ Documenter ~/.julia/packages/Documenter/pA5Sa/src/utilities/utilities.jl:640 [ Info: SetupBuildDirectory: setting up build directory. [ Info: Doctest: running doctests. [ Info: Skipped ExpandTemplates step (doctest only). [ Info: Skipped CrossReferences step (doctest only). [ Info: Skipped CheckDocument step (doctest only). [ Info: Skipped Populate step (doctest only). [ Info: Skipped RenderDocument step (doctest only). Test Summary: | Pass Total Time Doctests: ColorTypes | 1 1 1m27.2s Test Summary: | Pass Total Time error_hints | 27 27 15.5s Test Summary: | Pass Broken Total Time conversions | 1307 16 1323 2m38.7s Test Summary: | Pass Broken Total Time operations | 755 14 769 1m46.0s Test Summary: | Pass Broken Total Time show | 45 4 49 8.7s ┌ Warning: one(Gray{Float32}) will soon switch to returning 1; you might need to switch to `oneunit` │ caller = macro expansion at Test.jl:669 [inlined] └ @ Core /home/runner/work/julia/share/julia/stdlib/v1.10/Test/src/Test.jl:669 Test Summary: | Pass Broken Total Time traits | 537 7 544 31.2s transparent gray constructors: Test Failed at /home/runner/work/ColorTypes.jl/ColorTypes.jl/test/types.jl:2[68](https://github.com/JuliaGraphics/ColorTypes.jl/actions/runs/8832377192/job/24249571455#step:7:69) Expression: AGray32(val, 0.8) === AGray32(0.2, 0.8) Evaluated: AGray32(0.2N0f8,0.0N0f8) === AGray32(0.2N0f8,0.8N0f8) Stacktrace: [1] macro expansion @ /home/runner/work/julia/share/julia/stdlib/v1.10/Test/src/Test.jl:6[72](https://github.com/JuliaGraphics/ColorTypes.jl/actions/runs/8832377192/job/24249571455#step:7:73) [inlined] [2] macro expansion @ /home/runner/work/ColorTypes.jl/ColorTypes.jl/test/types.jl:268 [inlined] [3] macro expansion @ /home/runner/work/julia/share/julia/stdlib/v1.10/Test/src/Test.jl:15[77](https://github.com/JuliaGraphics/ColorTypes.jl/actions/runs/8832377192/job/24249571455#step:7:78) [inlined] [4] top-level scope @ /home/runner/work/ColorTypes.jl/ColorTypes.jl/test/types.jl:261 ```

So far, this problem has not been observed on the nightly build (v1.12.0-DEV).

"""
`unsafe_trunc` of ARM returns zero with an unsigned integer type and a negative floating-point number.
So there is a workaround via a signed integer.
"""
function n0f8(x)
    unsafe_trunc(UInt8, unsafe_trunc(Int8, x * 255.0f0))
    # or
    # reinterpret(UInt8, unsafe_trunc(Int8, x * 255.0f0))
end

n24f8(::Val{:A}, x) = UInt32(n0f8(x)) # Adding @noinline seems to avoid this problem.
n24f8(::Val{:B}, x) = UInt32(n0f8(x)) # Of course, that is not desired in this case.

function print_n24f8(values...)
    for v in values
        println(n24f8(v, 0.8))
    end
end

print_n24f8(Val(:A), Val(:A))
print_n24f8(Val(:A), Val(:B)) # wrong
print_n24f8(Val(:B), Val(:A)) # wrong
print_n24f8(Val(:B), Val(:B))
julia> print_n24f8(Val(:A), Val(:A))
204
204

julia> print_n24f8(Val(:A), Val(:B)) # wrong
204
0

julia> print_n24f8(Val(:B), Val(:A)) # wrong
204
0

julia> print_n24f8(Val(:B), Val(:B))
204
204

julia> versioninfo()
Julia Version 1.6.7
Commit 3b76b25b64 (2022-07-19 15:11 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, tigerlake)

I have not been able to identify the cause, so the issue title may be misleading. Note that I suspect this is a constant propagation problem, but I am not certain. I think the behavior of unsafe_trunc differs between the runtime native code and the constant propagation. Edit: Of course, if you specify a value such as 0.4 instead of 0.8, it is definitely safe to truncate.

KristofferC commented 6 months ago

Without doing any analyzing it seems that the results are consistent on 1.7 at least.

kimikage commented 6 months ago

However, as mentioned above, this type of problem seems to be occurring even on v1.10.2. This is just my MWE being poor. I am rather interested in the issues on v1.10.

kimikage commented 6 months ago

Although the following depends on ColorTypes and FixedPointNumbers, it reproduces the problem on Julia v1.10.2. (This is close to the actual code implemented in ColorTypes)

using ColorTypes # v0.11.5
using FixedPointNumbers # v0.8.4

function n0f8(x)
    reinterpret(N0f8, unsafe_trunc(UInt8, unsafe_trunc(Int8, x * 255.0f0)))
end

function agray32(val::Real, alpha=1)
    AGray32(n0f8(val), n0f8(alpha))
end
function agray32(g::Gray24, alpha=1)
    reinterpret(AGray32, UInt32(reinterpret(n0f8(alpha))) << 24 | g.color)
end
agray32(g::AbstractGray, alpha=1) = agray32(gray(g), alpha)

# The length of the Tuple (or the variety of types) affects the results.
for v in (N0f8(0.2), N0f16(0.2), Gray{N0f8}(0.2), Gray24(0.2))
    println(agray32(v, 0.8))
end
julia> for v in (N0f8(0.2), N0f16(0.2), Gray{N0f8}(0.2), Gray24(0.2))
           println(agray32(v, 0.8))
       end
AGray32(0.2N0f8,0.8N0f8)
AGray32(0.2N0f8,0.8N0f8)
AGray32(0.2N0f8,0.8N0f8)
AGray32(0.2N0f8,0.0N0f8)

julia> agray32(Gray24(0.2), 0.8)
AGray32(0.2N0f8,0.8N0f8)

julia> versioninfo()
Julia Version 1.10.2
Commit bd47eca2c8 (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, tigerlake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)

Not tested for v1.10.3.

kimikage commented 6 months ago

v1.12.0-DEV is fine or robust enough.

julia> for v in (N0f8(0.2), N0f16(0.2), Gray{N0f8}(0.2), Gray24(0.2))
           println(agray32(v, 0.8))
       end
AGray32(0.2N0f8, 0.8N0f8)
AGray32(0.2N0f8, 0.8N0f8)
AGray32(0.2N0f8, 0.8N0f8)
AGray32(0.2N0f8, 0.8N0f8)

julia> versioninfo()
Julia Version 1.12.0-DEV.391
Commit 715c476d6a (2024-04-23 08:12 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, tigerlake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
vtjnash commented 6 months ago

@keno do you want to add this function to your UB document, or do we need to remove it from the consistent list for functions? (I am strongly inclined to remove all float intrinsics from that list indeed)

gbaraldi commented 6 months ago

What specifically is the issue here?

kimikage commented 6 months ago

I recognize that this is more a problem of unintuitive behavior as a result of unsafe_trunc's ~UB~ roughly documented but undesirable behavior, than the fact that unsafe_trunc would return an arbitrary value itself. No demons came out of my nose, but a loud sigh came out of my mouth. I think that is worth documenting as an example of the behavior.

kimikage commented 6 months ago

I think there should be more supplemental documentation for unsafe_trunc apart from the UB documentation (https://github.com/JuliaLang/julia/pull/54099). "Examples 2." in the following unofficial document that comes up in a web search is wrong both in terms of formal and practical sense. https://www.jlhub.com/julia/manual/en/function/unsafe_trunc

Edit: xref: #54297

kimikage commented 6 months ago

I removed reinterpret from the title as it is probably not the direct cause of the problem. However, since this seems to have to do with optimization, even seemingly unrelated pieces such as conversion to UInt32 can affect the result.

Keno commented 6 months ago

@Keno do you want to add this function to your UB document, or do we need to remove it from the consistent list for functions? (I am strongly inclined to remove all float intrinsics from that list indeed)

Needs to be removed from the consistent list along with the other floating point intrinsics (#49353).

kimikage commented 5 months ago

Would you close this as a common issue regarding ~UB~ roughly documented but undesirable behavior? (We should separate the documentation issue into another issue or PR. Edit: #54297)

I intend to fix (mitigate) the FixedPointNumbers bug in the near future by means of reducing the likelihood of encountering ~UB~ roughly documented but undesirable behavior.

Or do we analyze this strange behavior and work to reduce the risk of potential bugs emerging? I think this is more generally a matter of whether to be aggressive or conservative in optimization when encountering ~UB~ roughly documented but undesirable behavior. To be more specific, I think this is a matter of being consistent in that policy.

Keno commented 5 months ago

unsafe_trunc is not UB. In the original example, this is working as documented - with an out of range value you can an arbitrary garbage value (but not UB, which would e.g. allow this to segfault).

kimikage commented 5 months ago

Yes, I think this is a problem with "poison value" in the context of LLVM. However, I don't think the term "poison value" is defined in Julia documentation.

Keno commented 5 months ago

No, it's no a poison value, we freeze it. It's just what the documentation says. An arbitrary (possibly different on every invocation) value of the specified type.

kimikage commented 5 months ago

I think you are definitively right. However, since we do not have the official definitions of such terms, we should not use them self-referentially in discussions here. Maybe we need to define macro so_called_str.

kimikage commented 5 months ago

Just to confirm, there is no motivation at this time to specialize for small types like Int8 or UInt8 to mitigate this problem, like the Float16 specialization?

This is not a unsafe_trunc bug, so I agree to do it or not. However, IMO, it is better not to do ad-hoc specialization because it may result in different behavior in different julia versions, and reduced maintainability in packages.

kimikage commented 5 months ago

This is not directly related to UB. Also, I have submitted the PR for the docstring and the proposal is a general fact that is not dependent on this issue.