JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

v1.7.1 on Power9 nodes -- Broken iterator behavior #43803

Open xorJane opened 2 years ago

xorJane commented 2 years ago

Hello!

I am seeing test failures (from Base.runtests()) for Julia v1.7.1 on IBM Power9 nodes that seem to be related to iterators/generators and even parity.

In particular, I see the following unexpected behavior:

julia> [x for x in 1:10 if iseven(x)]
Int64[]

julia> collect(x for x in 1:10 if x % 2 == 0)
Int64[]

Things work as expected, however, if I check for odd parity or iterate over an array instead of a range:

julia> [x for x in 1:10 if isodd(x)]
5-element Vector{Int64}:
 1
 3
 5
 7
 9

julia> collect(x for x in 1:10 if x % 2 == 1)
5-element Vector{Int64}:
 1
 3
 5
 7
 9

julia> [x for x in [1, 2, 3, 4] if iseven(x)]
2-element Vector{Int64}:
 2
 4
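
A quick way to check a given build for this behavior is to compare the filtered comprehension and generator against an explicit loop. This is only a sketch: the helper name `evens_by_loop` is made up for illustration, and it assumes the explicit loop is not hit by the same problem, which nothing in this report confirms.

```julia
# Reference result built with an explicit loop, so that no filtered
# generator (Iterators.Filter) is involved.
# `evens_by_loop` is a hypothetical helper, not part of Base.
function evens_by_loop(r)
    out = Int[]
    for x in r
        if iseven(x)
            push!(out, x)
        end
    end
    return out
end

expected = evens_by_loop(1:10)
from_comprehension = [x for x in 1:10 if iseven(x)]
from_generator = collect(x for x in 1:10 if x % 2 == 0)

# On a build without the problem, both lines print `true`.
println("comprehension matches loop: ", from_comprehension == expected)
println("generator matches loop:     ", from_generator == expected)
```

On the affected Power9 build described here, both comparisons would be expected to print `false`, since the filtered forms come back empty.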

My system info is

julia> versioninfo()
Julia Version 1.7.1
Commit ac5cc99908 (2021-12-22 19:35 UTC)
Platform Info:
  OS: Linux (ppc64le-redhat-linux)
  CPU: POWER9, altivec supported
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, pwr9)

I am performing a standard build by cloning the Julia repo, checking out v1.7.1, and running make -j. @vchuravy recommended a Make.user file populated with USE_BINARYBUILDER_LLVM=0 for Power nodes a few Julia versions ago. I see the same behavior above after building both with and without this file.

Note that on Power9 nodes, I see the behavior described above on v1.7.0 as well, but not on v1.6.4.

Please let me know if you have recommendations for how to fix this behavior!

Thank you, Jane

ViralBShah commented 2 years ago

Hi @xorJane - Do you have a way to report this to the IBM Power team? Since @vchuravy has added the upstream label, I suspect this may be an LLVM-on-Power issue.

vchuravy commented 2 years ago

@nemanjai this looks similar to the bug I mentioned in our last meeting, but I haven't found the time to reduce that to a nicer bug report.

xorJane commented 2 years ago

Thanks for the feedback! I reported to IBM on my side and will let you know what I learn.

nemanjai commented 2 years ago

If this is an LLVM bug, I am certainly very happy to look at it. Would you be able to provide a reproducer in LLVM IR?

vchuravy commented 2 years ago

@nemanjai I can confirm that this got fixed in Julia 1.8 by upgrading to LLVM 13.

For example, 4c45f292a0e688f7b32b1116f9feefc95f93f14d is broken and 4ebca2ffdfa8ad808a4395612383b3bf23ab3cc1 is fixed. If we can identify the patch, I can backport it to 1.7.

vchuravy commented 2 years ago

julia> g() = iterate(x for x in 1:10 if iseven(x))
g (generic function with 1 method)

julia> iterate(x for x in 1:10 if iseven(x))

julia> g()
(2, 2)

Those should be equivalent.
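
For context, `iterate(itr)` returns `nothing` when the iterator is exhausted and a `(value, state)` tuple otherwise, so the first call on this generator should return `(2, 2)` in both cases. The loop below is a small sketch of that protocol written out by hand; the variable names are illustrative only.

```julia
# Drive the iteration protocol manually for the filtered generator.
itr = (x for x in 1:10 if iseven(x))

vals = Int[]
y = iterate(itr)             # first call: no state argument
while y !== nothing          # `nothing` signals that the iterator is exhausted
    val, state = y
    push!(vals, val)
    y = iterate(itr, state)  # later calls pass the previous state
end

# On a build without the problem this prints [2, 4, 6, 8, 10].
println(vals)
```

If the very first `iterate(itr)` already returns `nothing`, as in the top-level call above, the loop body never runs and the collected result is empty, which matches the `Int64[]` seen in the original report.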

julia> function test(dest, itr)
           y = iterate(itr)
           y === nothing && return dest # replace with return false and behaviour goes away
           return true
       end
test (generic function with 2 methods)

julia> test(Int[], x for x in 1:10 if iseven(x))
Int64[]

vchuravy commented 2 years ago

@nemanjai

julia> @code_llvm dump_module=true raw=true test(Int[], x for x in 1:10 if iseven(x))
; ModuleID = 'test'
source_filename = "test"
target datalayout = "e-m:e-i64:64-n32:64-v256:256:256-v512:512:512"
target triple = "powerpc64le-unknown-linux-gnu"

;  @ REPL[47]:1 within `test`
define { {} addrspace(10)*, i8 } @julia_test_2999([1 x i8]* noalias nocapture align 1 dereferenceable(1) %0, {} addrspace(10)* nonnull align 16 dereferenceable(40) %1, { { [2 x i64] } } addrspace(11)* nocapture nonnull readonly align 8 dereferenceable(16) %2) #0 !dbg !5 {
top:
;  @ REPL[47]:2 within `test`
; ┌ @ generator.jl:44 within `iterate` @ iterators.jl:470 @ range.jl:880
; │┌ @ range.jl:655 within `isempty`
; ││┌ @ range.jl:817 within `first`
; │││┌ @ Base.jl:38 within `getproperty`
      %3 = getelementptr inbounds { { [2 x i64] } }, { { [2 x i64] } } addrspace(11)* %2, i64 0, i32 0, i32 0, i64 0, !dbg !7
; ││└└
; ││┌ @ range.jl:822 within `last`
; │││┌ @ Base.jl:38 within `getproperty`
      %4 = getelementptr inbounds { { [2 x i64] } }, { { [2 x i64] } } addrspace(11)* %2, i64 0, i32 0, i32 0, i64 1, !dbg !24
; ││└└
; ││┌ @ operators.jl:378 within `>`
; │││┌ @ int.jl:83 within `<`
      %5 = load i64, i64 addrspace(11)* %4, align 8, !dbg !27, !tbaa !33
      %6 = load i64, i64 addrspace(11)* %3, align 8, !dbg !27, !tbaa !33
      %.not = icmp slt i64 %5, %6, !dbg !27
; │└└└
; │ @ generator.jl:44 within `iterate` @ iterators.jl:471
   br i1 %.not, label %nonnull, label %oksrem, !dbg !37

L26:                                              ; preds = %oksrem
; │ @ generator.jl:44 within `iterate` @ iterators.jl:475 @ range.jl:884
; │┌ @ promotion.jl:473 within `==`
    %7 = icmp eq i64 %value_phi515, %5, !dbg !38
; │└
   %8 = add i64 %value_phi515, 1, !dbg !41
; │ @ generator.jl:44 within `iterate` @ iterators.jl:471
   br i1 %7, label %nonnull, label %oksrem, !dbg !37

L49:                                              ; preds = %oksrem
; └
;  @ REPL[47]:4 within `test`
  ret { {} addrspace(10)*, i8 } { {} addrspace(10)* addrspacecast ({}* inttoptr (i64 140735093367936 to {}*) to {} addrspace(10)*), i8 -127 }, !dbg !43

oksrem:                                           ; preds = %top, %L26
  %value_phi515 = phi i64 [ %8, %L26 ], [ %6, %top ]
;  @ REPL[47]:2 within `test`
; ┌ @ generator.jl:44 within `iterate` @ iterators.jl:472
; │┌ @ none within `#60`
; ││┌ @ int.jl:133 within `iseven`
; │││┌ @ number.jl:42 within `iszero`
; ││││┌ @ promotion.jl:473 within `==`
       %9 = and i64 %value_phi515, 1, !dbg !44
       %.not14 = icmp eq i64 %9, 0, !dbg !44
; │└└└└
   br i1 %.not14, label %L49, label %L26, !dbg !53

nonnull:                                          ; preds = %L26, %top
; └
;  @ REPL[47]:3 within `test`
  %10 = bitcast {} addrspace(10)* %1 to i64 addrspace(10)*, !dbg !54
  %11 = getelementptr inbounds i64, i64 addrspace(10)* %10, i64 -1, !dbg !54
  %12 = load atomic i64, i64 addrspace(10)* %11 unordered, align 8, !dbg !54, !tbaa !55, !range !58
  %13 = and i64 %12, -16, !dbg !54
  %14 = inttoptr i64 %13 to {}*, !dbg !54
  %15 = addrspacecast {}* %14 to {} addrspace(10)*, !dbg !54
  %phi.cmp = icmp eq {} addrspace(10)* %15, addrspacecast ({}* inttoptr (i64 140735091669296 to {}*) to {} addrspace(10)*), !dbg !54
  %16 = zext i1 %phi.cmp to i8, !dbg !54
  %17 = or i8 %16, -128, !dbg !54
  %18 = insertvalue { {} addrspace(10)*, i8 } undef, {} addrspace(10)* %1, 0, !dbg !54
  %19 = insertvalue { {} addrspace(10)*, i8 } %18, i8 %17, 1, !dbg !54
  ret { {} addrspace(10)*, i8 } %19, !dbg !54
}

define nonnull {} addrspace(10)* @jfptr_test_3000({} addrspace(10)* %0, {} addrspace(10)** %1, i32 %2) #1 {
top:
  %gcframe2 = alloca [3 x {} addrspace(10)*], align 16
  %gcframe2.sub = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %gcframe2, i64 0, i64 0
  %3 = bitcast [3 x {} addrspace(10)*]* %gcframe2 to i8*
  call void @llvm.memset.p0i8.i32(i8* nonnull align 16 dereferenceable(24) %3, i8 0, i32 24, i1 false), !tbaa !59
  %4 = call {}*** inttoptr (i64 268437872 to {}*** ()*)() #7
  %5 = bitcast [3 x {} addrspace(10)*]* %gcframe2 to i64*
  store i64 4, i64* %5, align 16, !tbaa !59
  %6 = load {}**, {}*** %4, align 8
  %7 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %gcframe2, i64 0, i64 1
  %8 = bitcast {} addrspace(10)** %7 to {}***
  store {}** %6, {}*** %8, align 8, !tbaa !59
  %9 = bitcast {}*** %4 to {} addrspace(10)***
  store {} addrspace(10)** %gcframe2.sub, {} addrspace(10)*** %9, align 8
  %10 = alloca [1 x i8], align 1
  %11 = load {} addrspace(10)*, {} addrspace(10)** %1, align 8, !nonnull !4, !dereferenceable !61, !align !62
  %12 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)** %1, i64 1
  %13 = bitcast {} addrspace(10)** %12 to { { [2 x i64] } } addrspace(10)**
  %14 = load { { [2 x i64] } } addrspace(10)*, { { [2 x i64] } } addrspace(10)** %13, align 8, !nonnull !4, !dereferenceable !62, !align !63
  %15 = addrspacecast { { [2 x i64] } } addrspace(10)* %14 to { { [2 x i64] } } addrspace(11)*
  %16 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %gcframe2, i64 0, i64 2
  %17 = bitcast {} addrspace(10)** %16 to { { [2 x i64] } } addrspace(10)**
  store { { [2 x i64] } } addrspace(10)* %14, { { [2 x i64] } } addrspace(10)** %17, align 16
  %18 = call { {} addrspace(10)*, i8 } @julia_test_2999([1 x i8]* noalias nocapture nonnull %10, {} addrspace(10)* %11, { { [2 x i64] } } addrspace(11)* nocapture readonly %15) #0
  %19 = extractvalue { {} addrspace(10)*, i8 } %18, 1
  %20 = extractvalue { {} addrspace(10)*, i8 } %18, 0
  %cond = icmp eq i8 %19, 1
  %21 = getelementptr inbounds [1 x i8], [1 x i8]* %10, i64 0, i64 0
  %22 = load i8, i8* %21, align 1
  %23 = and i8 %22, 1
  %.not = icmp eq i8 %23, 0
  %24 = select i1 %.not, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 140735092919696 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 140735093367936 to {}*) to {} addrspace(10)*)
  %25 = select i1 %cond, {} addrspace(10)* %24, {} addrspace(10)* %20
  %26 = load {} addrspace(10)*, {} addrspace(10)** %7, align 8, !tbaa !59
  %27 = bitcast {}*** %4 to {} addrspace(10)**
  store {} addrspace(10)* %26, {} addrspace(10)** %27, align 8, !tbaa !59
  ret {} addrspace(10)* %25
}

; Function Attrs: noreturn
declare void @ijl_throw({} addrspace(12)*) #2

; Function Attrs: norecurse nounwind readnone
declare nonnull {} addrspace(10)* @julia.typeof({} addrspace(10)*) #3

; Function Attrs: inaccessiblemem_or_argmemonly
declare void @ijl_gc_queue_root({} addrspace(10)*) #4

; Function Attrs: allocsize(1)
declare noalias nonnull {} addrspace(10)* @ijl_gc_pool_alloc(i8*, i32, i32) #5

; Function Attrs: allocsize(1)
declare noalias nonnull {} addrspace(10)* @ijl_gc_big_alloc(i8*, i64) #5

declare noalias nonnull {} addrspace(10)** @julia.new_gc_frame(i32)

declare void @julia.push_gc_frame({} addrspace(10)**, i32)

declare {} addrspace(10)** @julia.get_gc_frame_slot({} addrspace(10)**, i32)

declare void @julia.pop_gc_frame({} addrspace(10)**)

; Function Attrs: argmemonly nofree nosync nounwind willreturn writeonly
declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1 immarg) #6

attributes #0 = { "probe-stack"="inline-asm" }
attributes #1 = { "probe-stack"="inline-asm" "thunk" }
attributes #2 = { noreturn }
attributes #3 = { norecurse nounwind readnone }
attributes #4 = { inaccessiblemem_or_argmemonly }
attributes #5 = { allocsize(1) }
attributes #6 = { argmemonly nofree nosync nounwind willreturn writeonly }
attributes #7 = { nounwind readnone }

!llvm.module.flags = !{!0, !1}
!llvm.dbg.cu = !{!2}

!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 2, !"Debug Info Version", i32 3}
!2 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !3, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: GNU)
!3 = !DIFile(filename: "REPL[47]", directory: ".")
!4 = !{}
!5 = distinct !DISubprogram(name: "test", linkageName: "julia_test_2999", scope: null, file: !3, line: 1, type: !6, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!6 = !DISubroutineType(types: !4)
!7 = !DILocation(line: 38, scope: !8, inlinedAt: !10)
!8 = distinct !DISubprogram(name: "getproperty;", linkageName: "getproperty", scope: !9, file: !9, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!9 = !DIFile(filename: "Base.jl", directory: ".")
!10 = !DILocation(line: 817, scope: !11, inlinedAt: !13)
!11 = distinct !DISubprogram(name: "first;", linkageName: "first", scope: !12, file: !12, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!12 = !DIFile(filename: "range.jl", directory: ".")
!13 = !DILocation(line: 655, scope: !14, inlinedAt: !15)
!14 = distinct !DISubprogram(name: "isempty;", linkageName: "isempty", scope: !12, file: !12, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!15 = !DILocation(line: 880, scope: !16, inlinedAt: !17)
!16 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !12, file: !12, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!17 = !DILocation(line: 470, scope: !18, inlinedAt: !20)
!18 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !19, file: !19, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!19 = !DIFile(filename: "iterators.jl", directory: ".")
!20 = !DILocation(line: 44, scope: !21, inlinedAt: !23)
!21 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !22, file: !22, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!22 = !DIFile(filename: "generator.jl", directory: ".")
!23 = !DILocation(line: 2, scope: !5)
!24 = !DILocation(line: 38, scope: !8, inlinedAt: !25)
!25 = !DILocation(line: 822, scope: !26, inlinedAt: !13)
!26 = distinct !DISubprogram(name: "last;", linkageName: "last", scope: !12, file: !12, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!27 = !DILocation(line: 83, scope: !28, inlinedAt: !30)
!28 = distinct !DISubprogram(name: "<;", linkageName: "<", scope: !29, file: !29, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!29 = !DIFile(filename: "int.jl", directory: ".")
!30 = !DILocation(line: 378, scope: !31, inlinedAt: !13)
!31 = distinct !DISubprogram(name: ">;", linkageName: ">", scope: !32, file: !32, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!32 = !DIFile(filename: "operators.jl", directory: ".")
!33 = !{!34, !34, i64 0}
!34 = !{!"jtbaa_const", !35, i64 0}
!35 = !{!"jtbaa", !36, i64 0}
!36 = !{!"jtbaa"}
!37 = !DILocation(line: 471, scope: !18, inlinedAt: !20)
!38 = !DILocation(line: 473, scope: !39, inlinedAt: !41)
!39 = distinct !DISubprogram(name: "==;", linkageName: "==", scope: !40, file: !40, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!40 = !DIFile(filename: "promotion.jl", directory: ".")
!41 = !DILocation(line: 884, scope: !16, inlinedAt: !42)
!42 = !DILocation(line: 475, scope: !18, inlinedAt: !20)
!43 = !DILocation(line: 4, scope: !5)
!44 = !DILocation(line: 473, scope: !39, inlinedAt: !45)
!45 = !DILocation(line: 42, scope: !46, inlinedAt: !48)
!46 = distinct !DISubprogram(name: "iszero;", linkageName: "iszero", scope: !47, file: !47, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!47 = !DIFile(filename: "number.jl", directory: ".")
!48 = !DILocation(line: 133, scope: !49, inlinedAt: !50)
!49 = distinct !DISubprogram(name: "iseven;", linkageName: "iseven", scope: !29, file: !29, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!50 = !DILocation(line: 0, scope: !51, inlinedAt: !53)
!51 = distinct !DISubprogram(name: "#60;", linkageName: "#60", scope: !52, file: !52, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!52 = !DIFile(filename: "none", directory: ".")
!53 = !DILocation(line: 472, scope: !18, inlinedAt: !20)
!54 = !DILocation(line: 3, scope: !5)
!55 = !{!56, !56, i64 0}
!56 = !{!"jtbaa_tag", !57, i64 0}
!57 = !{!"jtbaa_data", !35, i64 0}
!58 = !{i64 4096, i64 0}
!59 = !{!60, !60, i64 0}
!60 = !{!"jtbaa_gcframe", !35, i64 0}
!61 = !{i64 40}
!62 = !{i64 16}
!63 = !{i64 8}

vchuravy commented 2 years ago

Ok simpler:

julia> function test(dest, itr)
           y = iterate(itr)
           y === nothing && return dest
           return false
       end
test (generic function with 2 methods)

julia> test(true, x for x in 1:10 if iseven(x))
true

julia> @code_llvm dump_module=true raw=true test(true, x for x in 1:10 if iseven(x))
; ModuleID = 'test'
source_filename = "test"
target datalayout = "e-m:e-i64:64-n32:64-v256:256:256-v512:512:512"
target triple = "powerpc64le-unknown-linux-gnu"

;  @ REPL[54]:1 within `test`
define i8 @julia_test_3023(i8 zeroext %0, { { [2 x i64] } } addrspace(11)* nocapture nonnull readonly align 8 dereferenceable(16) %1) #0 !dbg !5 {
top:
;  @ REPL[54]:2 within `test`
; ┌ @ generator.jl:44 within `iterate` @ iterators.jl:470 @ range.jl:880
; │┌ @ range.jl:655 within `isempty`
; ││┌ @ range.jl:817 within `first`
; │││┌ @ Base.jl:38 within `getproperty`
      %2 = getelementptr inbounds { { [2 x i64] } }, { { [2 x i64] } } addrspace(11)* %1, i64 0, i32 0, i32 0, i64 0, !dbg !7
; ││└└
; ││┌ @ range.jl:822 within `last`
; │││┌ @ Base.jl:38 within `getproperty`
      %3 = getelementptr inbounds { { [2 x i64] } }, { { [2 x i64] } } addrspace(11)* %1, i64 0, i32 0, i32 0, i64 1, !dbg !24
; ││└└
; ││┌ @ operators.jl:378 within `>`
; │││┌ @ int.jl:83 within `<`
      %4 = load i64, i64 addrspace(11)* %3, align 8, !dbg !27, !tbaa !33
      %5 = load i64, i64 addrspace(11)* %2, align 8, !dbg !27, !tbaa !33
      %.not = icmp slt i64 %4, %5, !dbg !27
; │└└└
; │ @ generator.jl:44 within `iterate` @ iterators.jl:471
   br i1 %.not, label %L41, label %oksrem, !dbg !37

L26:                                              ; preds = %oksrem
; │ @ generator.jl:44 within `iterate` @ iterators.jl:475 @ range.jl:884
; │┌ @ promotion.jl:473 within `==`
    %6 = icmp eq i64 %value_phi515, %4, !dbg !38
; │└
   %7 = add i64 %value_phi515, 1, !dbg !41
; │ @ generator.jl:44 within `iterate` @ iterators.jl:471
   br i1 %6, label %L41, label %oksrem, !dbg !37

L41:                                              ; preds = %oksrem, %L26, %top
; └
;  @ REPL[54]:3 within `test`
  %merge = phi i8 [ %0, %top ], [ 0, %oksrem ], [ %0, %L26 ], !dbg !43
  ret i8 %merge, !dbg !43

oksrem:                                           ; preds = %top, %L26
  %value_phi515 = phi i64 [ %7, %L26 ], [ %5, %top ]
;  @ REPL[54]:2 within `test`
; ┌ @ generator.jl:44 within `iterate` @ iterators.jl:472
; │┌ @ none within `#68`
; ││┌ @ int.jl:133 within `iseven`
; │││┌ @ number.jl:42 within `iszero`
; ││││┌ @ promotion.jl:473 within `==`
       %8 = and i64 %value_phi515, 1, !dbg !44
       %.not14 = icmp eq i64 %8, 0, !dbg !44
; │└└└└
   br i1 %.not14, label %L41, label %L26, !dbg !53
; └
}

define nonnull {} addrspace(10)* @jfptr_test_3024({} addrspace(10)* %0, {} addrspace(10)** %1, i32 %2) #1 {
top:
  %gcframe2 = alloca [3 x {} addrspace(10)*], align 16
  %gcframe2.sub = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %gcframe2, i64 0, i64 0
  %3 = bitcast [3 x {} addrspace(10)*]* %gcframe2 to i8*
  call void @llvm.memset.p0i8.i32(i8* nonnull align 16 dereferenceable(24) %3, i8 0, i32 24, i1 false), !tbaa !54
  %4 = call {}*** inttoptr (i64 268437872 to {}*** ()*)() #6
  %5 = bitcast [3 x {} addrspace(10)*]* %gcframe2 to i64*
  store i64 4, i64* %5, align 16, !tbaa !54
  %6 = load {}**, {}*** %4, align 8
  %7 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %gcframe2, i64 0, i64 1
  %8 = bitcast {} addrspace(10)** %7 to {}***
  store {}** %6, {}*** %8, align 8, !tbaa !54
  %9 = bitcast {}*** %4 to {} addrspace(10)***
  store {} addrspace(10)** %gcframe2.sub, {} addrspace(10)*** %9, align 8
  %10 = bitcast {} addrspace(10)** %1 to i8 addrspace(10)**
  %11 = load i8 addrspace(10)*, i8 addrspace(10)** %10, align 8, !nonnull !4, !dereferenceable !56, !align !56
  %12 = addrspacecast i8 addrspace(10)* %11 to i8 addrspace(11)*
  %13 = load i8, i8 addrspace(11)* %12, align 1
  %14 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)** %1, i64 1
  %15 = bitcast {} addrspace(10)** %14 to { { [2 x i64] } } addrspace(10)**
  %16 = load { { [2 x i64] } } addrspace(10)*, { { [2 x i64] } } addrspace(10)** %15, align 8, !nonnull !4, !dereferenceable !57, !align !58
  %17 = addrspacecast { { [2 x i64] } } addrspace(10)* %16 to { { [2 x i64] } } addrspace(11)*
  %18 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %gcframe2, i64 0, i64 2
  %19 = bitcast {} addrspace(10)** %18 to { { [2 x i64] } } addrspace(10)**
  store { { [2 x i64] } } addrspace(10)* %16, { { [2 x i64] } } addrspace(10)** %19, align 16
  %20 = call i8 @julia_test_3023(i8 zeroext %13, { { [2 x i64] } } addrspace(11)* nocapture readonly %17) #0
  %21 = and i8 %20, 1
  %.not = icmp eq i8 %21, 0
  %22 = select i1 %.not, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 140735092919696 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspacecast ({}* inttoptr (i64 140735093367936 to {}*) to {} addrspace(10)*)
  %23 = load {} addrspace(10)*, {} addrspace(10)** %7, align 8, !tbaa !54
  %24 = bitcast {}*** %4 to {} addrspace(10)**
  store {} addrspace(10)* %23, {} addrspace(10)** %24, align 8, !tbaa !54
  ret {} addrspace(10)* %22
}

; Function Attrs: noreturn
declare void @ijl_throw({} addrspace(12)*) #2

; Function Attrs: inaccessiblemem_or_argmemonly
declare void @ijl_gc_queue_root({} addrspace(10)*) #3

; Function Attrs: allocsize(1)
declare noalias nonnull {} addrspace(10)* @ijl_gc_pool_alloc(i8*, i32, i32) #4

; Function Attrs: allocsize(1)
declare noalias nonnull {} addrspace(10)* @ijl_gc_big_alloc(i8*, i64) #4

declare noalias nonnull {} addrspace(10)** @julia.new_gc_frame(i32)

declare void @julia.push_gc_frame({} addrspace(10)**, i32)

declare {} addrspace(10)** @julia.get_gc_frame_slot({} addrspace(10)**, i32)

declare void @julia.pop_gc_frame({} addrspace(10)**)

; Function Attrs: argmemonly nofree nosync nounwind willreturn writeonly
declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1 immarg) #5

attributes #0 = { "probe-stack"="inline-asm" }
attributes #1 = { "probe-stack"="inline-asm" "thunk" }
attributes #2 = { noreturn }
attributes #3 = { inaccessiblemem_or_argmemonly }
attributes #4 = { allocsize(1) }
attributes #5 = { argmemonly nofree nosync nounwind willreturn writeonly }
attributes #6 = { nounwind readnone }

!llvm.module.flags = !{!0, !1}
!llvm.dbg.cu = !{!2}

!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 2, !"Debug Info Version", i32 3}
!2 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !3, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: GNU)
!3 = !DIFile(filename: "REPL[54]", directory: ".")
!4 = !{}
!5 = distinct !DISubprogram(name: "test", linkageName: "julia_test_3023", scope: null, file: !3, line: 1, type: !6, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!6 = !DISubroutineType(types: !4)
!7 = !DILocation(line: 38, scope: !8, inlinedAt: !10)
!8 = distinct !DISubprogram(name: "getproperty;", linkageName: "getproperty", scope: !9, file: !9, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!9 = !DIFile(filename: "Base.jl", directory: ".")
!10 = !DILocation(line: 817, scope: !11, inlinedAt: !13)
!11 = distinct !DISubprogram(name: "first;", linkageName: "first", scope: !12, file: !12, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!12 = !DIFile(filename: "range.jl", directory: ".")
!13 = !DILocation(line: 655, scope: !14, inlinedAt: !15)
!14 = distinct !DISubprogram(name: "isempty;", linkageName: "isempty", scope: !12, file: !12, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!15 = !DILocation(line: 880, scope: !16, inlinedAt: !17)
!16 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !12, file: !12, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!17 = !DILocation(line: 470, scope: !18, inlinedAt: !20)
!18 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !19, file: !19, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!19 = !DIFile(filename: "iterators.jl", directory: ".")
!20 = !DILocation(line: 44, scope: !21, inlinedAt: !23)
!21 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !22, file: !22, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!22 = !DIFile(filename: "generator.jl", directory: ".")
!23 = !DILocation(line: 2, scope: !5)
!24 = !DILocation(line: 38, scope: !8, inlinedAt: !25)
!25 = !DILocation(line: 822, scope: !26, inlinedAt: !13)
!26 = distinct !DISubprogram(name: "last;", linkageName: "last", scope: !12, file: !12, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!27 = !DILocation(line: 83, scope: !28, inlinedAt: !30)
!28 = distinct !DISubprogram(name: "<;", linkageName: "<", scope: !29, file: !29, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!29 = !DIFile(filename: "int.jl", directory: ".")
!30 = !DILocation(line: 378, scope: !31, inlinedAt: !13)
!31 = distinct !DISubprogram(name: ">;", linkageName: ">", scope: !32, file: !32, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!32 = !DIFile(filename: "operators.jl", directory: ".")
!33 = !{!34, !34, i64 0}
!34 = !{!"jtbaa_const", !35, i64 0}
!35 = !{!"jtbaa", !36, i64 0}
!36 = !{!"jtbaa"}
!37 = !DILocation(line: 471, scope: !18, inlinedAt: !20)
!38 = !DILocation(line: 473, scope: !39, inlinedAt: !41)
!39 = distinct !DISubprogram(name: "==;", linkageName: "==", scope: !40, file: !40, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!40 = !DIFile(filename: "promotion.jl", directory: ".")
!41 = !DILocation(line: 884, scope: !16, inlinedAt: !42)
!42 = !DILocation(line: 475, scope: !18, inlinedAt: !20)
!43 = !DILocation(line: 3, scope: !5)
!44 = !DILocation(line: 473, scope: !39, inlinedAt: !45)
!45 = !DILocation(line: 42, scope: !46, inlinedAt: !48)
!46 = distinct !DISubprogram(name: "iszero;", linkageName: "iszero", scope: !47, file: !47, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!47 = !DIFile(filename: "number.jl", directory: ".")
!48 = !DILocation(line: 133, scope: !49, inlinedAt: !50)
!49 = distinct !DISubprogram(name: "iseven;", linkageName: "iseven", scope: !29, file: !29, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!50 = !DILocation(line: 0, scope: !51, inlinedAt: !53)
!51 = distinct !DISubprogram(name: "#68;", linkageName: "#68", scope: !52, file: !52, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!52 = !DIFile(filename: "none", directory: ".")
!53 = !DILocation(line: 472, scope: !18, inlinedAt: !20)
!54 = !{!55, !55, i64 0}
!55 = !{!"jtbaa_gcframe", !35, i64 0}
!56 = !{i64 1}
!57 = !{i64 16}
!58 = !{i64 8}

vchuravy commented 2 years ago

@nemanjai I just noticed that we don't have zeroext on the return value. That shouldn't matter here, but could it be a general issue?
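
To make the IR for `julia_test_3023` above easier to follow, here is a plain-Julia model of its control flow. This is a hand translation, not generated code: the function name `test_model` and the argument names `lo` and `hi` are invented for illustration, and the block labels in the comments refer to the labels in the IR dump.

```julia
# Hand-written model of the basic blocks in `julia_test_3023`.
# `lo` and `hi` stand for the two Int64 fields loaded from the range.
function test_model(dest::Bool, lo::Int, hi::Int)
    # top: an empty range (hi < lo) jumps to L41 and returns `dest`
    hi < lo && return dest
    x = lo
    while true
        # oksrem: an even element jumps to L41 with the constant 0 (false)
        (x & 1) == 0 && return false
        # L26: reaching the last element without a match returns `dest`
        x == hi && return dest
        x += 1
    end
end

println(test_model(true, 1, 10))  # false: 2 is the first even element
println(test_model(true, 1, 1))   # true: no even element in 1:1
println(test_model(true, 3, 2))   # true: empty range
```

Read this way, the IR returns `false` for the range `1:10`, whereas the REPL session above shows `true` on Power9. That would place the problem after this IR is produced, which fits the earlier suspicion of an LLVM issue, although that conclusion is an inference from the listing rather than something established at this point in the discussion.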

nemanjai commented 2 years ago

> @nemanjai I can confirm that this got fixed in Julia 1.8, by upgrading to LLVM 13
>
> e.g 4c45f29 is broken and 4ebca2f is fixed. If we can identify the patch, I can backport it to 1.7

I can certainly bisect this.

P.S. Can I have a main for this so I can run it after building with the good/bad compiler?

vchuravy commented 2 years ago

> P.S. Can I have a main for this so I can run it after building with the good/bad compiler?

Sure thing:

; ModuleID = 'test'
source_filename = "test"
target datalayout = "e-m:e-i64:64-n32:64-v256:256:256-v512:512:512"
target triple = "powerpc64le-unknown-linux-gnu"

;  @ REPL[2]:1 within `test`
define i8 @julia_test_113(i8 zeroext %0, { { [2 x i64] } }* nocapture nonnull readonly align 8 dereferenceable(16) %1) #0 !dbg !5 {
top:
;  @ REPL[2]:2 within `test`
; ┌ @ generator.jl:44 within `iterate` @ iterators.jl:470 @ range.jl:880
; │┌ @ range.jl:655 within `isempty`
; ││┌ @ range.jl:817 within `first`
; │││┌ @ Base.jl:38 within `getproperty`
      %2 = getelementptr inbounds { { [2 x i64] } }, { { [2 x i64] } }* %1, i64 0, i32 0, i32 0, i64 0, !dbg !7
; ││└└
; ││┌ @ range.jl:822 within `last`
; │││┌ @ Base.jl:38 within `getproperty`
      %3 = getelementptr inbounds { { [2 x i64] } }, { { [2 x i64] } }* %1, i64 0, i32 0, i32 0, i64 1, !dbg !24
; ││└└
; ││┌ @ operators.jl:378 within `>`
; │││┌ @ int.jl:83 within `<`
      %4 = load i64, i64* %3, align 8, !dbg !27, !tbaa !33
      %5 = load i64, i64* %2, align 8, !dbg !27, !tbaa !33
      %.not = icmp slt i64 %4, %5, !dbg !27
; │└└└
; │ @ generator.jl:44 within `iterate` @ iterators.jl:471
   br i1 %.not, label %L41, label %oksrem, !dbg !37

L26:                                              ; preds = %oksrem
; │ @ generator.jl:44 within `iterate` @ iterators.jl:475 @ range.jl:884
; │┌ @ promotion.jl:473 within `==`
    %6 = icmp eq i64 %value_phi515, %4, !dbg !38
; │└
   %7 = add i64 %value_phi515, 1, !dbg !41
; │ @ generator.jl:44 within `iterate` @ iterators.jl:471
   br i1 %6, label %L41, label %oksrem, !dbg !37

L41:                                              ; preds = %oksrem, %L26, %top
; └
;  @ REPL[2]:3 within `test`
  %merge = phi i8 [ %0, %top ], [ 0, %oksrem ], [ %0, %L26 ], !dbg !43
  ret i8 %merge, !dbg !43

oksrem:                                           ; preds = %top, %L26
  %value_phi515 = phi i64 [ %7, %L26 ], [ %5, %top ]
;  @ REPL[2]:2 within `test`
; ┌ @ generator.jl:44 within `iterate` @ iterators.jl:472
; │┌ @ none within `#10`
; ││┌ @ int.jl:133 within `iseven`
; │││┌ @ number.jl:42 within `iszero`
; ││││┌ @ promotion.jl:473 within `==`
       %8 = and i64 %value_phi515, 1, !dbg !44
       %.not14 = icmp eq i64 %8, 0, !dbg !44
; │└└└└
   br i1 %.not14, label %L41, label %L26, !dbg !53
; └
}

define i8 @main() {
  %arg = alloca {{[2 x i64]}}
  %start = getelementptr inbounds { { [2 x i64] } }, { { [2 x i64] } }* %arg, i64 0, i32 0, i32 0, i64 0
  %end = getelementptr inbounds { { [2 x i64] } }, { { [2 x i64] } }* %arg, i64 0, i32 0, i32 0, i64 1
  store i64 1, i64* %start
  store i64 10, i64* %end
  %ret = call i8 @julia_test_113(i8 zeroext 1, {{[2 x i64]}}* %arg)
  ret i8 %ret
}

; Function Attrs: noreturn
declare void @ijl_throw({} addrspace(12)*) #2

; Function Attrs: inaccessiblemem_or_argmemonly
declare void @ijl_gc_queue_root({} addrspace(10)*) #3

; Function Attrs: allocsize(1)
declare noalias nonnull {} addrspace(10)* @ijl_gc_pool_alloc(i8*, i32, i32) #4

; Function Attrs: allocsize(1)
declare noalias nonnull {} addrspace(10)* @ijl_gc_big_alloc(i8*, i64) #4

declare noalias nonnull {} addrspace(10)** @julia.new_gc_frame(i32)

declare void @julia.push_gc_frame({} addrspace(10)**, i32)

declare {} addrspace(10)** @julia.get_gc_frame_slot({} addrspace(10)**, i32)

declare void @julia.pop_gc_frame({} addrspace(10)**)

; Function Attrs: argmemonly nofree nosync nounwind willreturn writeonly
declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1 immarg) #5

attributes #0 = { "probe-stack"="inline-asm" }
attributes #1 = { "probe-stack"="inline-asm" "thunk" }
attributes #2 = { noreturn }
attributes #3 = { inaccessiblemem_or_argmemonly }
attributes #4 = { allocsize(1) }
attributes #5 = { argmemonly nofree nosync nounwind willreturn writeonly }
attributes #6 = { nounwind readnone }

!llvm.module.flags = !{!0, !1}
!llvm.dbg.cu = !{!2}

!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 2, !"Debug Info Version", i32 3}
!2 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !3, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: GNU)
!3 = !DIFile(filename: "REPL[2]", directory: ".")
!4 = !{}
!5 = distinct !DISubprogram(name: "test", linkageName: "julia_test_113", scope: null, file: !3, line: 1, type: !6, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!6 = !DISubroutineType(types: !4)
!7 = !DILocation(line: 38, scope: !8, inlinedAt: !10)
!8 = distinct !DISubprogram(name: "getproperty;", linkageName: "getproperty", scope: !9, file: !9, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!9 = !DIFile(filename: "Base.jl", directory: ".")
!10 = !DILocation(line: 817, scope: !11, inlinedAt: !13)
!11 = distinct !DISubprogram(name: "first;", linkageName: "first", scope: !12, file: !12, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!12 = !DIFile(filename: "range.jl", directory: ".")
!13 = !DILocation(line: 655, scope: !14, inlinedAt: !15)
!14 = distinct !DISubprogram(name: "isempty;", linkageName: "isempty", scope: !12, file: !12, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!15 = !DILocation(line: 880, scope: !16, inlinedAt: !17)
!16 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !12, file: !12, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!17 = !DILocation(line: 470, scope: !18, inlinedAt: !20)
!18 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !19, file: !19, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!19 = !DIFile(filename: "iterators.jl", directory: ".")
!20 = !DILocation(line: 44, scope: !21, inlinedAt: !23)
!21 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !22, file: !22, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!22 = !DIFile(filename: "generator.jl", directory: ".")
!23 = !DILocation(line: 2, scope: !5)
!24 = !DILocation(line: 38, scope: !8, inlinedAt: !25)
!25 = !DILocation(line: 822, scope: !26, inlinedAt: !13)
!26 = distinct !DISubprogram(name: "last;", linkageName: "last", scope: !12, file: !12, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!27 = !DILocation(line: 83, scope: !28, inlinedAt: !30)
!28 = distinct !DISubprogram(name: "<;", linkageName: "<", scope: !29, file: !29, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!29 = !DIFile(filename: "int.jl", directory: ".")
!30 = !DILocation(line: 378, scope: !31, inlinedAt: !13)
!31 = distinct !DISubprogram(name: ">;", linkageName: ">", scope: !32, file: !32, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!32 = !DIFile(filename: "operators.jl", directory: ".")
!33 = !{!34, !34, i64 0}
!34 = !{!"jtbaa_const", !35, i64 0}
!35 = !{!"jtbaa", !36, i64 0}
!36 = !{!"jtbaa"}
!37 = !DILocation(line: 471, scope: !18, inlinedAt: !20)
!38 = !DILocation(line: 473, scope: !39, inlinedAt: !41)
!39 = distinct !DISubprogram(name: "==;", linkageName: "==", scope: !40, file: !40, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!40 = !DIFile(filename: "promotion.jl", directory: ".")
!41 = !DILocation(line: 884, scope: !16, inlinedAt: !42)
!42 = !DILocation(line: 475, scope: !18, inlinedAt: !20)
!43 = !DILocation(line: 3, scope: !5)
!44 = !DILocation(line: 473, scope: !39, inlinedAt: !45)
!45 = !DILocation(line: 42, scope: !46, inlinedAt: !48)
!46 = distinct !DISubprogram(name: "iszero;", linkageName: "iszero", scope: !47, file: !47, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!47 = !DIFile(filename: "number.jl", directory: ".")
!48 = !DILocation(line: 133, scope: !49, inlinedAt: !50)
!49 = distinct !DISubprogram(name: "iseven;", linkageName: "iseven", scope: !29, file: !29, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!50 = !DILocation(line: 0, scope: !51, inlinedAt: !53)
!51 = distinct !DISubprogram(name: "#10;", linkageName: "#10", scope: !52, file: !52, type: !6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!52 = !DIFile(filename: "none", directory: ".")
!53 = !DILocation(line: 472, scope: !18, inlinedAt: !20)
!54 = !{!55, !55, i64 0}
!55 = !{!"jtbaa_gcframe", !35, i64 0}
!56 = !{i64 1}
!57 = !{i64 16}
!58 = !{i64 8}

On LLVM 12 (https://github.com/JuliaLang/llvm-project/releases/tag/julia-12.0.1-4):

[vchuravy@satori-login-001 jl]$ usr/tools/lli --version
LLVM (http://llvm.org/):
  LLVM version 12.0.1jl
  Optimized build.
  Default target: powerpc64le-linux-gnu
  Host CPU: pwr9
[vchuravy@satori-login-001 jl]$ usr/tools/lli test.ll; echo $?
1

On LLVM 13 (https://github.com/JuliaLang/llvm-project/releases/tag/julia-13.0.0-3):

[vchuravy@satori-login-001 jl]$ usr/tools/lli test.ll; echo $?
0
[vchuravy@satori-login-001 jl]$ usr/tools/lli --version
LLVM (http://llvm.org/):
  LLVM version 13.0.0jl
  Optimized build.
  Default target: powerpc64le-linux-gnu
  Host CPU: pwr9

nemanjai commented 2 years ago

This appears to have been fixed by 34badc409cc452575c538c4b6449546adc38f121. In other words, this bug appears to be the same issue as https://bugs.llvm.org/show_bug.cgi?id=51714
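
For anyone rebuilding Julia against a newer LLVM on Power9, a minimal sketch of a regression check is shown below. It only reuses the expressions from the original report; the testset name is made up for illustration, and the expected values follow from the definitions of `iseven` and `isodd` on `1:10`, not from any output posted in this thread.

```julia
# Regression check based on the expressions in the original report.
# The testset name is hypothetical; nothing here is specific to Power9.
using Test

@testset "parity filter over a range" begin
    expected_even = [2, 4, 6, 8, 10]

    # These two returned Int64[] on the affected v1.7.x Power9 builds.
    @test [x for x in 1:10 if iseven(x)] == expected_even
    @test collect(x for x in 1:10 if x % 2 == 0) == expected_even

    # Control cases that already behaved correctly in the report.
    @test [x for x in 1:10 if isodd(x)] == [1, 3, 5, 7, 9]
    @test [x for x in [1, 2, 3, 4] if iseven(x)] == [2, 4]
end
```

If the first two checks fail while the control cases pass, the build is probably still hitting the behavior described at the top of this issue.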