Open lkuper opened 8 years ago
So far I haven't succeeded in reproducing this. On 0.4.6, I tried:
Pkg.update()
using ParallelAccelerator
ParallelAccelerator.embed()
# restart julia
include("/home/jeff/.julia/v0.4/ParallelAccelerator/examples/gaussian-blur/gaussian-blur.jl")
Should I Pkg.checkout("ParallelAccelerator")?
@JeffBezanson Yes, try Pkg.checkout("ParallelAccelerator"); Pkg.checkout("CompilerTools").
Just reproduced it again with:
ParallelAccelerator at https://github.com/IntelLabs/ParallelAccelerator.jl/commit/1a5cbddf48194694b5ac3901ff0da846c4e56fdd
CompilerTools at https://github.com/IntelLabs/CompilerTools.jl/commit/0eb30bc2ab252122aeb7ba0f35cb3419397d9eac
Julia at 0.4.6
This time I got "failed to precompile Images" as well as Colors:
Pkg.update()
using ParallelAccelerator
importall ParallelAccelerator
ParallelAccelerator.embed()
# restart julia
include("/home/lkuper/.julia/v0.4/ParallelAccelerator/examples/gaussian-blur/gaussian-blur.jl")
# recompiling .ji files; lots of warning messages about incremental compilation being broken
ERROR: LoadError: LoadError: error compiling convert: error compiling cnvt: error compiling invert_rgb_compand: Unsupported Float Size
in include at ./boot.jl:261
in include_from_node1 at ./loading.jl:320
in include at ./boot.jl:261
in include_from_node1 at ./loading.jl:320
[inlined code] from none:2
in anonymous at no file:0
in process_options at ./client.jl:257
in _start at ./client.jl:378
while loading /home/lkuper/.julia/v0.4/Colors/src/algorithms.jl, in expression starting on line 66
while loading /home/lkuper/.julia/v0.4/Colors/src/Colors.jl, in expression starting on line 28
ERROR: LoadError: Failed to precompile Colors to /home/lkuper/.julia/lib/v0.4/Colors.ji
in error at ./error.jl:21
in compilecache at loading.jl:400
in require at ./loading.jl:240
in include at ./boot.jl:261
in include_from_node1 at ./loading.jl:320
[inlined code] from none:2
in anonymous at no file:0
in process_options at ./client.jl:257
in _start at ./client.jl:378
while loading /home/lkuper/.julia/v0.4/Images/src/Images.jl, in expression starting on line 25
ERROR: LoadError: Failed to precompile Images to /home/lkuper/.julia/lib/v0.4/Images.ji
in error at ./error.jl:21
in compilecache at loading.jl:400
in recompile_stale at loading.jl:476
in _require_from_serialized at loading.jl:83
in _require_from_serialized at ./loading.jl:109
in require at ./loading.jl:235
in include at ./boot.jl:261
in include_from_node1 at ./loading.jl:320
while loading /home/lkuper/.julia/v0.4/ParallelAccelerator/examples/gaussian-blur/gaussian-blur.jl, in expression starting on line 28
Here's a more minimal repro, at least:
julia> using FixedPointNumbers
julia> ufixed8(0.5)+0.5
ERROR: error compiling +: Unsupported Float Size
julia> @code_typed ufixed8(0.5)+0.5
1-element Array{Any,1}:
:($(Expr(:lambda, Any[:x,:y], Any[Any[Any[:x,FixedPointNumbers.UFixed{UInt8,8},0],Any[:y,Float64,0]],Any[],Any[],Any[]], :(begin # promotion.jl, line 167:
return (Base.box)(Base.Float64,(Base.add_float)(x::FixedPointNumbers.UFixed{UInt8,8},(Base.box)(Float64,(Base.sitofp)(Float64,y::Float64))))
end::Float64))))
Superficially, it looks like it's trying to use code optimized for Float64 + Int for UFixed8 + Float64 arguments.
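For anyone who wants to check quickly whether their build is affected, here is a minimal sketch of a probe. `probe` is a hypothetical helper, not part of ParallelAccelerator or Base, and it assumes the "error compiling" failure surfaces as an ordinary catchable exception:

```julia
# Hypothetical helper (not part of any package): run a zero-argument
# function and report whether it ran or threw.
function probe(f)
    try
        return (:ok, f())
    catch err
        return (:error, err)
    end
end

# Checks that need no packages; both go through mixed-type `+`.
println(probe(() -> 1 + false))
println(probe(() -> 1.5 + 2))

# With FixedPointNumbers installed, the failing expression from this
# thread can be probed the same way:
#   using FixedPointNumbers
#   println(probe(() -> ufixed8(0.5) + 0.5))
```

Running this once on a stock build and once on a build with userimg.jl present should show whether the mixed-type `+` path is the part that differs.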
Good, thanks. I see the same behavior. Why would including PA into the system image cause this?
I don't know yet; it certainly shouldn't cause it. It gets weirder:
julia> m = @which 1 + false
+(x::Number, y::Number) at promotion.jl:167
julia> m.func.code
AST(:($(Expr(:lambda, Any[:x,:y], Any[Any[Any[:x,Float64,0],Any[:y,Int64,0]],Any[],Any[],Any[]], :(begin # promotion.jl, line 167:
return (Base.box)(Base.Float64,(Base.add_float)(x::Float64,(Base.box)(Float64,(Base.sitofp)(Float64,y::Int64))))
end::Float64)))))
It looks like the original definition of this method of + has been overwritten.
@ninegua Do you know what's going on here?
The problem looks like it was caused by importall in embed(). So I tweaked it a bit further in commit 437e892; now we get past the above problems, but run into something new. This also seems to affect only stencil workloads. With the latest Julia 0.4-release branch, running gaussian-blur seems to get into an infinite loop of errors:
$ julia gaussian-blur.jl
INFO: Recompiling stale cache file /home/hliu54/PSE/julia_pkgs/lib/v0.4/Images.ji for module Images.
WARNING: eval from module J2CArray to Main:
Expr(:block, # line 69 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:function, Expr(:call, :j2c_array_size, Expr(:::, :arr, Expr(:curly, :Ptr, :Void)::Any)::Any, Expr(:::, :dim, :Int)::Any)::Any, Expr(:b
lock, # line 70 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, :l = Expr(:ccall, Expr(:tuple, Expr(:quote, :j2c_array_size)::Any, "/home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/../deps/libj2carray.so
.1.0")::Any, :Cuint, Expr(:tuple, Expr(:curly, :Ptr, :Void)::Any, :Cuint)::Any, :arr, Expr(:call, :convert, :Cuint, :dim)::Any)::Any, # line 72 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:return, Expr(
:call, :convert, :Int, :l)::Any)::Any)::Any)::Any, # line 77 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:function, Expr(:call, :j2c_array_to_pointer, Expr(:::, :arr, Expr(:curly, :Ptr, :Void)::Any)::An
y, Expr(:::, :own, :Bool)::Any)::Any, Expr(:block, # line 78 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:ccall, Expr(:tuple, Expr(:quote, :j2c_array_to_pointer)::Any, "/home/hliu54/PSE/julia_pkgs/v0.4/
ParallelAccelerator/src/../deps/libj2carray.so.1.0")::Any, Expr(:curly, :Ptr, :Void)::Any, Expr(:tuple, Expr(:curly, :Ptr, :Void)::Any, :Bool)::Any, :arr, :own)::Any)::Any)::Any, # line 84 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAcce
lerator/src/j2c-array.jl, Expr(:function, Expr(:call, :j2c_array_get, Expr(:::, :arr, Expr(:curly, :Ptr, :Void)::Any)::Any, Expr(:::, :idx, :Int)::Any, Expr(:::, :T, :Type)::Any)::Any, Expr(:block, # line 85 /home/hliu54/PSE/julia_pkg
s/v0.4/ParallelAccelerator/src/j2c-array.jl, :nbytes = Expr(:if, Expr(:call, :is, :T, Expr(:curly, :Ptr, :Void)::Any)::Any, 0, Expr(:call, :sizeof, :T)::Any)::Any, # line 86 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c
-array.jl, :_value = Expr(:call, :Array, :T, 1)::Any, # line 87 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:ccall, Expr(:tuple, Expr(:quote, :j2c_array_get)::Any, "/home/hliu54/PSE/julia_pkgs/v0.4/Para
llelAccelerator/src/../deps/libj2carray.so.1.0")::Any, :Void, Expr(:tuple, :Cint, Expr(:curly, :Ptr, :Void)::Any, :Cuint, Expr(:curly, :Ptr, :Void)::Any)::Any, Expr(:call, :convert, :Cint, :nbytes)::Any, :arr, Expr(:call, :convert, :C
uint, :idx)::Any, Expr(:call, :convert, Expr(:curly, :Ptr, :Void)::Any, Expr(:call, :pointer, :_value)::Any)::Any)::Any, # line 89 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:return, Expr(:ref, :_value
, 1)::Any)::Any)::Any)::Any, # line 94 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:function, Expr(:call, Expr(:curly, :j2c_array_set, :T)::Any, Expr(:::, :arr, Expr(:curly, :Ptr, :Void)::Any)::Any, Exp
r(:::, :idx, :Int)::Any, Expr(:::, :value, :T)::Any)::Any, Expr(:block, # line 95 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, :nbytes = Expr(:if, Expr(:call, :is, :T, Expr(:curly, :Ptr, :Void)::Any)::Any, 0,
Expr(:call, :sizeof, :T)::Any)::Any, # line 96 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, :_value = Expr(:if, Expr(:comparison, :nbytes, :==, 0)::Any, :value, Expr(:call, :convert, Expr(:curly, :Ptr, :Void
)::Any, Expr(:call, :pointer, Expr(:ref, :T, :value)::Any)::Any)::Any)::Any, # line 97 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:ccall, Expr(:tuple, Expr(:quote, :j2c_array_set)::Any, "/home/hliu54/P
SE/julia_pkgs/v0.4/ParallelAccelerator/src/../deps/libj2carray.so.1.0")::Any, :Void, Expr(:tuple, :Cint, Expr(:curly, :Ptr, :Void)::Any, :Cuint, Expr(:curly, :Ptr, :Void)::Any)::Any, Expr(:call, :convert, :Cint, :nbytes)::Any, :arr, E
xpr(:call, :convert, :Cuint, :idx)::Any, :_value)::Any)::Any)::Any, # line 107 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:function, Expr(:call, :j2c_array_delete, Expr(:::, :arr, Expr(:curly, :Ptr, :V
oid)::Any)::Any)::Any, Expr(:block, # line 108 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:ccall, Expr(:tuple, Expr(:quote, :j2c_array_delete)::Any, "/home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerato
r/src/../deps/libj2carray.so.1.0")::Any, :Void, Expr(:tuple, Expr(:curly, :Ptr, :Void)::Any)::Any, :arr)::Any)::Any)::Any, # line 115 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:function, Expr(:call, :
j2c_array_deref, Expr(:::, :arr, Expr(:curly, :Ptr, :Void)::Any)::Any)::Any, Expr(:block, # line 116 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:ccall, Expr(:tuple, Expr(:quote, :j2c_array_deref)::Any,
"/home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/../deps/libj2carray.so.1.0")::Any, :Void, Expr(:tuple, Expr(:curly, :Ptr, :Void)::Any)::Any, :arr)::Any)::Any)::Any, # line 120 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccele
rator/src/j2c-array.jl, Expr(:function, Expr(:call, :to_j2c_array, Expr(:::, :inp, :AbstractString)::Any, :ptr_array_dict, :mapAtypeKey, :j2c_array_new)::Any, Expr(:block, # line 121 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerato
r/src/j2c-array.jl, :arr = Expr(:call, :to_j2c_array, Expr(:., :inp, :data)::Any, :ptr_array_dict, :mapAtypeKey, :j2c_array_new)::Any, # line 122 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:ccall, Expr
(:tuple, Expr(:quote, :new_ascii_string)::Any, "/home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/../deps/libj2carray.so.1.0")::Any, Expr(:curly, :Ptr, :Void)::Any, Expr(:tuple, Expr(:curly, :Ptr, :Void)::Any)::Any, :arr)::Any)
::Any)::Any, # line 126 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:function, Expr(:call, :from_ascii_string, Expr(:::, :str, Expr(:curly, :Ptr, :Void)::Any)::Any, :ptr_array_dict)::Any, Expr(:block, #
line 127 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, :data = Expr(:ccall, Expr(:tuple, Expr(:quote, :from_ascii_string)::Any, "/home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/../deps/libj2carray.so
.1.0")::Any, Expr(:curly, :Ptr, :Void)::Any, Expr(:tuple, Expr(:curly, :Ptr, :Void)::Any)::Any, :str)::Any, # line 128 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, :arr = Expr(:call, :_from_j2c_array, :data,
:UInt8, 1, :ptr_array_dict)::Any, # line 129 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:ccall, Expr(:tuple, Expr(:quote, :delete_ascii_string)::Any, "/home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerat
or/src/../deps/libj2carray.so.1.0")::Any, :Void, Expr(:tuple, Expr(:curly, :Ptr, :Void)::Any)::Any, :str)::Any, # line 130 /home/hliu54/PSE/julia_pkgs/v0.4/ParallelAccelerator/src/j2c-array.jl, Expr(:return, :arr)::Any)::Any)::Any)::A
ny
** incremental compilation may be broken for this module **
....
The same warning then just repeats.
Other non-stencil tests and workloads work fine.
@ninegua Thanks. Your fix seems to work fine for gaussian-blur against the 0.4.6 release. I haven't tried against the latest 0.4-release branch yet -- building it now. I'll see if I can reproduce the looping behavior.
@ninegua It also works fine for me (after getting through a bunch of "eval from module J2CArray to Main" warnings) on the latest thing on the 0.4-release branch (Version 0.4.7-pre+3). So I think I can close this.
@JeffBezanson So apparently the issue was that we were using importall in userimg.jl, and Paul's change here fixed it. Something to keep in mind for the next time someone reports an issue like this.
Reopening since we're seeing more bugs in code that doesn't use ParallelAccelerator that only manifest when our userimg.jl is present. In particular, our plain Julia implementation of optical flow produces a BoundsError only when the PA userimg.jl is present. I'll try to come up with a minimal example to reproduce the problem.
Just making a note that the aforementioned issue with the plain Julia optical flow is still happening when our userimg.jl is present. Here's the error:
ERROR: LoadError: BoundsError: attempt to access 28x50x2 Array{Float32,3}:
[:, :, 1] =
NaN NaN NaN NaN NaN NaN NaN NaN … NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN … NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
⋮ ⋮ ⋱ ⋮
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN … NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN … NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
[:, :, 2] =
NaN NaN NaN NaN NaN NaN NaN NaN … NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN … NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
⋮ ⋮ ⋱ ⋮
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN … NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN … NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
at index [2,51,1]
in interpolate at /mnt/home/lkuper/pse-hpc/benchmarks2/opt-flow/src/hsof-port.jl:163
in interpolateFlow at /mnt/home/lkuper/pse-hpc/benchmarks2/opt-flow/src/hsof-port.jl:198
in multiScaleOpticalFlow at /mnt/home/lkuper/pse-hpc/benchmarks2/opt-flow/src/hsof-port.jl:368
in main at /mnt/home/lkuper/pse-hpc/benchmarks2/opt-flow/src/hsof-port.jl:424
in include at ./boot.jl:261
in include_from_node1 at ./loading.jl:320
in process_options at ./client.jl:280
in _start at ./client.jl:378
while loading /mnt/home/lkuper/pse-hpc/benchmarks2/opt-flow/src/hsof-port.jl, in expression starting on line 431
Still happening as of October 1, same error as above.
As of recently, using ParallelAccelerator.embed() causes several of our examples to break: not only the PA versions we distribute but also the plain Julia versions of the examples that we use internally for benchmarking. Here are the problems I've noticed so far (on Julia 0.4.6, with ParallelAccelerator and CompilerTools updated to master as of yesterday):
For gaussian-blur:
Same thing happens for the plain Julia version, and same thing happens for harris, in both plain and PA versions, as well as for the unreleased mri-recon example, in both plain and PA versions.
For opt-flow:
for the PA version. For our plain Julia version, it's the same problem with a shorter stack trace:
When I rm base/userimg.jl && make clean && make, all of these problems go away. However, without the userimg.jl hack, compile times are considerably worse than they were a few months ago (79s for opt-flow, for example). No good. :(
Is userimg.jl somehow interacting badly with other packages that have been precompiled to .ji files? It's possible that all this amounts to some classic unsolvable separate compilation problem. I hope I'm wrong and it's actually something boring and easy to fix.