Open atbug opened 4 years ago
Can you check the output with Profile.print(C=true)
? Sometimes backtraces get truncated, which can lead to misleading effects in the counts.
This time, the output of @time
is
2.325471 seconds (40 allocations: 305.177 MiB, 8.40% gc time)
1.798511 seconds (20 allocations: 152.589 MiB, 0.29% gc time)
and the Profile.print(C=true, mincount=20)
gives
347 ...ldworker/worker/package_linux64/build/src/task.c:596; start_task
347 ...dworker/worker/package_linux64/build/src/julia.h:1614; jl_apply
347 /buildworker/worker/package_linux64/build/src/gf.c:2191; jl_apply_generic
347 unknown stackframe
347 ./task.jl:268; (::getfield(Atom, Symbol("##19#21")){Array{Any,1}})()
347 ...ker/worker/package_linux64/build/src/builtins.c:563; jl_f__apply
347 ...orker/worker/package_linux64/build/src/julia.h:1614; jl_apply
347 ...dworker/worker/package_linux64/build/src/gf.c:2191; jl_apply_generic
347 unknown stackframe
347 .../wangc/.julia/packages/Atom/X8fAI/src/comm.jl:164; handlemsg(::Dict{String,Any}, ::Dict{String,Any})
347 ...worker/worker/package_linux64/build/src/gf.c:2191; jl_apply_generic
347 ...c/.julia/packages/Revise/439di/src/Revise.jl:1123; (::getfield(Revise, Symbol("##89#90")){getfield(Atom, Symbol("##124...
347 ...wangc/.julia/packages/Atom/X8fAI/src/eval.jl:86; (::getfield(Atom, Symbol("##124#129")))(::Dict{String,Any})
347 ...c/.julia/packages/Media/ItEPc/src/dynamic.jl:24; macro expansion
347 ...angc/.julia/packages/Atom/X8fAI/src/eval.jl:91; macro expansion
347 ...rker/worker/package_linux64/build/src/gf.c:2191; jl_apply_generic
347 ...angc/.julia/packages/Atom/X8fAI/src/repl.jl:85; hideprompt(::getfield(Atom, Symbol("##125#130")){String,Int64,Stri...
347 ...ngc/.julia/packages/Atom/X8fAI/src/eval.jl:92; #125
347 ./logging.jl:491; with_logger
347 ./logging.jl:395; with_logstate(::getfield(Atom, Symbol("##126#131")){String,Int6...
347 ...gc/.julia/packages/Atom/X8fAI/src/eval.jl:93; #126
347 ...gc/.julia/packages/Atom/X8fAI/src/eval.jl:47; withpath
347 ...ia/packages/CodeTools/xGemk/src/utils.jl:30; withpath(::getfield(Atom, Symbol("##127#132")){String,Int64,St...
347 ...c/.julia/packages/Atom/X8fAI/src/eval.jl:94; (::getfield(Atom, Symbol("##127#132")){String,Int64,String,Bo...
347 ...r/worker/package_linux64/build/src/gf.c:2191; jl_apply_generic
347 unknown stackframe
347 ...ia/packages/CodeTools/xGemk/src/eval.jl:30; include_string(::Module, ::String, ::String, ::Int64)
347 ./loading.jl:1064; include_string(::Module, ::String, ::String)
347 ...orker/package_linux64/build/src/ast.c:873; jl_parse_eval_all
347 .../package_linux64/build/src/toplevel.c:815; jl_toplevel_eval_flex
347 ...ckage_linux64/build/src/interpreter.c:893; jl_interpret_toplevel_thunk
347 unknown stackframe
347 unknown stackframe
347 unknown stackframe
347 ...age_linux64/build/src/interpreter.c:884; jl_interpret_toplevel_thunk_callback
347 ...age_linux64/build/src/interpreter.c:699; eval_body
347 ...ge_linux64/build/src/interpreter.c:635; eval_body
347 ...ge_linux64/build/src/interpreter.c:411; eval_value
347 ...e_linux64/build/src/interpreter.c:323; do_call
347 ...er/package_linux64/build/src/gf.c:2191; jl_apply_generic
285 ...rkspace/Hop.jl/playground/perm.jl:22; foobar(::Array{Complex{Float64},2}, ::Array{Complex...
285 ./util.jl:156; macro expansion
285 ...kspace/Hop.jl/playground/perm.jl:10; foo(::Array{Complex{Float64},2}, ::Array{Complex{F...
285 ./operators.jl:529; *
125 ./arraymath.jl:52; *
125 ./broadcast.jl:752; broadcast(::typeof(*), ::Float64, ::Array{Comple...
96 ./boot.jl:406; materialize
96 ...kage_linux64/build/src/array.c:427; jl_alloc_array_2d
96 ...kage_linux64/build/src/array.c:160; _new_array
96 ...age_linux64/build/src/array.c:112; _new_array_
96 ...64/build/src/julia_internal.h:274; jl_gc_alloc_
96 ...ckage_linux64/build/src/gc.c:1111; jl_gc_pool_alloc
96 ...ckage_linux64/build/src/gc.c:2912; jl_gc_collect
80 ...kage_linux64/build/src/gc.c:2729; _jl_gc_collect
39 ...kage_linux64/build/src/gc.c:1999; gc_mark_loop
30 ...kage_linux64/build/src/gc.c:1732; gc_mark_scan_obj8
29 ./broadcast.jl:798; materialize
29 ./broadcast.jl:818; copy
29 ./broadcast.jl:842; copyto!
29 ./broadcast.jl:887; copyto!
25 ./simdloop.jl:77; macro expansion
25 ./broadcast.jl:888; macro expansion
20 ./broadcast.jl:558; getindex
20 ./broadcast.jl:598; _broadcast_getindex
20 ./broadcast.jl:625; _broadcast_getindex_evalf
20 ./complex.jl:312; *
20 ./float.jl:399; *
100 ./boot.jl:406; *
100 ...ckage_linux64/build/src/array.c:427; jl_alloc_array_2d
100 ...kage_linux64/build/src/array.c:160; _new_array
100 ...kage_linux64/build/src/array.c:112; _new_array_
100 ...64/build/src/julia_internal.h:274; jl_gc_alloc_
100 ...ackage_linux64/build/src/gc.c:1111; jl_gc_pool_alloc
100 ...ckage_linux64/build/src/gc.c:2912; jl_gc_collect
68 ...ckage_linux64/build/src/gc.c:2729; _jl_gc_collect
26 ...ckage_linux64/build/src/gc.c:1999; gc_mark_loop
22 ...kage_linux64/build/src/gc.c:1732; gc_mark_scan_obj8
60 ...1.2/LinearAlgebra/src/matmul.jl:143; *
60 ...1.2/LinearAlgebra/src/matmul.jl:145; mul!
60 ...1.2/LinearAlgebra/src/matmul.jl:464; gemm_wrapper!(::Array{Complex{Float64},2}, ::Cha...
60 ...v1.2/LinearAlgebra/src/blas.jl:1131; gemm!(::Char, ::Char, ::Complex{Float64}, ::Ar...
60 .../../lib/julia/libopenblas64_.so:?; zgemm_64_
60 ...../lib/julia/libopenblas64_.so:?; zgemm_nn
21 ...../lib/julia/libopenblas64_.so:?; zgemm_itcopy_HASWELL
39 ...../lib/julia/libopenblas64_.so:?; zgemm_oncopy_HASWELL
62 ...rkspace/Hop.jl/playground/perm.jl:23; foobar(::Array{Complex{Float64},2}, ::Array{Complex...
62 ./util.jl:156; macro expansion
62 ...rkspace/Hop.jl/playground/perm.jl:17; bar
62 ...v1.2/LinearAlgebra/src/matmul.jl:143; *
57 ...1.2/LinearAlgebra/src/matmul.jl:145; mul!
57 ...1.2/LinearAlgebra/src/matmul.jl:464; gemm_wrapper!(::Array{Complex{Float64},2}, ::Cha...
57 ...v1.2/LinearAlgebra/src/blas.jl:1131; gemm!(::Char, ::Char, ::Complex{Float64}, ::Arr...
57 .../../lib/julia/libopenblas64_.so:?; zgemm_64_
57 .../../lib/julia/libopenblas64_.so:?; zgemm_nn
38 ...../lib/julia/libopenblas64_.so:?; zgemm_oncopy_HASWELL
57 ...s/julia-1.2.0/bin/../lib/julia/libopenblas64_.so:?; zgemm_beta_HASWELL
3720 ...s/julia-1.2.0/bin/../lib/julia/libopenblas64_.so:?; zgemm_kernel_n_HASWELL
A little bit messy. foo
takes 285 and bar
takes 62, if read this thing right.
Actually, that last lines says they took almost 2k (about 2 seconds), but had slightly different setup times (approximately 250 ms vs 50 ms)
Actually, that last lines says they took almost 2k (about 2 seconds), but had slightly different setup times (approximately 250 ms vs 50 ms)
Sorry, I don't understand how you get this number, the last line is 3720. Did you divide by two assuming both functions take roughly the same time in blas?
yes
Is there any workaround for this? It is not really possible to manually assign these BLAS calls to different functions in a complex program.
OoenBlas doesn't provide correct unwind information for their assembly kernels. It's something that could be fixed, but nobody has tried yet.
I posted the issue on discourse (https://discourse.julialang.org/t/inaccurate-sampling-of-profiler/30030) but was not getting any attention. Since this is possibly a bug, I report it here with a better illustration:
The
@time
macro reports the execution time forfoo
andbar
isbut the output of
Profile.print
is insteadwhich clearly indicates that the function
foo
is taking roughly twice the time ofbar
. Therefore the two method of performance measurement is inconsistent.Edit: I realized I rebuilt system image and was using revise, but I can reproduce the issue above with the official binary and without revise in REPL.