JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.43k stars 5.45k forks source link

Strange type parameterization issue introduced around 30 June (or perhaps 16 June) #12063

Closed sbromberger closed 9 years ago

sbromberger commented 9 years ago

My LightGraphs.jl code started failing tests around 6/30. Before that date, it was working. The problem is in this code:

type MinCutVisitor{T} <: AbstractMASVisitor
  graph::SimpleGraph
  parities::AbstractArray{Bool,1}
  colormap::Vector{Int}
  bestweight::T
  cutweight::T
  visited::Integer
  distmx::AbstractArray{T, 2}
  vertices::Vector{Int}
end

function MinCutVisitor{T}(graph::SimpleGraph, distmx::AbstractArray{T, 2})
  n = nv(graph)
  MinCutVisitor(
    graph,
    falses(n),
    zeros(Int,n),
    typemax(T),
    zero(T),
    zero(Int),
    distmx,
    @compat(Vector{Int}())
)
end
...
function mincut{T}(
  graph::SimpleGraph,
  distmx::AbstractArray{T, 2}
 )

  visitor = MinCutVisitor(graph, distmx)     # <------ failure is here
  colormap = zeros(Int, nv(graph))

  traverse_graph(graph, T, MaximumAdjacency(), 1, visitor, colormap)

  return( visitor.parities, visitor.bestweight)
end

with this test:

g = Graph(8)

# Original example by Stoer

wedges = [
    (1, 2, 2.),
    (1, 5, 3.),
    (2, 3, 3.),
    (2, 5, 2.),
    (2, 6, 2.),
    (3, 4, 4.),
    (3, 7, 2.),
    (4, 7, 2.),
    (4, 8, 2.),
    (5, 6, 3.),
    (6, 7, 1.),
    (7, 8, 3.) ]

m = length(wedges)
eweights = spzeros(nv(g),nv(g))

for (s, d, w) in wedges
    add_edge!(g, s, d)
    eweights[s, d] = w
    eweights[d, s] = w
end
...
parity, bestcut = mincut(g, eweights)  # <------ failure is here

with this error:

ERROR: LoadError: LoadError: MethodError: `convert` has no method matching convert(::Type{LightGraphs.MinCutVisitor{T}}, ::LightGraphs.Graph, ::BitArray{1}, ::Array{Int64,1}, ::Float64, ::Float64, ::Int64, ::Base.SparseMatrix.SparseMatrixCSC{Float64,Int64}, ::Array{Int64,1})
This may have arisen from a call to the constructor LightGraphs.MinCutVisitor{T}(...),
since type constructors fall back to convert methods.
Closest candidates are:
  LightGraphs.MinCutVisitor{T}(::Union{LightGraphs.DiGraph,LightGraphs.Graph}, ::AbstractArray{Bool,1}, ::Array{Int64,1}, ::T, ::T, ::Integer, ::AbstractArray{T,2}, ::Array{Int64,1})
  LightGraphs.MinCutVisitor{T}(::Union{LightGraphs.DiGraph,LightGraphs.Graph}, ::AbstractArray{T,2})
  call{T}(::Type{T}, ::Any)

Now, the strange part: if I change the outer constructor in one of two ways, the tests pass:

1) if I explicitly call the inner constructor with {T}, it works 2) if I get rid of the {T} parameterization in the outer constructor definition (and change AbstractArray{T,2} to AbstractArray), it works.

The first one is confusing because I don't know why I'd have to pass the type to the inner constructor.

The second one is particularly confusing since I reference T in the code of the outer constructor. Yet the tests pass.

Even more confusing: if the distmx is a sparse array of Integers (instead of Floats as in the test), it works without any modification.

In any case, more details are here: https://groups.google.com/d/msg/julia-users/uAntZ-eJ7kY/Heqq32Lhr3EJ

sbromberger commented 9 years ago

More data showing the failure with floats but not with ints:

julia> LightGraphs.MinCutVisitor(g,spzeros(Float64,8,8))
ERROR: MethodError: `convert` has no method matching convert(::Type{LightGraphs.MinCutVisitor{T}}, ::LightGraphs.Graph, ::BitArray{1}, ::Array{Int64,1}, ::Float64, ::Float64, ::Int64, ::Base.SparseMatrix.SparseMatrixCSC{Float64,Int64}, ::Array{Int64,1})
This may have arisen from a call to the constructor LightGraphs.MinCutVisitor{T}(...),
since type constructors fall back to convert methods.
Closest candidates are:
  LightGraphs.MinCutVisitor{T}(::Union{LightGraphs.Graph,LightGraphs.DiGraph}, ::AbstractArray{Bool,1}, ::Array{Int64,1}, ::T, ::T, ::Integer, ::AbstractArray{T,2}, ::Array{Int64,1})
  LightGraphs.MinCutVisitor{T}(::Union{LightGraphs.Graph,LightGraphs.DiGraph}, ::AbstractArray{T,2})
  call{T}(::Type{T}, ::Any)
  ...
 in call at /Users/seth/.julia/v0.4/LightGraphs/src/maxadjvisit.jl:99

julia> LightGraphs.MinCutVisitor(g,spzeros(Int,8,8))
LightGraphs.MinCutVisitor{Int64}({8, 0} undirected graph,Bool[false,false,false,false,false,false,false,false],[0,0,0,0,0,0,0,0],9223372036854775807,0,0,8x8 sparse matrix with 0 Int64 entries:,Int64[])
rened commented 9 years ago

perhaps related to / the same as #11587?

vtjnash commented 9 years ago

maybe the inliner is having trouble correctly uniquing the T sparam? does it help if you use a variable other than T for one of the type-parameters / sparams?

sbromberger commented 9 years ago

I tried S instead of T without any effect - same error. I'll try a git bisect.

sbromberger commented 9 years ago

fecadcbd2c8fcf914a391da160dd2560981bc2f4

fecadcbd2c8fcf914a391da160dd2560981bc2f4 is the first bad commit
commit fecadcbd2c8fcf914a391da160dd2560981bc2f4
Author: Jeff Bezanson <jeff.bezanson@gmail.com>
Date:   Tue Jun 16 13:23:46 2015 -0400

    fix order of type cache insertion and UID assignment

    reinstates and hopefully fixes the problem in #11606

:040000 040000 b241a825bacfb026a114a0b41fc6bde8df5eaf90 404b49068cb7010c2c8fe8eae9aad049e8c3cf85 M  src
:040000 040000 a5be542ea69fbc8efd652e56b3c1b634dbf20eca 02089ac8ff5557630898c7188c8d1c8734ec8059 M  test
sbromberger commented 9 years ago

cc @JeffBezanson for advice :)

Also ref #11606 from the commit msg.

yuyichao commented 9 years ago

How exactly do you run the tests. I've just run the LightGraphs tests with no errors (two deprecation warnings)

In general, it's better if you could provide a full script to reproduce your error (rather than pieces of it). Would be even better if you can reduce you test cases.

sbromberger commented 9 years ago

Pkg.test("LightGraphs") will do it. You'll see the error when it gets to running .../.julia/v0.4/LightGraphs/test/maxadjvisit.jl ...

The full test script is in test/maxadjvisit.jl and the line that's causing the error is in the first comment of this issue. A reduced case is available in the google thread. I didn't want to copy everything over, but here it is:

g = DiGraph(8)
LightGraphs.MinCutVisitor(g,spzeros(Float64,8,8))
yuyichao commented 9 years ago

Pkg.test("LightGraphs") will do it.

Yeah. No errors.

sbromberger commented 9 years ago

@yuyichao On what version?

yuyichao commented 9 years ago
julia> versioninfo()
Julia Version 0.4.0-dev+5841
Commit f428392* (2015-07-07 14:58 UTC)
Platform Info:
  System: Linux (x86_64-unknown-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4702HQ CPU @ 2.20GHz
  WORD_SIZE: 64
  BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas
  LIBM: libm
  LLVM: libLLVM-3.6.1
sbromberger commented 9 years ago

It's definitely happening here (see pkg.julialang.org for confirmation, and I can confirm it as well):

Julia Version 0.4.0-dev+5846
Commit 0503f2a (2015-07-08 02:04 UTC)
Platform Info:
  System: Linux (x86_64-unknown-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Nehalem)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

Also here:

julia> using LightGraphs

julia> g = DiGraph(8)
{8, 0} directed graph

julia> LightGraphs.MinCutVisitor(g,spzeros(Float64,8,8))
ERROR: MethodError: `convert` has no method matching convert(::Type{LightGraphs.MinCutVisitor{T}}, ::LightGraphs.DiGraph, ::BitArray{1}, ::Array{Int64,1}, ::Float64, ::Float64, ::Int64, ::Base.SparseMatrix.SparseMatrixCSC{Float64,Int64}, ::Array{Int64,1})
This may have arisen from a call to the constructor LightGraphs.MinCutVisitor{T}(...),
since type constructors fall back to convert methods.
Closest candidates are:
  LightGraphs.MinCutVisitor{T}(::Union{LightGraphs.DiGraph,LightGraphs.Graph}, ::AbstractArray{Bool,1}, ::Array{Int64,1}, ::T, ::T, ::Integer, ::AbstractArray{T,2}, ::Array{Int64,1})
  LightGraphs.MinCutVisitor{T}(::Union{LightGraphs.DiGraph,LightGraphs.Graph}, ::AbstractArray{T,2})
  call{T}(::Type{T}, ::Any)
  ...
 in call at /Users/seth/.julia/v0.4/LightGraphs/src/maxadjvisit.jl:100

julia> versioninfo()
Julia Version 0.4.0-dev+5413
Commit fecadcb (2015-06-16 17:23 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin14.4.0)
  CPU: Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
sbromberger commented 9 years ago

Just compiled the latest, same issue:

julia> LightGraphs.MinCutVisitor(g,spzeros(Float64,8,8))
ERROR: MethodError: `convert` has no method matching convert(::Type{LightGraphs.MinCutVisitor{T}}, ::LightGraphs.DiGraph, ::BitArray{1}, ::Array{Int64,1}, ::Float64, ::Float64, ::Int64, ::Base.SparseMatrix.SparseMatrixCSC{Float64,Int64}, ::Array{Int64,1})
This may have arisen from a call to the constructor LightGraphs.MinCutVisitor{T}(...),
since type constructors fall back to convert methods.
Closest candidates are:
  LightGraphs.MinCutVisitor{T}(::Union{LightGraphs.DiGraph,LightGraphs.Graph}, ::AbstractArray{Bool,1}, ::Array{Int64,1}, ::T, ::T, ::Integer, ::AbstractArray{T,2}, ::Array{Int64,1})
  LightGraphs.MinCutVisitor{T}(::Union{LightGraphs.DiGraph,LightGraphs.Graph}, ::AbstractArray{T,2})
  call{T}(::Type{T}, ::Any)
  ...
 in call at /Users/seth/.julia/v0.4/LightGraphs/src/maxadjvisit.jl:100

julia> versioninfo()
Julia Version 0.4.0-dev+5860
Commit 7fa43ed (2015-07-08 20:57 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin14.4.0)
  CPU: Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
sbromberger commented 9 years ago

Appears to be an issue with sparse matrices? because

julia> LightGraphs.MinCutVisitor(g,zeros(Float64,8,8))  # note zeros() instead of spzeros()
LightGraphs.MinCutVisitor{Float64}({8, 0} directed graph,Bool[false,false,false,false,false,false,false,false],[0,0,0,0,0,0,0,0],Inf,0.0,0,8x8 Array{Float64,2}:
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0,Int64[])

works.

sbromberger commented 9 years ago

also cc @IainNZ since this originally came from pkg.julialang.org's test results :)

IainNZ commented 9 years ago

I reproduced this with LightGraphs v0.1.15 on OSX build Version 0.4.0-dev+5824 (2015-07-07 03:33 UTC), Commit eb58c97* (1 day old master).

yuyichao commented 9 years ago

So somehow this is OSX only?

IainNZ commented 9 years ago

Nope, it happened on the Ubuntu VM that PkgEval runs in (which is on an Arch host, as if that'd matter). So its definitely reproducible and platform independent (to an extent).

IainNZ commented 9 years ago

One commonality between the PkgEval machine/env and my machine is that they are not source builds by me, they're coming from the magical buildbots (PkgEval is generic linux binaries, I'm using whatever homebrew gives out).

yuyichao commented 9 years ago

Inspired by https://github.com/JuliaLang/julia/issues/12059 and @vtjnash 's https://github.com/JuliaLang/julia/issues/12063#issuecomment-119666130 can you try if --inline=no makes a difference

sbromberger commented 9 years ago

running julia with --inline=no did not make a difference. Same error.

sbromberger commented 9 years ago

@IainNZ my two julia instances are source builds. Version 0.4.0-dev+5860 (2015-07-08 20:57 UTC) Commit 7fa43ed (0 days old master) - which fails, and Version 0.4.0-dev+5008 (2015-05-26 16:08 UTC) Commit 0855ec9 (43 days old master), which works, were both built from source.

sbromberger commented 9 years ago

In the event this turns out to be a local code issue, the changes to the files in question around that date are here: https://github.com/JuliaGraphs/LightGraphs.jl/commit/a2eee8a5b57cb963ee63db96e180bca91e1ace6a

sbromberger commented 9 years ago

@yuyichao could the difference in llvm versions be causing a problem here?

yuyichao commented 9 years ago

I was hoping that it shouldn't but I can give it a try... (just didn't want to spend an hour compiling llvm if someone else could figure out the issue otherwise ;P )

sbromberger commented 9 years ago

@yuyichao if you wouldn't mind trying it and posting your results, I'd be grateful. Thanks.

yuyichao commented 9 years ago

Hmm. Somehow I couldn't reproduce with

Julia Version 0.4.0-dev+5860
Commit 7fa43ed* (2015-07-08 20:57 UTC)

but I can now reproduce it with the latest master...

Julia Version 0.4.0-dev+5923
Commit f5c46f8* (2015-07-11 04:03 UTC)
yuyichao commented 9 years ago

Reduced the repro to

type M{T}
    graph
    parities::AbstractArray{Bool,1}
    colormap::Vector{Int}
    bestweight::T
    cutweight::T
    visited::Integer
    distmx::AbstractArray{T, 2}
    vertices::Vector{Int}
end

function f(n, g, eweights)
    parities = falses(n)
    M(g,
      parities,
      zeros(Int, n),
      typemax(Float64),
      zero(Float64),
      zero(Int),
      eweights,
      Int[])
end

n = 8
g = []

eweights = spzeros(n, n)
f(n, g, eweights)

Independent of LightGraphs.jl

This works for me on 7fa43ed but not on f5c46f8

sbromberger commented 9 years ago

I'm glad to hear I'm not crazy (or perhaps we all are, but even that's better than being the only one).

yuyichao commented 9 years ago
type M{T}
    graph
    parities
    colormap
    bestweight::T
    cutweight::T
    visited
    distmx::AbstractArray{T, 2}
    vertices
end

function f(eweights, vertices)
    M(0,
      0,
      0,
      0.0,
      0.0,
      0,
      eweights,
      vertices)
end

eweights = spzeros(0, 0)
vertices = Int[]
# vertices = [] # Replace the previous line with this one suppressed the error
println(@code_typed f(eweights, vertices))
# println()
# println(@which M(0, 0, 0, 0.0, 0.0, 0, eweights, vertices))

f(eweights, vertices)

Somehow type inference decide that the fallback method of call is the one it should use. Sounds similar to https://github.com/JuliaLang/julia/issues/12092 ?

sbromberger commented 9 years ago

So if I'm reading correctly, this may be fixed as of yesterday? (Pulling a new version and building now.)

yuyichao commented 9 years ago

For me it happens only after #12092 is fixed ....

sbromberger commented 9 years ago

that's so weird, since we have documented proof that it was broken before #12092 (see pkg.julialang.org test history for LightGraphs.jl).

sbromberger commented 9 years ago

Still failing for me with Version 0.4.0-dev+5923 (2015-07-11 04:03 UTC) Commit f5c46f8 (0 days old master)

yuyichao commented 9 years ago
type M
end

f2{T}(::Type{T}, args...) = 2
f2{T}(::Type{M}, g, p, c, b::T, cu::T, v, d::AbstractArray{T, 2}, ve) = 1

function f(eweights, vertices)
    f2(M,
       0,
       0,
       0,
       0.0,
       0.0,
       0,
       eweights,
       vertices)
end

eweights = spzeros(0, 0)
vertices = Int[]

println(f2(M, 0, 0, 0, 0.0, 0.0, 0, eweights, vertices))
println()
println(methods(f2))
println()

@assert f(eweights, vertices) == 1

Output:

1

# 2 methods for generic function "f2":
f2{T}(::Type{M}, g, p, c, b::T, cu::T, v, d::AbstractArray{T,2}, ve) at /home/yuyichao/projects/contrib/LightGraphs.jl/test/maxadjvisit.jl:7
f2{T}(::Type{T}, args...) at /home/yuyichao/projects/contrib/LightGraphs.jl/test/maxadjvisit.jl:6

ERROR: LoadError: AssertionError: f(eweights,vertices) == 1
 in include_from_node1 at ./loading.jl:133
while loading /home/yuyichao/projects/contrib/LightGraphs.jl/test/maxadjvisit.jl, in expression starting on line 29
sbromberger commented 9 years ago

@yuyichao that fails for me in 0-day old master, but works in my 45-day old master (consistent with previous tests).

yuyichao commented 9 years ago

Yeah. Just tries to get rid of call since there's too many methods.

@sbromberger Can you check if the version I said is working for me also works for you? https://github.com/JuliaLang/julia/issues/12063#issuecomment-120579606

Might help to bisect (if Jeff is not faster).

Probably cc. @JeffBezanson .

sbromberger commented 9 years ago

@yuyichao I know it fails with Julia Version 0.4.0-dev+5923 Commit f5c46f8* (2015-07-11 04:03 UTC) - let me build Julia Version 0.4.0-dev+5860 Commit 7fa43ed* (2015-07-08 20:57 UTC).

sbromberger commented 9 years ago

Just to be clear - I don't have to make clean or anything after I switch branches / revs, right? A "make" should do what I need?

yuyichao commented 9 years ago

Just to be clear - I don't have to make clean or anything after I switch branches / revs, right? A "make" should do what I need?

Shouldn't need to.

sbromberger commented 9 years ago

Version 0.4.0-dev+5860 (2015-07-08 20:57 UTC) Commit 7fa43ed (2 days old master)

julia> println(f2(M, 0, 0, 0, 0.0, 0.0, 0, eweights, vertices))
1

julia> println()

julia> println(methods(f2))
# 2 methods for generic function "f2":
f2{T}(::Type{M}, g, p, c, b::T, cu::T, v, d::AbstractArray{T,2}, ve) at none:1
f2{T}(::Type{T}, args...) at none:1

julia> println()

julia> @assert f(eweights, vertices) == 1
ERROR: AssertionError: f(eweights,vertices) == 1
sbromberger commented 9 years ago

PS: your assertion error is strange. Why is it asserting on LightGraphs when you're not using it?

yuyichao commented 9 years ago

--inline=no suppresses the error for my last example for me....

yuyichao commented 9 years ago

PS: your assertion error is strange. Why is it asserting on LightGraphs when you're not using it?

I'm not asserting on LightGraphs?

sbromberger commented 9 years ago

--inline=no suppresses for me as well on Version 0.4.0-dev+5860 (2015-07-08 20:57 UTC) Commit 7fa43ed (2 days old master)

and your assertion error above:

ERROR: LoadError: AssertionError: f(eweights,vertices) == 1
 in include_from_node1 at ./loading.jl:133
while loading /home/yuyichao/projects/contrib/LightGraphs.jl/test/maxadjvisit.jl, in expression starting on line 29

(Oh, your test code is in maxadjvisit.jl, I guess. I've been copying and pasting into the REPL.)

yuyichao commented 9 years ago

If you are talking about the path.

Well, I'm just editting the tests inplace to reduce it...

yuyichao commented 9 years ago

And what I pasted above is the content of my LightGraphs.jl/test/maxadjvisit.jl. Don't worry, you won't see it in a PR. =)

sbromberger commented 9 years ago

So, can we say with some degree of assurance that

1) the problem is not LightGraphs code, and 2) we've narrowed down the possible causes?

I'm a bit lost at this point. How can I help?

sbromberger commented 9 years ago

Also, see my bisect results above: https://github.com/JuliaLang/julia/issues/12063#issuecomment-119745241

yuyichao commented 9 years ago
  1. I think the bug is not in LightGraphs.
  2. I don't think I understand the cause but the repro I had above should be small enough so that ppl know more about type inference can try to reproduce the issue and fix it.
  3. I still don't really understand I cannot reproduce it before today.