pluskid / Mocha.jl

Deep Learning framework for Julia

Loading a snapshot fails with MethodError #202

Closed davidparks21 closed 8 years ago

davidparks21 commented 8 years ago

I'm trying to load a snapshot and continue training from that point. I ran a training of 600 iterations successfully, then tried running the same network setup again. The logs indicate that it found the correct snapshot, but loading fails with an error on haskey.

I've been looking into it for the past hour, but it's not clear to me whether the problem is with JLD or with something Mocha's code does before that point.

The warning text that precedes the error, "not present in workspace; reconstructing", is found in jld_types.jl, and seems to suggest that the datatype in the JLD file "isn't supported", though I'm not sure what that means.

Does the error make more immediate sense to you?

29-May 21:10:07:INFO:root:Network constructed!
29-May 21:10:07:DEBUG:root:#DEBUG Checking network topology for back-propagation
29-May 21:10:08:INFO:root:Loading existing model from mocha-net_basic_3L-dropout-160529_2019-gpu\snapshot-000600.jld
29-May 21:10:08:WARNING:root:type JLD.AssociativeWrapper{Core.AbstractString,Core.Array{Core.Array{TypeVar(:T),TypeVar(:N)},1},Base.Dict{Core.AbstractString,Core.Array{Core.Array{TypeVar(:T),TypeVar(:N)},1}}} not present in workspace; reconstructing
ERROR: LoadError: MethodError: `haskey` has no method matching haskey(::JLD.##JLD.AssociativeWrapper{Core.AbstractString,Core.Array{Core.Array{TypeVar(:T),TypeVar(:N)},1},Base.Dict{Core.AbstractString,Core.Array{Core.Array{TypeVar(:T),TypeVar(:N)},1}}}#14976, ::ASCIIString)
Closest candidates are:
  haskey(::Dict{K,V}, ::Any)
  haskey{K}(::WeakKeyDict{K,V}, ::Any)
  haskey(::Base.Collections.PriorityQueue{K,V,O<:Base.Order.Ordering}, ::Any)
  ...
 in load_network at d:\myprojects\julia\.julia\v0.4\Mocha\src\utils/io.jl:93
 in anonymous at d:\myprojects\julia\.julia\v0.4\Mocha\src\solvers.jl:158
 in jldopen at d:\myprojects\julia\.julia\v0.4\JLD\src\JLD.jl:256
 in load_snapshot at d:\myprojects\julia\.julia\v0.4\Mocha\src\solvers.jl:157
 in init_solve at d:\myprojects\julia\.julia\v0.4\Mocha\src\solvers.jl:184
 in solve at d:\myprojects\julia\.julia\v0.4\Mocha\src\solvers.jl:234
 in include at boot.jl:261
 in include_from_node1 at loading.jl:320
 in require at loading.jl:259
 in reload at loading.jl:211
while loading D:\MyProjects\mocha\network_definitions\net_basic_3L_Dropout.jl, in expression starting on line 74
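The warning and the MethodError in the log above fit one possible failure mode: a value that was a Dict when saved comes back as a freshly reconstructed type that has no haskey method. The thread never confirms this as the cause, so the following is only a minimal sketch of that dispatch behaviour. It uses current Julia syntax rather than v0.4, and the type ReconstructedWrapper and the key "conv1" are hypothetical names, not taken from JLD or Mocha.

```julia
# Hypothetical stand-in for a type rebuilt from file metadata.
# It holds the same data as a Dict but is NOT a subtype of AbstractDict.
struct ReconstructedWrapper
    keys::Vector{String}
    values::Vector{Any}
end

# "conv1" is an illustrative key, not a real Mocha layer name.
wrapper = ReconstructedWrapper(["conv1"], Any[[1.0, 2.0]])

# haskey has no method for an arbitrary struct, so dispatch fails.
err = try
    haskey(wrapper, "conv1")
    nothing
catch e
    e
end
println(typeof(err))

# The same data held in a real Dict supports haskey as expected.
params = Dict(zip(wrapper.keys, wrapper.values))
println(haskey(params, "conv1"))
```

If this is what happened, the lookup fails because of the type of the loaded object, not because the key is missing from the file.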

Update (31 May): This worked fine on my Linux cluster, and I noticed it's using memory-mapped IO in JLD. I recall a warning in the docs saying that memory-mapped IO wasn't working on Windows, so that's my best guess right now. I'll test further on Windows and update this later, but at least it doesn't look like a systemic bug.

pluskid commented 8 years ago

Can you try to run the provided MNIST example? I ran it locally and it could save and load a snapshot without issue, so I could not reproduce your error. I am not sure what that error means. Could it be that you are using a different version of JLD that does not support haskey?

davidparks21 commented 8 years ago

I'm testing it further. I wasn't able to reproduce it on Linux, so I'm going to retry on Windows.

davidparks21 commented 8 years ago

I haven't been able to reproduce this, and I've tested it on both Windows and Linux, so I probably had some other error that I wasn't correlating correctly. Closing it.