Open CarloLucibello opened 2 years ago
Same issue here.
I encountered a similar problem. In my case the EOFError is due to the uncompressed file size exceeding 4 Gb.
When saving the same file from python as an uncompressed .npz the size is around 4.9 Gb. Trying to read the file like that from NPZ.jl leads to another error:
ERROR: LoadError: invalid file header
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] _getfiles(io::IOStream, diroffset::UInt32, nfiles::UInt16)
@ ZipFile ~/.julia/packages/ZipFile/evaHP/src/ZipFile.jl:309
[3] Reader
@ ~/.julia/packages/ZipFile/evaHP/src/ZipFile.jl:106 [inlined]
[4] ZipFile.Reader(filename::String)
@ ZipFile ~/.julia/packages/ZipFile/evaHP/src/ZipFile.jl:118
[5] npzread(::String)
@ NPZ ~/.julia/packages/NPZ/UizJj/src/NPZ.jl:294
Judging from this issue in ZipFiles, ZIP64 and therefore files bigger than 4Gb uncompressed aren't supported at all.
Would be nice if NPZ.jl could check this in advance before calling ZipFiles. If not, I'm just leaving this information here for other people wondering, why their file couldn't be read.
Same issue here. My filesize is 22mb
ERROR: LoadError: EOFError: read end of file
Stacktrace:
[1] _read(::ZipFile.ReadableFile, ::Array{UInt8,1}) at /u/racball/.julia/packages/ZipFile/fdYkP/src/ZipFile.jl:519
[2] read! at /u/racball/.julia/packages/ZipFile/fdYkP/src/ZipFile.jl:483 [inlined]
[3] readheader(::ZipFile.ReadableFile) at /u/racball/.julia/packages/NPZ/UizJj/src/NPZ.jl:210
[4] npzreadarray(::ZipFile.ReadableFile) at /u/racball/.julia/packages/NPZ/UizJj/src/NPZ.jl:243
[5] (::NPZ.var"#4#7")(::ZipFile.ReadableFile) at ./none:0
[6] iterate at ./generator.jl:47 [inlined]
[7] _all(::Base.var"#239#241", ::Base.Generator{Base.Iterators.Filter{NPZ.var"#5#8"{Array{String,1}},Array{ZipFile.ReadableFile,1}},NPZ.var"#4#7"}, ::Colon) at ./reduce.jl:819
[8] all at ./reduce.jl:815 [inlined]
[9] Dict(::Base.Generator{Base.Iterators.Filter{NPZ.var"#5#8"{Array{String,1}},Array{ZipFile.ReadableFile,1}},NPZ.var"#4#7"}) at ./dict.jl:130
[10] npzread at /u/racball/.julia/packages/NPZ/UizJj/src/NPZ.jl:311 [inlined] (repeats 2 times)
[11] npzread(::String) at /u/racball/.julia/packages/NPZ/UizJj/src/NPZ.jl:295
[12] top-level scope at /nfs/nfs9/home/nobackup/racball/github/graphzip/exps/20211103_juliasparse.jl:3
[13] include(::Module, ::String) at ./Base.jl:377
[14] exec_options(::Base.JLOptions) at ./client.jl:288
[15] _start() at ./client.jl:484
in expression starting at /nfs/nfs9/home/nobackup/racball/github/graphzip/exps/20211103_juliasparse.jl:3
caused by [exception 1]
parsing header failed: unsupported type S3
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] parsedtype at /u/racball/.julia/packages/NPZ/UizJj/src/NPZ.jl:157 [inlined]
[3] parseheader(::SubString{String}) at /u/racball/.julia/packages/NPZ/UizJj/src/NPZ.jl:185
[4] readheader(::ZipFile.ReadableFile) at /u/racball/.julia/packages/NPZ/UizJj/src/NPZ.jl:226
[5] npzreadarray(::ZipFile.ReadableFile) at /u/racball/.julia/packages/NPZ/UizJj/src/NPZ.jl:243
[6] (::NPZ.var"#4#7")(::ZipFile.ReadableFile) at ./none:0
[7] iterate at ./generator.jl:47 [inlined]
[8] grow_to!(::Dict{String,Array{Int32,1}}, ::Base.Generator{Base.Iterators.Filter{NPZ.var"#5#8"{Array{String,1}},Array{ZipFile.ReadableFile,1}},NPZ.var"#4#7"}, ::Int64) at ./dict.jl:161
[9] grow_to!(::Dict{Any,Any}, ::Base.Generator{Base.Iterators.Filter{NPZ.var"#5#8"{Array{String,1}},Array{ZipFile.ReadableFile,1}},NPZ.var"#4#7"}) at ./dict.jl:144
[10] dict_with_eltype at ./abstractdict.jl:549 [inlined]
[11] Dict(::Base.Generator{Base.Iterators.Filter{NPZ.var"#5#8"{Array{String,1}},Array{ZipFile.ReadableFile,1}},NPZ.var"#4#7"}) at ./dict.jl:128
[12] npzread at /u/racball/.julia/packages/NPZ/UizJj/src/NPZ.jl:311 [inlined] (repeats 2 times)
[13] npzread(::String) at /u/racball/.julia/packages/NPZ/UizJj/src/NPZ.jl:295
[14] top-level scope at /nfs/nfs9/home/nobackup/racball/github/graphzip/exps/20211103_juliasparse.jl:3
[15] include(::Module, ::String) at ./Base.jl:377
[16] exec_options(::Base.JLOptions) at ./client.jl:288
[17] _start() at ./client.jl:484
I observed the same problem, while reading a compressed npz file of 314M
ERROR: not a numpy array file
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] readheader(f::ZipFile.ReadableFile)
@ NPZ ~/.julia/packages/NPZ/UizJj/src/NPZ.jl:212
[3] npzreadarray(f::ZipFile.ReadableFile)
@ NPZ ~/.julia/packages/NPZ/UizJj/src/NPZ.jl:243
[4] (::NPZ.var"#4#7")(f::ZipFile.ReadableFile)
@ NPZ ./none:0
[5] _all(f::Base.var"#318#320", itr::Base.Generator{Base.Iterators.Filter{NPZ.var"#5#8"{Vector{String}}, Vector{ZipFile.ReadableFile}}, NPZ.var"#4#7"}, #unused#::Colon)
@ Base ./reduce.jl:0
[6] all
@ ./reduce.jl:1152 [inlined]
[7] Dict(kv::Base.Generator{Base.Iterators.Filter{NPZ.var"#5#8"{Vector{String}}, Vector{ZipFile.ReadableFile}}, NPZ.var"#4#7"})
@ Base ./dict.jl:131
[8] npzread (repeats 2 times)
@ ~/.julia/packages/NPZ/UizJj/src/NPZ.jl:311 [inlined]
[9] npzread(::String)
@ NPZ ~/.julia/packages/NPZ/UizJj/src/NPZ.jl:295
[10] top-level scope
@ REPL[3]:1
caused by: EOFError: read end of file
Stacktrace:
[1] _read(f::ZipFile.ReadableFile, a::Matrix{Int64})
@ ZipFile ~/.julia/packages/ZipFile/evaHP/src/ZipFile.jl:519
[2] read!
@ ~/.julia/packages/ZipFile/evaHP/src/ZipFile.jl:484 [inlined]
[3] _npzreadarray(f::ZipFile.ReadableFile, hdr::NPZ.Header{Int64, 2, typeof(ltoh)})
@ NPZ ~/.julia/packages/NPZ/UizJj/src/NPZ.jl:234
[4] npzreadarray(f::ZipFile.ReadableFile)
@ NPZ ~/.julia/packages/NPZ/UizJj/src/NPZ.jl:244
[5] (::NPZ.var"#4#7")(f::ZipFile.ReadableFile)
@ NPZ ./none:0
[6] iterate
@ ./generator.jl:47 [inlined]
[7] grow_to!(dest::Dict{Any, Any}, itr::Base.Generator{Base.Iterators.Filter{NPZ.var"#5#8"{Vector{String}}, Vector{ZipFile.ReadableFile}}, NPZ.var"#4#7"})
@ Base ./dict.jl:140
[8] dict_with_eltype
@ ./abstractdict.jl:547 [inlined]
[9] Dict(kv::Base.Generator{Base.Iterators.Filter{NPZ.var"#5#8"{Vector{String}}, Vector{ZipFile.ReadableFile}}, NPZ.var"#4#7"})
@ Base ./dict.jl:129
[10] npzread (repeats 2 times)
@ ~/.julia/packages/NPZ/UizJj/src/NPZ.jl:311 [inlined]
[11] npzread(::String)
@ NPZ ~/.julia/packages/NPZ/UizJj/src/NPZ.jl:295
[12] top-level scope
@ REPL[3]:1
So is there any solution?
Here a reproducible example
The downloaded file is correctly read from python