JuliaData / Feather.jl

Read and write feather files in pure Julia
https://juliadata.github.io/Feather.jl/stable

`InexactError` when reading overwritten files; other bizarre behavior #26

Closed ExpandingMan closed 6 years ago

ExpandingMan commented 8 years ago

Some rather strange things are currently happening when reading from feather files that have been overwritten. Unfortunately, I am having a very hard time reproducing the errors consistently, but a few times I got an InexactError, and I had a bizarre issue where a DateTime value was wrong (but only when read from a file that had been overwritten).

Is Feather.write supposed to be doing some sort of updating? If not, wouldn't it make sense to force the file to be deleted entirely before it is written again? Alternatively, I suppose this may be a bug in DataStreams.

I'll post back here when I am able to consistently reproduce one of these errors.

quinnj commented 8 years ago

Do let me know what you find, but Feather.write should, as you mentioned, be completely overwriting an existing file. Are you going back and forth between Julia and Python? I know there is currently an open bug in Python feather's datetime handling (it can't handle non-nanosecond datetime values).

ExpandingMan commented 8 years ago

No, this was not a Python issue. Perhaps it would be safer to run something like isfile(filename) && rm(filename) before the file is written. It may be more expensive, but it would be safer.
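The delete-before-write idea can be sketched as a small wrapper. This is only an illustration under assumptions: `write_fresh` is a hypothetical helper and not part of Feather.jl, it assumes a recent Julia where `rm` accepts `force=true`, and the stand-in writer below uses Base `write` so the snippet runs without any packages.

```julia
# Hypothetical helper (not part of Feather.jl): remove any existing file
# at `filename`, then hand the path to a caller-supplied writer function.
function write_fresh(writer::Function, filename::AbstractString)
    # With force=true, rm does nothing if the file does not exist,
    # so this covers the isfile(filename) && rm(filename) pattern.
    rm(filename; force=true)
    writer(filename)
    return filename
end

# Stand-in usage: overwrite a longer file with a shorter one.
path = joinpath(mktempdir(), "example.bin")
write_fresh(p -> write(p, "first, longer contents"), path)
write_fresh(p -> write(p, "second"), path)
contents = read(path, String)
```

With Feather.jl the writer would presumably be something like `p -> Feather.write(p, df)`. Whether deleting first would actually avoid the errors reported here is not established in this thread.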

ExpandingMan commented 8 years ago

Ok, whatever is happening might have something to do with changing the version of Feather.jl. I recently upgraded to 0.2.1, then overwrote and re-read a file. When I then tried to view the dataframe, I got the following segfault:

signal (11): Segmentation fault
while loading no file, in expression starting on line 0
show at /home/savastio/.julia/v0.5/WeakRefStrings/src/WeakRefStrings.jl:36
show at ./nullable.jl:35
showcompact at ./show.jl:1642 [inlined]
ourshowcompact at /home/savastio/.julia/v0.5/DataFrames/src/abstractdataframe/show.jl:64
ourstrwidth at /home/savastio/.julia/v0.5/DataFrames/src/abstractdataframe/show.jl:37
unknown function (ip: 0x7fbf81bbae82)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
getmaxwidths at /home/savastio/.julia/v0.5/DataFrames/src/abstractdataframe/show.jl:114
unknown function (ip: 0x7fbf81bb8182)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
show at /home/savastio/.julia/v0.5/DataFrames/src/abstractdataframe/show.jl:448
unknown function (ip: 0x7fbf81bb6bed)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
display at ./REPL.jl:132
unknown function (ip: 0x7fbf81bb6496)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
display at ./REPL.jl:135
unknown function (ip: 0x7fbf81bb6176)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
display at ./multimedia.jl:143
unknown function (ip: 0x7fbf81bb5fb2)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
print_response at ./REPL.jl:154
unknown function (ip: 0x7fbf81bb5a28)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
print_response at ./REPL.jl:139
unknown function (ip: 0x7fbf81bb54a8)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
#22 at ./REPL.jl:652
unknown function (ip: 0x7fbf81b55491)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
#22 at ./REPL.jl:652
unknown function (ip: 0x7fbf81b55491)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
run_interface at ./LineEdit.jl:1579
unknown function (ip: 0x7fc1c333f52f)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
run_frontend at ./REPL.jl:903
run_repl at ./REPL.jl:188
unknown function (ip: 0x7fbf81b50712)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
_start at ./client.jl:360
unknown function (ip: 0x7fc1c335a708)
jl_call_method_internal at /build/julia-0ecDfF/julia-0.5.0/src/julia_internal.h:189 [inlined]
jl_apply_generic at /build/julia-0ecDfF/julia-0.5.0/src/gf.c:1942
unknown function (ip: 0x40185c)
unknown function (ip: 0x4012f6)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x401348)
Allocations: 32132396 (Pool: 32131200; Big: 1196); GC: 30
[1]    126280 segmentation fault (core dumped)  julia

Once I deleted the file, everything was fine and I could no longer reproduce the error. Note that this is a completely different error from the one I got originally (it was not a segfault before), so it may be unrelated to the original problem.

ExpandingMan commented 8 years ago

This just seems to get more and more confusing. Sometimes when reading an overwritten file I get this:

ERROR: invalid Array dimensions
 in string(::WeakRefStrings.WeakRefString{UInt8}) at /home/savastio/.julia/v0.5/WeakRefStrings/src/WeakRefStrings.jl:66
 in convert at /home/savastio/.julia/v0.5/WeakRefStrings/src/WeakRefStrings.jl:65 [inlined]
 in setindex! at ./array.jl:415 [inlined]
 in copy!(::Base.LinearFast, ::Array{String,1}, ::Base.LinearFast, ::Array{WeakRefString{UInt8},1}) at ./abstractarray.jl:559
 in convert(::Type{Array{String,1}}, ::NullableArrays.NullableArray{WeakRefString{UInt8},1}) at /home/savastio/.julia/v0.5/NullableArrays/src/primitives.jl:250
 in convert(::Type{Array{String,N}}, ::NullableArrays.NullableArray{WeakRefString{UInt8},1}) at /home/savastio/.julia/v0.5/NullableArrays/src/primitives.jl:256
 in featherRead(::String) at /home/savastio/.julia/v0.5/DatasToolbox/src/dfutils.jl:515

At this point it seems that there are several unrelated problems. All I know is that I have never gotten any errors when reading from a freshly written file. Frustratingly, I have not been able to reproduce any of these consistently.

By the way, featherRead is just a wrapper function I wrote that converts the dataframe columns to the NullableArray type and converts all WeakRefStrings to Strings.

quinnj commented 8 years ago

@ExpandingMan, I think you're running into https://github.com/wesm/feather/issues/249

ExpandingMan commented 8 years ago

It seems likely. I'll watch for their update and let you know if I still see the issue. Thanks.

ExpandingMan commented 6 years ago

It seems pretty safe to say that whatever this issue was, it's gone in current master.