JuliaDatabases / ODBC.jl

An ODBC interface for the Julia programming language
https://odbc.juliadatabases.org/stable

BoundsError when loading >256 character varchar fields from Amazon Athena #240

MegaByte closed 4 years ago

MegaByte commented 5 years ago

Similar to #226, but not fixed for Amazon Athena connections. Selecting from a varchar field throws the following error when the returned value is longer than 256 characters:

BoundsError: attempt to access 256-element Array{UInt8,1} at index [1:259]

Stacktrace:
 [1] throw_boundserror(::Array{UInt8,1}, ::Tuple{UnitRange{Int64}}) at ./abstractarray.jl:484
 [2] checkbounds at ./abstractarray.jl:449 [inlined]
 [3] getindex(::Array{UInt8,1}, ::UnitRange{Int64}) at ./array.jl:735
 [4] cast!(::Type{Union{Missing, String}}, ::ODBC.Query{missing,NamedTuple{(:metadata,),Tuple{Union{Missing, String}}},Tuple{Array{Union{Missing, String},1}}}, ::Int64) at /Users/aaron/.julia/packages/ODBC/P6nfr/src/Query.jl:256
 [5] (::getfield(ODBC, Symbol("##10#12")))(::Int64) at /Users/aaron/.julia/packages/ODBC/P6nfr/src/Query.jl:122
 [6] foreach(::getfield(ODBC, Symbol("##10#12")), ::UnitRange{Int64}) at ./abstractarray.jl:1866
 [7] ODBC.Query(::ODBC.DSN, ::String) at /Users/aaron/.julia/packages/ODBC/P6nfr/src/Query.jl:122
 [8] #query#15(::Bool, ::Bool, ::Dict{Int64,Function}, ::Function, ::ODBC.DSN, ::String, ::Type{DataFrame}) at /Users/aaron/.julia/packages/ODBC/P6nfr/src/Query.jl:390
 [9] query at /Users/aaron/.julia/packages/ODBC/P6nfr/src/Query.jl:385 [inlined]
 [10] query(::ODBC.DSN, ::String) at /Users/aaron/.julia/packages/ODBC/P6nfr/src/Query.jl:376
 [11] top-level scope at In[52]:1

quinnj commented 4 years ago

It would be worth checking whether this is still an issue on master after the big rewrite.

I could see this happening if the driver reported the "column size" as 256, but the value we actually fetched had a length > 256 (259 in the example you posted). That would be an annoying/bad move by the Athena ODBC driver, but we could perhaps find a workaround, such as checking that the buffer has enough room before decoding, or re-allocating it.
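For illustration, the "check that we have enough room before decoding" idea could be sketched roughly as below. This is only a minimal sketch: `safe_decode` is a hypothetical helper, not a function in ODBC.jl, and it assumes the fetched bytes sit in a `Vector{UInt8}` buffer while the driver separately reports a length that may exceed the buffer size.

```julia
# Hypothetical helper (not part of ODBC.jl): decode up to `len` bytes from
# `buf` without indexing past the end of the buffer.
function safe_decode(buf::Vector{UInt8}, len::Integer)
    # Clamp the reported length to the number of bytes the buffer can hold.
    n = min(Int(len), length(buf))
    # Record whether the driver reported more data than the buffer contains.
    truncated = len > length(buf)
    # buf[1:n] copies the bytes, so the original buffer is left untouched.
    return String(buf[1:n]), truncated
end

# Mimic the reported case: a 256-byte buffer, but a reported length of 259.
buf = fill(UInt8('a'), 256)
str, truncated = safe_decode(buf, 259)
```

Clamping avoids the `BoundsError` but silently drops the extra bytes, which is why the sketch also returns a `truncated` flag. The re-allocating alternative would need to grow the buffer and fetch the remaining data from the driver, which is not shown here.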

quinnj commented 4 years ago

So one thing I noticed while testing the Athena driver locally is that the driver's default setting is to not allow/fetch more than 256 characters for string columns. This is adjustable in the driver configuration, so I'm going to close this for now unless someone sees another issue.