JuliaML / MLDataUtils.jl

Utility package for generating, loading, splitting, and processing Machine Learning datasets
http://mldatautilsjl.readthedocs.io/

Invalid columns error #43

Open cpfiffer opened 5 years ago

cpfiffer commented 5 years ago

I'm not sure whether this is a DataFrames error or an MLDataUtils error. Running the MWE below:

using MLDataUtils
using RDatasets

data = RDatasets.dataset("ISLR", "Default");

train, test = MLDataUtils.splitobs(data, at = 0.05);

Gives the error:

BoundsError: attempt to access "invalid columns 1:500 selected"

Stacktrace:
 [1] Type at /home/cameron/.julia/packages/DataFrames/lyCjP/src/other/index.jl:252 [inlined]
 [2] Type at /home/cameron/.julia/packages/DataFrames/lyCjP/src/subdataframe/subdataframe.jl:41 [inlined]
 [3] Type at /home/cameron/.julia/packages/DataFrames/lyCjP/src/subdataframe/subdataframe.jl:43 [inlined]
 [4] view at /home/cameron/.julia/packages/DataFrames/lyCjP/src/subdataframe/subdataframe.jl:87 [inlined]
 [5] view at /home/cameron/.julia/packages/DataFrames/lyCjP/src/subdataframe/subdataframe.jl:82 [inlined]
 [6] datasubset(::DataFrame, ::UnitRange{Int64}, ::LearnBase.ObsDim.Undefined) at /home/cameron/.julia/packages/MLDataUtils/Onazx/src/datapattern.jl:10
 [7] splitobs at /home/cameron/.julia/packages/MLDataPattern/mX21p/src/splitobs.jl:121 [inlined]
 [8] #splitobs#59 at /home/cameron/.julia/packages/MLDataPattern/mX21p/src/splitobs.jl:113 [inlined]
 [9] (::getfield(MLDataPattern, Symbol("#kw##splitobs")))(::NamedTuple{(:at,),Tuple{Float64}}, ::typeof(splitobs), ::DataFrame) at ./none:0
 [10] top-level scope at In[23]:5
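
A possible workaround, sketched under an assumption: frames [4] to [6] of the trace suggest that `datasubset` calls `view` on the `DataFrame` with only the range `1:500`, which this DataFrames version appears to read as a column selection rather than a row selection. If that reading is right, splitting by explicit row indices sidesteps the call. The small table, its column names, and the rounding rule for the split point are hypothetical stand-ins, not the ISLR data and not necessarily how `splitobs` computes its boundary.

```julia
using DataFrames

# Hypothetical stand-in for the ISLR "Default" table (100 rows, made-up columns).
data = DataFrame(balance = collect(1.0:100.0),
                 default = repeat(["No", "Yes"], 50))

at = 0.05  # same fraction as in the MWE

# Assumed rounding rule for the split point; splitobs may differ.
n_train = round(Int, at * nrow(data))

# Index rows explicitly, with `:` for all columns, so the range
# cannot be mistaken for a column selection.
train = data[1:n_train, :]
test  = data[(n_train + 1):nrow(data), :]

println(size(train))
println(size(test))
```

Unlike the views that `splitobs` normally returns, this row indexing copies the data, which should be harmless for a table of this size.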

Package status:

  [875e7ca2] AltDistributions v0.2.0
  [c52e3926] Atom v0.7.14
  [76274a88] Bijectors v0.2.6
  [336ed68f] CSV v0.4.3
  [5ae59095] Colors v0.9.5
  [a3dee88c] DandelionWebSockets v0.2.0
  [a93c6f00] DataFrames v0.17.0
  [9a8bc11e] DataStreams v0.4.1
  [b552c78f] DiffRules v0.0.7
  [31c24e10] Distributions v0.16.4
  [e30172f5] Documenter v0.21.0
  [997ab1e6] DocumenterMarkdown v0.2.0
  [bbc10e6e] DynamicHMC v1.0.2
  [e25cca7e] FDM v0.3.0
  [587475ba] Flux v0.7.1
  [f6369f11] ForwardDiff v0.10.2
  [38e38edf] GLM v1.0.2
  [28b8d3ca] GR v0.37.0
  [c91e804a] Gadfly v1.0.1
  [f67ccb44] HDF5 v0.11.0
  [cd3eb016] HTTP v0.7.1
  [7073ff75] IJulia v1.15.2
  [c601a237] Interact v0.9.0
  [4138dd39] JLD v0.9.1
  [682c06a0] JSON v0.20.0
  [e5e0dc1b] Juno v0.5.4
  [b964fa9f] LaTeXStrings v1.0.3
  [194296ae] LibPQ v0.6.1
  [6f1fad26] Libtask v0.2.1+ [`~/.julia/dev/Libtask`]
  [98b081ad] Literate v1.0.2
  [6fdf6af0] LogDensityProblems v0.5.1
  [1671dc4f] MCMCChain v0.2.2+ [`~/.julia/dev/MCMCChain`]
  [cc2ba9b6] MLDataUtils v0.4.0
  [442fdcdd] Measures v0.3.0
  [73a701b4] NamedTuples v5.0.0
  [47be7bcc] ORCA v0.2.0
  [429524aa] Optim v0.17.2
  [90014a1f] PDMats v0.9.6
  [ccf2f8ad] PlotThemes v0.3.0
  [f0f68f2c] PlotlyJS v0.12.3
  [91a5bcdd] Plots v0.22.5
  [c46f51b8] ProfileView v0.4.0
  [438e738f] PyCall v1.18.5
  [d330b81b] PyPlot v2.7.0
  [ce6b1742] RDatasets v0.6.1
  [0376cc21] Reinforce v0.2.0
  [682df890] Stan v4.0.0
  [60ddc479] StatPlots v0.9.2
  [4c63d2b9] StatsFuns v0.7.0
  [84d833dd] TransformVariables v0.2.0+ #master (https://github.com/tpapp/TransformVariables.jl)
  [fce5fe82] Turing v0.6.6+ [`~/.julia/dev/Turing`]
  [44d3d7a6] Weave v0.6.2
  [104b5d7c] WebSockets v1.2.0
  [c2297ded] ZMQ v1.0.0