Closed BuddhiLW closed 2 years ago
I was unable to run the tests locally because I could not use the R binary packages. However, using a dev checkout of the package, which let me keep DataFrames.jl v1.2.2, I ran into the following issue:
julia> smote(df[!,[:Age, :CreditScore],], df.Exited)
ERROR: ArgumentError: Cannot assign to non-existent column: 1
Stacktrace:
[1] insert_single_column!(df::DataFrame, v::Vector{Float64}, col_ind::Int64)
@ DataFrames ~/.julia/packages/DataFrames/vuMM8/src/dataframe/dataframe.jl:611
[2] setindex!
@ ~/.julia/packages/DataFrames/vuMM8/src/dataframe/dataframe.jl:628 [inlined]
[3] setindex!(df::DataFrame, v::Vector{Float64}, row_inds::Colon, col_ind::Int64)
@ DataFrames ~/.julia/packages/DataFrames/vuMM8/src/dataframe/dataframe.jl:679
[4] matrix_to_dataframe(X_new::Matrix{Float64}, dat::DataFrame, factor_indcs::Vector{Int64})
@ ClassImbalance ~/PP/MonitoriaEstatistica/Regression/dev/ClassImbalance/src/smote_exs.jl:28
[5] smote_obs(dat::DataFrame, pct::Int64, k::Int64, column_names::Vector{String})
@ ClassImbalance ~/PP/MonitoriaEstatistica/Regression/dev/ClassImbalance/src/smote_exs.jl:123
[6] _smote(X::DataFrame, y::Vector{Int64}, k::Int64, pct_over::Int64, pct_under::Int64)
@ ClassImbalance ~/PP/MonitoriaEstatistica/Regression/dev/ClassImbalance/src/ub_smote.jl:38
[7] #smote#10
@ ~/PP/MonitoriaEstatistica/Regression/dev/ClassImbalance/src/ub_smote.jl:89 [inlined]
[8] smote(X::DataFrame, y::Vector{Int64})
@ ClassImbalance ~/PP/MonitoriaEstatistica/Regression/dev/ClassImbalance/src/ub_smote.jl:89
[9] top-level scope
@ REPL[78]:1
I'm following the guide to logistic regression: https://www.machinelearningplus.com/julia/logistic-regression-in-julia-practical-guide-with-examples/
I couldn't reproduce the smote command; the rest is working.
Yeah, I suspect that ClassImbalance won't work in newer versions of DataFrames.
To be honest, the whole package needs a rewrite from the ground up. I just haven't had the time yet.
Do you know of any alternative, in the meantime?
I see that you are part of innumerable projects. I can't even imagine what your life must be like. I participate in three projects and I barely have a social life.
Take your time... Anyhow, very few people know of and are willing to treat Class Imbalance lol (although, I'm interested).
I found the error:
In this function,
function matrix_to_dataframe(X_new::Array{Float64, 2}, dat::DataFrames.DataFrame, factor_indcs::Array{Int, 1})
    X_synth = DataFrames.DataFrame()
    p = size(X_new, 2)
    for j = 1:p
        if j ∈ factor_indcs
            X_synth[:, j] = float_to_factor(X_new[:, j],
                                            DataFrames.levels(dat[:, j]))
        else
            X_synth[:, j] = X_new[:, j]
        end
    end
    X_synth
end
We have (dat has been defined as dat = DataFrames.DataFrame()):
julia> float_to_factor(X_new[:,1],DataFrames.levels(dat[:,1]))
ERROR: BoundsError: attempt to access data frame with 0 columns at index [1]
Stacktrace:
[1] getindex
@ ~/.julia/packages/DataFrames/vuMM8/src/other/index.jl:183 [inlined]
[2] getindex(df::DataFrame, row_inds::Colon, col_ind::Int64)
@ DataFrames ~/.julia/packages/DataFrames/vuMM8/src/dataframe/dataframe.jl:499
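For reference, the original "Cannot assign to non-existent column: 1" error appears to come from the X_synth[:, j] = ... assignments: X_synth starts out with zero columns, and DataFrames.jl v1.2.2 refuses to create a new column through an integer index. A minimal sketch of one possible workaround is to create the columns by name instead. The function name matrix_to_dataframe_sketch and the sample values are hypothetical, the factor handling (float_to_factor) is left out, and this is not necessarily the change the package ended up with:

```julia
using DataFrames

# Hypothetical variant of matrix_to_dataframe (not the package's code).
# It adds each column by name, which works on an empty DataFrame,
# instead of by integer index, which throws in DataFrames.jl 1.x.
# Assumption: the columns of X_new are in the same order as those of dat.
function matrix_to_dataframe_sketch(X_new::Matrix{Float64}, dat::DataFrame)
    X_synth = DataFrame()
    for (j, name) in enumerate(names(dat))
        # `!` with a column name creates the column if it does not exist yet
        X_synth[!, name] = X_new[:, j]
    end
    return X_synth
end

# Made-up data, reusing the column names from the smote call above
dat = DataFrame(Age = [42.0, 41.0], CreditScore = [619.0, 608.0])
X_new = [40.0 600.0; 43.0 610.0; 41.5 605.0]
X_synth = matrix_to_dataframe_sketch(X_new, dat)
```

The result keeps the column names of dat, which the index-based version would have lost even on older DataFrames versions.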
Further insight, but irrelevant for now: It happens when,
matrix_to_dataframe(ones(3,3), DataFrames.DataFrame(), [1;2;3])
ERROR: TypeError: non-boolean (Int64) used in boolean context
Stacktrace:
[1] matrix_to_dataframe(X_new::Matrix{Float64}, dat::DataFrame, factor_indcs::Vector{Int64})
@ Main ./REPL[58]:5
[2] top-level scope
@ REPL[93]:1
In turn, matrix_to_dataframe is called by smote_obs, which is called by _smote.
I could use smote locally, with further changes. Can you try to run the tests?
Thanks for the attention, @DilumAluthge
bors try