dmlc / XGBoost.jl

XGBoost Julia Package

Documentation up to date? #82

Closed Sinansi closed 4 years ago

Sinansi commented 4 years ago

Hello

Is the documentation up to date? I can't get even a single example to work.

num_round = 2
bst = xgboost(train_X, num_round, label = train_Y, eta = 1, max_depth = 2)

This returns an error: got unsupported keyword argument "label"

When I remove the label keyword, I get another error:

ERROR: MethodError: no method matching xgboost(::Array{Float64,1}, ::Int64, ::Array{Int64,1}; eta=1, max_depth=2)

Are Julia arrays compatible with XGBoost.jl? For a Boolean target, should I convert it to integers?

I use Julia 1.2 and "XGBoost" => v"0.4.2". Thank you!

Sinansi commented 4 years ago

train_X, train_Y = readlibsvm("data/agaricus.txt.train", (6513, 126))

ERROR: UndefVarError: readlibsvm not defined

dpk1729 commented 4 years ago

Can you see if this works for you?

using XGBoost
dtrain = DMatrix("~/.julia/packages/.../data/agaricus.txt.train")
num_round = 2
bst = xgboost(dtrain, num_round, eta = 1, max_depth = 2)
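
One caveat with a path like the one above: Julia does not expand a leading `~` inside a string literal, and the file name is presumably handed to the underlying library as-is, so it may be safer to expand it explicitly with `expanduser`. A minimal sketch, where the path is a hypothetical placeholder and not the real package location:

```julia
# Hypothetical path, for illustration only; substitute the real location of the data file.
raw_path = "~/data/agaricus.txt.train"

# expanduser replaces a leading "~" with the current user's home directory.
full_path = expanduser(raw_path)

# Check the file up front instead of waiting for an error from deeper in the call.
if isfile(full_path)
    println("Found training file at ", full_path)
else
    println("No file at ", full_path, "; check the path before calling DMatrix")
end
```

The expanded `full_path` could then be passed to `DMatrix` in place of the literal string.
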
Sinansi commented 4 years ago

Hi @dpk1729 thanks for your reply.

Actually, that was my mistake. All of the errors above were caused by the wrong shape and data structure. My training dataset is a single vector (not a matrix, as usual), so the value used as input for prediction is a scalar. I had to convert the scalar into an array and then reshape it into a column array before using it as input for prediction.
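
A rough sketch of that reshaping in plain Julia, with made-up values and hypothetical variable names. It assumes the package expects a 2-D matrix with one row per sample and one column per feature:

```julia
# Made-up single-feature training data: a 1-D Vector, not a Matrix.
train_x = [0.5, 1.2, 3.3, 4.1]
@show size(train_x)            # (4,)

# Reshape into a 4×1 Matrix: one row per sample, one feature column.
train_X = reshape(train_x, :, 1)
@show size(train_X)            # (4, 1)

# A single value to predict on is a scalar, so wrap it in an array first,
# then reshape it into a 1×1 Matrix (one sample, one feature).
new_value = 2.7
test_X = reshape([new_value], 1, 1)
@show size(test_X)             # (1, 1)
```
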

Everything is working fine now, and I have switched my work from Python XGBoost to Julia XGBoost. It is a gazillion times faster. I am happy with Julia, and I would love to see a full Julia implementation in the future, without any dependence on Python.

sharmaabhishekk commented 4 years ago

Could you please let me know what exactly worked for you? As far as I understand, I'm getting the same error. My training features (train_X) are a Julia array of size (22000, 5). The labels (train_y) are also a Julia array: a vector of size (22000, 1). You mentioned something about it expecting vectors but getting scalars, so I tried reshaping the labels to size (22000, 1, 1) and the training data to (22000, 5, 1), but it still doesn't seem to find any matching method.

Here's the exact error message I get (without the trace):

MethodError: no method matching (::getfield(XGBoost, Symbol("#_setinfo#8")))(::Ptr{Nothing}, ::String, ::Array{Int32,2})
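
Judging only from the signature in that error, the labels reach `_setinfo` as a 2-D `Array{Int32,2}`, so one thing to try is flattening them to a 1-D vector with `vec` before training. This is a guess based on the message, not on the package source. The sketch uses small made-up arrays and hypothetical names:

```julia
# Made-up stand-ins for the real data: 6 samples, 5 features.
train_X = rand(6, 5)

# Labels stored as a 6×1 Matrix, i.e. Array{Int32,2}, as in the error message.
train_y = reshape(Int32[0, 1, 1, 0, 1, 0], :, 1)
@show typeof(train_y)          # Array{Int32,2} (shown as Matrix{Int32} on newer Julia)

# vec flattens the n×1 Matrix into a 1-D Vector that shares the same data.
labels = vec(train_y)
@show typeof(labels)           # Array{Int32,1} (shown as Vector{Int32} on newer Julia)

# Sanity check: one label per row of the feature matrix.
@assert length(labels) == size(train_X, 1)
```

If the assumption holds, the flattened `labels` vector is what would be passed as the label argument, with `train_X` left as a 2-D matrix.
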