Closed by tak-wah 7 years ago
Isn't it the second column?
The first column is the label. I also tried the CLI version with a .conf file and added label=0, but I get the same error:
[LightGBM] [Info] Using column number 0 as label
Met Exceptions:
Unknown token a in data file
I also tried the Python version and got the following traceback:
Traceback (most recent call last):
File "/root/anaconda3/lib/python3.6/site-packages/lightgbm-0.1-py3.6.egg/lightgbm/engine.py", line 163, in train
booster = Booster(params=params, train_set=train_set)
File "/root/anaconda3/lib/python3.6/site-packages/lightgbm-0.1-py3.6.egg/lightgbm/basic.py", line 1189, in __init__
train_set.construct().handle,
File "/root/anaconda3/lib/python3.6/site-packages/lightgbm-0.1-py3.6.egg/lightgbm/basic.py", line 787, in construct
categorical_feature=self.categorical_feature, params=self.params)
File "/root/anaconda3/lib/python3.6/site-packages/lightgbm-0.1-py3.6.egg/lightgbm/basic.py", line 652, in _lazy_init
self.__init_from_np2d(data, params_str, ref_dataset)
File "/root/anaconda3/lib/python3.6/site-packages/lightgbm-0.1-py3.6.egg/lightgbm/basic.py", line 699, in __init_from_np2d
data = np.array(mat.reshape(mat.size), dtype=np.float32)
ValueError: could not convert string to float: 'a'
@tak-wah
It seems you are using a file as input data.
Currently, non-numerical categorical features are only supported through Pandas in the python-package.
If you need to pass categorical features through a file, you should convert them to int first.
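The conversion suggested above can be sketched roughly as follows. This is only an illustration, not part of LightGBM: the helper name encode_categorical_column is hypothetical, the sample rows are taken from the data shown in this issue, and writing tab-separated output is an assumption about how you want to store the file.

```python
def encode_categorical_column(lines, col):
    """Replace the string values in column `col` with integer codes.

    Hypothetical helper: returns the rewritten lines (tab-separated)
    and the value -> code mapping that was built along the way.
    """
    mapping = {}
    encoded = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Codes are assigned in order of first appearance: a -> 0, b -> 1, ...
        code = mapping.setdefault(fields[col], len(mapping))
        fields[col] = str(code)
        encoded.append("\t".join(fields))
    return encoded, mapping


# Sample rows from this issue: label first, string category second.
rows = [
    "3 a -1.047 0.537 1.186",
    "1 b -0.151 -0.221 -0.090",
    "1 a 0.387 -1.660 0.684",
]
encoded_rows, mapping = encode_categorical_column(rows, col=1)
```

Keep the mapping around so that validation and prediction data get the same codes as the training data.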
@guolinke Thank you!
Following your advice, I converted [a, b, ...] to [0, 1, ...] and set categorical_feature=0, and now it works well.
Another problem: the multiclass accuracy is low when I use the /examples/multiclass_classification/multiclass.* datasets.
Early stopping, best iteration is:
[60] training's multi_logloss: 1.21632 training's multi_error: 0.229 valid_1's multi_logloss: 1.41303 valid_1's multi_error: 0.542
Is this because of the dataset?
The dataset is randomly generated.
I use my own dataset, which includes a categorical feature, and get the following error:
ValueError: could not convert string to float: 'a'
The categorical feature is in the first column, and I set the parameter categorical_feature=0.
A sample of the data:
3 a -1.047 0.537 1.186 ...
1 b -0.151 -0.221 -0.090 ...
... ... ... ... ... ...
1 a 0.387 -1.660 0.684 ...