Closed vousmevoyez closed 7 years ago
Tested both Python 2 and 3, no error. Try the latest code.
Still getting the error with the latest code. Below is the complete traceback:

```
ValueError                                Traceback (most recent call last)
<ipython-input-14-920de1b50449> in <module>()
----> 1 gbm.feature_importance()

/home/admin/anaconda2/lib/python2.7/site-packages/lightgbm-0.2-py2.7.egg/lightgbm/basic.pyc in feature_importance(self, importance_type)
   1662         if importance_type not in ["split", "gain"]:
   1663             raise KeyError("importance_type must be split or gain")
-> 1664         dump_model = self.dump_model()
   1665         ret = [0] * (dump_model["max_feature_idx"] + 1)
   1666

/home/admin/anaconda2/lib/python2.7/site-packages/lightgbm-0.2-py2.7.egg/lightgbm/basic.pyc in dump_model(self, num_iteration)
   1577                 ctypes.byref(tmp_out_len),
   1578                 ptr_string_buffer))
-> 1579         return json.loads(string_buffer.value.decode())
   1580
   1581     def predict(self, data, num_iteration=-1, raw_score=False, pred_leaf=False, data_has_header=False, is_reshape=True,

/home/admin/anaconda2/lib/python2.7/json/__init__.pyc in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    337             parse_int is None and parse_float is None and
    338             parse_constant is None and object_pairs_hook is None and not kw):
--> 339         return _default_decoder.decode(s)
    340     if cls is None:
    341         cls = JSONDecoder

/home/admin/anaconda2/lib/python2.7/json/decoder.pyc in decode(self, s, _w)
    362
    363         """
--> 364         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    365         end = _w(s, end).end()
    366         if end != len(s):

/home/admin/anaconda2/lib/python2.7/json/decoder.pyc in raw_decode(self, s, idx)
    380             obj, end = self.scan_once(s, idx)
    381         except StopIteration:
--> 382             raise ValueError("No JSON object could be decoded")
    383         return obj, end

ValueError: No JSON object could be decoded
```
Try setting num_boost_round=1 to see if it works.

By the way, you should quote your error message with ```.
It works. But why does this happen?
feature_importance() uses a string buffer passed from C++ to Python; my guess is the string buffer for 250 rounds is too long and gets cut during the pass.
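This is just the suspected mechanism, but it can be illustrated in isolation. The sketch below (hypothetical, not LightGBM's actual code) shows how copying a long C string into a fixed-size ctypes buffer silently cuts it off, which would produce exactly this "No JSON object could be decoded" failure:

```python
import ctypes
import json

# Pretend this is the JSON model dump produced on the C++ side.
payload = b'{"tree_info": [' + b'0,' * 100 + b']}'

# A buffer that is too small for the payload (zero-filled by ctypes).
buf = ctypes.create_string_buffer(16)

# Copy only what fits, mimicking a C side that truncates at the
# buffer boundary instead of reallocating.
ctypes.memmove(buf, payload, len(buf) - 1)

truncated = buf.value.decode()
print(truncated)  # only the first 15 bytes survive -> invalid JSON

try:
    json.loads(truncated)
except ValueError as e:
    print("json.loads failed:", e)
```

(LightGBM's real binding queries the required length and reallocates the buffer, which is why the actual root cause turned out to be something else later in this thread.)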
Sorry, my OS is NOT Windows. It's Linux.
Oh, sorry, misread it.
Strange, I set num_boost_round to 1M and still cannot reproduce it. You can change this line to `return string_buffer.value.decode()`, set num_boost_round to a big number, save the output of gbm.dump_model() to a file, and upload it here. We can check whether it has been cut.
I did what you posted, but I still can't call gbm.dump_model(); it raises the error below. How about gbm.save_model() as txt format?
```
ValueError                                Traceback (most recent call last)
<ipython-input-17-cf366c50211c> in <module>()
----> 1 gbm.dump_model()

/home/admin/anaconda2/lib/python2.7/site-packages/lightgbm-0.2-py2.7.egg/lightgbm/basic.pyc in dump_model(self, num_iteration)
   1577                 ctypes.byref(tmp_out_len),
   1578                 ptr_string_buffer))
-> 1579         return string_buffer.value.decode()
   1580
   1581     def predict(self, data, num_iteration=-1, raw_score=False, pred_leaf=False, data_has_header=False, is_reshape=True,

/home/admin/anaconda2/lib/python2.7/json/__init__.pyc in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    337             parse_int is None and parse_float is None and
    338             parse_constant is None and object_pairs_hook is None and not kw):
--> 339         return _default_decoder.decode(s)
    340     if cls is None:
    341         cls = JSONDecoder

/home/admin/anaconda2/lib/python2.7/json/decoder.pyc in decode(self, s, _w)
    362
    363         """
--> 364         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    365         end = _w(s, end).end()
    366         if end != len(s):

/home/admin/anaconda2/lib/python2.7/json/decoder.pyc in raw_decode(self, s, idx)
    380             obj, end = self.scan_once(s, idx)
    381         except StopIteration:
--> 382             raise ValueError("No JSON object could be decoded")
    383         return obj, end

ValueError: No JSON object could be decoded
```
FYI, my data has 4.6 million rows and 220 columns.
It's strange: json.loads was already removed, so why does it still show "No JSON object could be decoded"? I am still trying to reproduce this issue; it needs some time.
I reran my code today. The error is different:
```
TypeError                                 Traceback (most recent call last)
<ipython-input-17-6f3b6c156ac1> in <module>()
----> 1 bst.feature_importance()

/home/admin/anaconda2/lib/python2.7/site-packages/lightgbm-0.2-py2.7.egg/lightgbm/basic.pyc in feature_importance(self, importance_type)
   1663             raise KeyError("importance_type must be split or gain")
   1664         dump_model = self.dump_model()
-> 1665         ret = [0] * (dump_model["max_feature_idx"] + 1)
   1666
   1667         def dfs(root):

TypeError: string indices must be integers
```
dump_model() seems to work now. Can you try dump_model() again?
Strange that model.json seems not to have been cut. Try this:

```python
import json
json.loads(gbm.dump_model())
```
```
ValueError                                Traceback (most recent call last)
<ipython-input-61-7d5d098ecca5> in <module>()
      1 import json
----> 2 json.loads(gbm.dump_model())

/home/admin/anaconda2/lib/python2.7/json/__init__.pyc in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    337             parse_int is None and parse_float is None and
    338             parse_constant is None and object_pairs_hook is None and not kw):
--> 339         return _default_decoder.decode(s)
    340     if cls is None:
    341         cls = JSONDecoder

/home/admin/anaconda2/lib/python2.7/json/decoder.pyc in decode(self, s, _w)
    362
    363         """
--> 364         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    365         end = _w(s, end).end()
    366         if end != len(s):

/home/admin/anaconda2/lib/python2.7/json/decoder.pyc in raw_decode(self, s, idx)
    380             obj, end = self.scan_once(s, idx)
    381         except StopIteration:
--> 382             raise ValueError("No JSON object could be decoded")
    383         return obj, end

ValueError: No JSON object could be decoded
```
I think I found the reason. Can you also run save_model() and upload the file here?
Thanks for your help. As a temporary fix, you can change this line https://github.com/Microsoft/LightGBM/blob/master/src/io/tree.cpp#L369 to `str_buf << "\"threshold\":" << Common::AvoidInf(threshold_[index]) << "," << std::endl;` and change the python-package back. JSON cannot handle infinite numbers. I will fix this properly later.
@wxchan You can fix this line: https://github.com/Microsoft/LightGBM/blob/master/src/io/tree.cpp#L85
@guolinke add Common::AvoidInf to threshold_double? I will keep the change on L369; it helps when a user loads an old model.
@wxchan yes and okay.
@wxchan Thanks. it works.
@vousmevoyez you can pip install simplejson; it's more efficient and has better error messages.
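A common way to adopt that suggestion without hard-depending on it is the import-fallback pattern; the truncated document below is a made-up stand-in for a cut-off model dump:

```python
# Prefer simplejson when available, otherwise fall back to stdlib json.
try:
    import simplejson as json
except ImportError:
    import json

try:
    json.loads('{"max_feature_idx": ')  # truncated document
except ValueError as e:
    # The error reports where parsing stopped, which helps diagnose
    # whether a model dump was cut off or merely malformed.
    print(type(e).__name__, e)
```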
OK.
This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
Environment info

Operating System: Linux
CPU:
Python version: Python 2.7.13

Error Message:

```
ValueError: No JSON object could be decoded
```

Reproducible examples

```python
import numpy as np
import lightgbm as lgb

lgb_train = lgb.Dataset(X_train, y_train)
lgb_eval = lgb.Dataset(X_test, y_test, reference=lgb_train)
params = {
    'task': 'train',
    'boosting': 'gbdt',
    'objective': 'binary',
    'metric': {'l2', 'auc'},
    'num_leaves': 62,
    'learning_rate': 0.05,
    'feature_fraction': 0.9,
    'bagging_fraction': 0.8,
    'bagging_freq': 5,
    'verbose': 20
}
gbm = lgb.train(params, lgb_train, num_boost_round=250, valid_sets=lgb_eval)

print('Start predicting...')
y_pred = gbm.predict(X_test, num_iteration=gbm.best_iteration)
y_pred = np.round(y_pred)

print gbm.feature_importance()
```