Improving XGBoost survival analysis with embeddings and debiased estimators
xgboost 1.4.0+: ValueError: If using all scalar values, you must pass an index #31

Using xgboost 1.4.0 or 1.4.1, we are now getting an error: ValueError: If using all scalar values, you must pass an index

No error with 1.3.3

With every release after 1.3.3, we receive a ValueError when calling fit (see the traceback below). Tested on Python 3.7.2 and 3.8.6.


C:\Users\jacob\Envs\xgbse_gqc_38\lib\site-packages\xgboost\ UserWarning: ntree_limit is deprecated, use iteration_range or model slicing instead.
Traceback (most recent call last):
  File "F:/git/gqc/pipe_breaks/", line 111, in <module>
  File "F:/git/gqc/pipe_breaks/", line 100, in main
  File "F:\git\gqc\pipe_breaks\algorithms\", line 271, in main
    do_extrapolation(X=X, X_valid=X_valid, X_train=X_train, y_train=y_train, main_ids=id_column)
  File "F:\git\gqc\pipe_breaks\algorithms\", line 122, in do_extrapolation
    bootstrap_estimator, mean, upper_ci, lower_ci = fit_predict_bootstrap_est(
  File "F:\git\gqc\pipe_breaks\algorithms\", line 208, in fit_predict_bootstrap_est
  File "C:\Users\jacob\Envs\xgbse_gqc_38\lib\site-packages\xgbse\", line 57, in fit
    trained_model =, y_sample, **kwargs)
  File "C:\Users\jacob\Envs\xgbse_gqc_38\lib\site-packages\xgbse\", line 407, in fit
    pd.DataFrame({"leaf": leaves})
  File "C:\Users\jacob\Envs\xgbse_gqc_38\lib\site-packages\pandas\core\", line 467, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "C:\Users\jacob\Envs\xgbse_gqc_38\lib\site-packages\pandas\core\internals\", line 283, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "C:\Users\jacob\Envs\xgbse_gqc_38\lib\site-packages\pandas\core\internals\", line 78, in arrays_to_mgr
    index = extract_index(arrays)
  File "C:\Users\jacob\Envs\xgbse_gqc_38\lib\site-packages\pandas\core\internals\", line 387, in extract_index
    raise ValueError("If using all scalar values, you must pass an index")
ValueError: If using all scalar values, you must pass an index

Code that raises the error:

def fit_predict_bootstrap_est(base_model, n_estimators, X_train, y_train, X_valid):
    """Instantiate, fit, and predict a bootstrap_estimator."""
    bootstrap_estimator = XGBSEBootstrapEstimator(base_model, n_estimators=n_estimators)

I'm unable to share the training data itself, but the types and shapes are: X_train = DataFrame: (2916, 11); y_train = ndarray: (4916,); TIME_BINS = np.arange(5, 540, 5)

Hi, sorry for the change. The output shape of the predict function was changed for consistency: before it was (n_samples,), and now it is (n_samples, 1). We didn't anticipate this being a problem. The quickest fix is to call np.reshape and drop the last dimension. I'm not sure whether I should change this for all calls to the predict function, or whether we should revert the change in output shape.
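A minimal sketch of the reshape workaround described in this comment. Plain NumPy arrays stand in for the two predict output shapes. The helper name `flatten_leaves` is hypothetical and is not part of xgboost or xgbse.

```python
import numpy as np


def flatten_leaves(leaves):
    """Return per-sample leaf indices as a 1-D array.

    Hypothetical helper: accepts either shape (n_samples,) or (n_samples, 1).
    """
    leaves = np.asarray(leaves)
    if leaves.ndim == 1:
        # Already in the older (n_samples,) layout; nothing to do.
        return leaves
    if leaves.ndim == 2 and leaves.shape[1] == 1:
        # Drop the trailing singleton dimension introduced by the new layout.
        return leaves.reshape(-1)
    raise ValueError(f"expected shape (n,) or (n, 1), got {leaves.shape}")


# Stand-ins for the two output shapes mentioned above (values are made up).
old_style = np.array([3, 4, 3, 5])     # shape (n_samples,)
new_style = old_style.reshape(-1, 1)   # shape (n_samples, 1)

flat_old = flatten_leaves(old_style)
flat_new = flatten_leaves(new_style)
```

Passing the flattened array to `pd.DataFrame({"leaf": ...})` should then behave the same with either output shape, since each column is one-dimensional again.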

See . In general I recommend using the strict_shape parameter for xgboost 1.4.x

Thanks @jacobgqc for your report and thank you very much @trivialfis for the help in finding the cause of the issue. We'll proceed with just the reshape fix and look into using strict_shape for the next version.

Thanks! I opened a PR in xgboost to revert the change, see above link. If it's merged then we don't need any change in this project.

See .

If everything goes well I should prepare the release next week.

Hi, sorry for the long delay. 1.4.2 is out today.

Thanks @trivialfis for the communication and the help with this issue. It's fixed in xgboost 1.4.2.