microsoft / hummingbird

Hummingbird compiles trained ML models into tensor computation for faster inference.

incompatible with XGBoost 1.5.0 due to breaking model format changes #561

Closed · meta closed this issue 2 years ago

meta commented 2 years ago

In the xgboost 1.5.0 release, a breaking change was introduced to the model format, causing this error when calling convert:

ValueError: invalid literal for int() with base 10: '<some_feature_name>'

The error is raised at this line: https://github.com/microsoft/hummingbird/blob/main/hummingbird/ml/operator_converters/xgb.py#L33

Previously, XGBoost seemed to store an integer index for each feature, but now it stores the feature's actual name. I tried changing the conversion to parse with another base, say base 36, to get a one-to-one integer mapping from feature names, but that causes another issue further down in the indexing. It seems the library may need to build its own index over the features.

Dataset used: sklearn.datasets.load_boston
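
To make the format change concrete, here is a minimal sketch (not from the original report; the toy data and column names are illustrative) contrasting the booster dump for a model trained on a pandas DataFrame versus a plain array, assuming xgboost >= 1.5:

```python
import numpy as np
import pandas as pd
import xgboost as xgb

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((100, 2)), columns=["age", "income"])
y = rng.random(100)

# Fit on the DataFrame: the booster keeps the column names, and the text
# dump references them directly, e.g. "0:[income<0.42] yes=1,no=2,...".
model_df = xgb.XGBRegressor(n_estimators=1, max_depth=2).fit(X, y)
print(model_df.get_booster().get_dump()[0])

# Fit on a plain array: nodes fall back to positional names f0, f1, ...,
# which Hummingbird can strip down to an integer index.
model_np = xgb.XGBRegressor(n_estimators=1, max_depth=2).fit(X.to_numpy(), y)
print(model_np.get_booster().get_dump()[0])
```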

ksaur commented 2 years ago

Thanks, yes, this is a problem currently. (Linking #551 as well) And thank you for providing details, that will help us troubleshoot!

ksaur commented 2 years ago

Digging in: it seems that in the pipeline run we do install xgboost 1.5 (Downloading xgboost-1.5.1-py3-none-manylinux2014_x86_64.whl), which means our tests somehow aren't catching this. Codecov says that we hit that line, but there must be more cases we need to try.

interesaaat commented 2 years ago

Yeah, how is it that we don't hit this in our pipeline?

ksaur commented 2 years ago

@meta can you share more of your test code so we can repro the error?

meta commented 2 years ago

sure, here's the repro notebook: https://github.com/meta/notebooks/blob/main/hummingbird_xgboost.ipynb

interesaaat commented 2 years ago

OK, so apparently the problem is that pandas was used for training.

If you replace xg_reg.fit(X_train, y_train) with xg_reg.fit(X_train.to_numpy(), y_train.to_numpy()), it should work.

For predict you can still use pandas, so xg_torch.predict(X_test) works as-is.
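
Spelled out as a self-contained sketch of that workaround (the variable names follow the notebook; the toy data is illustrative):

```python
import numpy as np
import pandas as pd
import xgboost as xgb
from hummingbird.ml import convert

# Toy stand-in for the notebook's data; names and shapes are illustrative.
rng = np.random.default_rng(0)
X_train = pd.DataFrame(rng.random((100, 3)), columns=["a", "b", "c"])
y_train = pd.Series(rng.random(100))
X_test = X_train.iloc[:10]

xg_reg = xgb.XGBRegressor(n_estimators=10)

# Train on plain numpy arrays so the booster records positional feature
# names (f0, f1, ...) that Hummingbird's dump parser can handle.
xg_reg.fit(X_train.to_numpy(), y_train.to_numpy())

xg_torch = convert(xg_reg, "torch")

# Prediction still accepts the pandas DataFrame directly.
preds = xg_torch.predict(X_test)
```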

interesaaat commented 2 years ago

Still, we will try to fix the pandas problem for training, because forcing users to use numpy to train xgboost models is not convenient.

finlytics-hub commented 1 year ago

I'm still receiving the same error on v0.4.9 when the trained XGBClassifier model has string feature names and I call convert(model, "torch").

Using to_numpy() in model.fit() works, though.

Am I missing something here?

Complete stack trace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/home/asad/Desktop/v2_final_model.ipynb Cell 12 line 2
      1 # convert XGB model to torch using hummingbird
----> 2 model_torch = convert(model, 'torch')
      4 print(model_torch)

File ~/anaconda3/envs/zkml-ezkl/lib/python3.10/site-packages/hummingbird/ml/convert.py:444, in convert(model, backend, test_input, device, extra_config)
    409 """
    410 This function converts the specified input *model* into an implementation targeting *backend*.
    411 *Convert* supports [Sklearn], [LightGBM], [XGBoost], [ONNX], and [SparkML] models.
   (...)
    441     A model implemented in *backend*, which is equivalent to the input model
    442 """
    443 assert constants.REMAINDER_SIZE not in extra_config
--> 444 return _convert_common(model, backend, test_input, device, extra_config)

File ~/anaconda3/envs/zkml-ezkl/lib/python3.10/site-packages/hummingbird/ml/convert.py:394, in _convert_common(model, backend, test_input, device, extra_config)
    391 _supported_backend_check_config(model, backend_formatted, extra_config)
    393 if type(model) in xgb_operator_list:
--> 394     return _convert_xgboost(model, backend_formatted, test_input, device, extra_config)
    396 if type(model) in lgbm_operator_list:
    397     return _convert_lightgbm(model, backend_formatted, test_input, device, extra_config)

File ~/anaconda3/envs/zkml-ezkl/lib/python3.10/site-packages/hummingbird/ml/convert.py:153, in _convert_xgboost(model, backend, test_input, device, extra_config)
    148 else:
    149     raise RuntimeError(
    150         "XGBoost converter is not able to infer the number of input features.\
    151             Please pass some test_input to the converter."
    152     )
--> 153 return _convert_sklearn(model, backend, test_input, device, extra_config)

File ~/anaconda3/envs/zkml-ezkl/lib/python3.10/site-packages/hummingbird/ml/convert.py:111, in _convert_sklearn(model, backend, test_input, device, extra_config)
    108 topology = parse_sklearn_api_model(model, extra_config)
    110 # Convert the Topology object into a PyTorch model.
--> 111 hb_model = topology_converter(topology, backend, test_input, device, extra_config=extra_config)
    112 return hb_model

File ~/anaconda3/envs/zkml-ezkl/lib/python3.10/site-packages/hummingbird/ml/_topology.py:222, in convert(topology, backend, test_input, device, extra_config)
    215         if parse(torch.__version__) <= Version("1.4"):
    216             # Raise en error and warn user that the torch version is not supported with onnx backend
    217             raise Exception(
    218                 f"The current torch version {torch.__version__} is not supported with {backend} backend. "
    219                 "Please use a torch version > 1.4 or change the backend."
    220             )
--> 222     operator_map[operator.full_name] = converter(operator, device, extra_config)
    224 # Set the parameters for the model / container
    225 n_threads = None if constants.N_THREADS not in extra_config else extra_config[constants.N_THREADS]

File ~/anaconda3/envs/zkml-ezkl/lib/python3.10/site-packages/hummingbird/ml/operator_converters/xgb.py:107, in convert_sklearn_xgb_classifier(operator, device, extra_config)
    104 tree_infos = operator.raw_operator.get_booster().get_dump()
    105 n_classes = operator.raw_operator.n_classes_
--> 107 return convert_gbdt_classifier_common(
    108     operator, tree_infos, _get_tree_parameters, n_features, n_classes, decision_cond="<", extra_config=extra_config
    109 )

File ~/anaconda3/envs/zkml-ezkl/lib/python3.10/site-packages/hummingbird/ml/operator_converters/_gbdt_commons.py:63, in convert_gbdt_classifier_common(operator, tree_infos, get_tree_parameters, n_features, n_classes, classes, extra_config, decision_cond)
     60 if reorder_trees and n_classes > 1:
     61     tree_infos = [tree_infos[i * n_classes + j] for j in range(n_classes) for i in range(len(tree_infos) // n_classes)]
---> 63 return convert_gbdt_common(
     64     operator, tree_infos, get_tree_parameters, n_features, classes, extra_config=extra_config, decision_cond=decision_cond
     65 )

File ~/anaconda3/envs/zkml-ezkl/lib/python3.10/site-packages/hummingbird/ml/operator_converters/_gbdt_commons.py:89, in convert_gbdt_common(operator, tree_infos, get_tree_parameters, n_features, classes, extra_config, decision_cond)
     86 assert get_tree_parameters is not None
     87 assert n_features is not None
---> 89 tree_parameters, max_depth, tree_type = get_tree_params_and_type(tree_infos, get_tree_parameters, extra_config)
     91 # Apply learning rate directly on the values rather then at runtime.
     92 if constants.LEARNING_RATE in extra_config:

File ~/anaconda3/envs/zkml-ezkl/lib/python3.10/site-packages/hummingbird/ml/operator_converters/_tree_commons.py:223, in get_tree_params_and_type(tree_infos, get_tree_parameters, extra_config)
    210 def get_tree_params_and_type(tree_infos, get_tree_parameters, extra_config):
    211     """
    212     Populate the parameters from the trees and pick the tree implementation strategy.
    213 
   (...)
    221         The tree parameters, the maximum tree-depth and the tre implementation to use
    222     """
--> 223     tree_parameters = [get_tree_parameters(tree_info, extra_config) for tree_info in tree_infos]
    224     max_depth = max(1, _find_max_depth(tree_parameters))
    225     tree_type = get_tree_implementation_by_config_or_depth(extra_config, max_depth)

File ~/anaconda3/envs/zkml-ezkl/lib/python3.10/site-packages/hummingbird/ml/operator_converters/_tree_commons.py:223, in <listcomp>(.0)
    210 def get_tree_params_and_type(tree_infos, get_tree_parameters, extra_config):
    211     """
    212     Populate the parameters from the trees and pick the tree implementation strategy.
    213 
   (...)
    221         The tree parameters, the maximum tree-depth and the tre implementation to use
    222     """
--> 223     tree_parameters = [get_tree_parameters(tree_info, extra_config) for tree_info in tree_infos]
    224     max_depth = max(1, _find_max_depth(tree_parameters))
    225     tree_type = get_tree_implementation_by_config_or_depth(extra_config, max_depth)

File ~/anaconda3/envs/zkml-ezkl/lib/python3.10/site-packages/hummingbird/ml/operator_converters/xgb.py:77, in _get_tree_parameters(tree_info, extra_config)
     75     for f_id, f_name in enumerate(feature_names):
     76         tree_info = tree_info.replace(f_name, str(f_id))
---> 77 _tree_traversal(
     78     tree_info.replace("[f", "").replace("[", "").replace("]", "").split(), lefts, rights, features, thresholds, values
     79 )
     81 return TreeParameters(lefts, rights, features, thresholds, values)

File ~/anaconda3/envs/zkml-ezkl/lib/python3.10/site-packages/hummingbird/ml/operator_converters/xgb.py:33, in _tree_traversal(tree_info, lefts, rights, features, thresholds, values)
     31     count += 1
     32 else:
---> 33     features.append(int(tree_info[count].split(":")[1].split("<")[0].replace("[f", "")))
     34     thresholds.append(float(tree_info[count].split(":")[1].split("<")[1].replace("]", "")))
     35     values.append([-1])

ValueError: invalid literal for int() with base 10: 'liquidation_time_since_last_liquidated'
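
For readers following along, a minimal reduction of the parse that fails (parse_feature_index is a hypothetical helper; the real logic lives in _get_tree_parameters and _tree_traversal in xgb.py, quoted in the trace above):

```python
def parse_feature_index(token):
    # Hummingbird strips "[f", "[", "]" from each dump token and expects
    # an integer feature index to remain before the "<" threshold.
    cleaned = token.replace("[f", "").replace("[", "").replace("]", "")
    return int(cleaned.split(":")[1].split("<")[0])

print(parse_feature_index("0:[f3<0.5]"))  # -> 3

# A named feature leaves a non-numeric string behind, so int() raises:
parse_feature_index("0:[liquidation_time_since_last_liquidated<0.5]")
# ValueError: invalid literal for int() with base 10:
#     'liquidation_time_since_last_liquidated'
```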

interesaaat commented 1 year ago

It looks like this is happening because you have a categorical feature, which we don't support yet.

finlytics-hub commented 1 year ago

Thanks for the quick reply. However, all feature values are scaled floats.