elastic / eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
https://eland.readthedocs.io
Apache License 2.0

Minimize if main section #554

Closed sakurai-youhei closed 1 year ago

sakurai-youhei commented 1 year ago

Relates https://github.com/elastic/eland/pull/552

Issue:

To migrate from scripts to console_scripts in setup.py, the current long if __name__ == "__main__": section is a blocker, because console_scripts requires a function to be specified as the entry point.

Solution:

This PR encapsulates the current main logic in a main() function.
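As a rough illustration of the pattern (not the actual eland script), a minimal sketch might look like the following. The --name option, the build_message() helper, and the example_pkg.cli module path are all hypothetical and only serve the example.

```python
import argparse


def build_message(name):
    # Hypothetical stand-in for the logic that used to live directly
    # under the `if __name__ == "__main__":` guard.
    return f"Hello, {name}"


def main(argv=None):
    # With argv=None, argparse falls back to sys.argv[1:], which is what
    # happens when a console_scripts wrapper calls main() with no arguments.
    parser = argparse.ArgumentParser(description="Hypothetical example script")
    parser.add_argument("--name", default="world")
    args = parser.parse_args(argv)
    print(build_message(args.name))


if __name__ == "__main__":
    # The guard shrinks to a single call, so the module still works when
    # it is run directly as a script.
    main()

# Assumed matching setup.py fragment (names are illustrative only):
# entry_points={"console_scripts": ["example-script = example_pkg.cli:main"]}
```

Accepting an optional argv also makes the entry point easy to call from tests, for example main(["--name", "eland"]).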

sakurai-youhei commented 1 year ago

Thanks @pquentin. Could you help me rerun the checks? Some tests failed in cases that look unrelated to this change, for example:

23:48:36 tests/ml/test_ml_model_pytest.py:615: 
23:48:36 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
23:48:36 eland/ml/ml_model.py:466: in export_model
23:48:36     model = ESGradientBoostingRegressor(
23:48:36 eland/ml/exporters/es_gb_models.py:393: in __init__
23:48:36     ESGradientBoostingModel.__init__(self, es_client, model_id)
23:48:36 eland/ml/exporters/es_gb_models.py:114: in __init__
23:48:36     self._trees.append(Tree(trained_model["tree"], feature_names_map))
23:48:36 eland/ml/exporters/_sklearn_deserializers.py:119: in __init__
23:48:36     self.tree.__setstate__(state)
23:48:36 sklearn/tree/_tree.pyx:714: in sklearn.tree._tree.Tree.__setstate__
23:48:36     ???
23:48:36 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
23:48:36 
23:48:36 >   ???
23:48:36 E   ValueError: node array from the pickle has an incompatible dtype:
23:48:36 E   - expected: {'names': ['left_child', 'right_child', 'feature', 'threshold', 'impurity', 'n_node_samples', 'weighted_n_node_samples', 'missing_go_to_left'], 'formats': ['<i8', '<i8', '<i8', '<f8', '<f8', '<i8', '<f8', 'u1'], 'offsets': [0, 8, 16, 24, 32, 40, 48, 56], 'itemsize': 64}
23:48:36 E   - got     : [('left_child', '<i8'), ('right_child', '<i8'), ('feature', '<i8'), ('threshold', '<f8'), ('impurity', '<f8'), ('n_node_samples', '<i8'), ('weighted_n_node_samples', '<f8')]

pquentin commented 1 year ago

Let's see if that's enough:

@elasticmachine test this please

davidkyle commented 1 year ago

The failure comes from a scikit-learn model export test and is probably due to a breaking change in the recent scikit-learn 1.3 release. I opened #555 to track the issue.

The cause is unrelated to this PR and shouldn't block merging, so I have merged despite the test failure.
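To make the dtype mismatch in the traceback above concrete, here is a small sketch that compares the field names from the "expected" and "got" lines of the error message. The field lists are copied from the log; the missing_fields() helper is hypothetical and is not part of eland or scikit-learn.

```python
# Field names copied from the "expected" line of the ValueError.
expected_fields = [
    "left_child", "right_child", "feature", "threshold",
    "impurity", "n_node_samples", "weighted_n_node_samples",
    "missing_go_to_left",
]

# Field names copied from the "got" line of the ValueError.
got_fields = [
    "left_child", "right_child", "feature", "threshold",
    "impurity", "n_node_samples", "weighted_n_node_samples",
]


def missing_fields(expected, got):
    # Return the expected field names absent from `got`, in expected order.
    got_set = set(got)
    return [name for name in expected if name not in got_set]


print(missing_fields(expected_fields, got_fields))  # ['missing_go_to_left']
```

The only difference is the missing_go_to_left field, so the node array being deserialized appears to lack a field that the installed scikit-learn expects.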