Open Marie-LineIbrahim opened 1 year ago
Why not use sklearn-onnx to convert your model from scikit-learn to ONNX?
Because I haven't found a way to include my output transformation for the KMeans model before using sklearn-onnx.
You can use the function merge_models (https://onnx.ai/onnx/api/compose.html#merge-models) to merge your model with a simple model that reshapes the output.
I have tried the following code but got the following error:

    import onnx
    import skl2onnx
    from skl2onnx.common.data_types import DoubleTensorType

    model1 = onnx.load('pipeline20.onnx')
    model2 = onnx.load('pipeline21.onnx')
    combined_model = onnx.compose.merge_models(model1, model2, io_map=[("label", "input1")])

    initial_type3 = [
        ('input1', DoubleTensorType([None, 1])), ('input2', DoubleTensorType([None, 1])),
        ('input3', DoubleTensorType([None, 1])), ('input4', DoubleTensorType([None, 1])),
        ('input5', DoubleTensorType([None, 1])), ('input6', DoubleTensorType([None, 1])),
        ('input7', DoubleTensorType([None, 1])), ('input8', DoubleTensorType([None, 1])),
        ('input9', DoubleTensorType([None, 1])),
    ]
    onx3 = skl2onnx.convert.convert_sklearn(combined_model, initial_types=initial_type3, target_opset={'': 15, 'ai.onnx.ml': 2})

    onnx_path3 = 'pipeline22.onnx'
    with open(onnx_path3, 'wb') as f:
        f.write(onx3.SerializeToString())
The error:

    MissingShapeCalculator: Unable to find a shape calculator for type '<class 'onnx.onnx_ml_pb2.ModelProto'>'. It usually means the pipeline being converted contains a transformer or a predictor with no corresponding converter implemented in sklearn-onnx. If the converted is implemented in another library, you need to register the converted so that it can be used by sklearn-onnx (function update_registered_converter). If the model is not yet covered by sklearn-onnx, you may raise an issue to https://github.com/onnx/sklearn-onnx/issues to get the converter implemented or even contribute to the project. If the model is a custom model, a new converter must be implemented. Examples can be found in the gallery.
Describe the issue
I need to create a kmeans model with output reshaped to (-1,1) and then convert it to onnx.
I wrote the following code:
    import numpy as np
    import onnx
    from onnx import numpy_helper
    from onnx import helper
    from onnx import AttributeProto, TensorProto
    from sklearn.cluster import KMeans

    # Create input placeholder
    input_placeholder = helper.make_tensor_value_info('input_data', TensorProto.FLOAT, [None, 2])

    # Perform K-means clustering
    kmeans = KMeans(n_clusters=3)
    kmeans.fit(data)
    labels = kmeans.labels_

    # Create ONNX model
    model = helper.make_model(
        opset_imports=[helper.make_operatorsetid('', 12)],
        graph=helper.make_graph(
            nodes=[
                helper.make_node('Identity', ['input_data'], ['output']),
                helper.make_node('Constant', [], ['labels'], value=numpy_helper.from_array(labels.astype(np.int64))),
                helper.make_node('Constant', [], ['centroids'], value=numpy_helper.from_array(kmeans.cluster_centers_.astype(np.float32))),
                helper.make_node('Unsqueeze', ['output'], ['output_unsqueezed'], axes=[1]),
                helper.make_node('Constant', [], ['shape_value'], value=numpy_helper.from_array(np.array([-1, 1], dtype=np.int64))),
                helper.make_node('Reshape', ['output_unsqueezed', 'shape_value'], ['reshaped_output']),
                helper.make_node('Squeeze', ['reshaped_output'], ['reshaped_output_squeezed'], axes=[1]),
                helper.make_node('Identity', ['reshaped_output_squeezed'], ['cluster_number']),  # Output the cluster numbers directly
            ],
            name='kmeans_graph',
            inputs=[input_placeholder],
            outputs=[
                helper.make_tensor_value_info('cluster_number', TensorProto.FLOAT, [None])  # Change output shape to [None]
            ],
        )
    )

    # Save the ONNX model to a file
    onnx.save_model(model, 'kmeans_model.onnx')
The response is:
[0.77988416 0.7326559 0.3233431 0.14046547 0.52645683 0.63215387 0.9591235 0.00784304 0.39958602 0.93583053 0.51616347 0.14006257 0.4178276 0.928936 0.9997206 0.19464517 0.28503856 0.9260069 0.7220202 0.41508904]
These are not cluster numbers. I need the response to be the cluster assigned to each input.
To reproduce
The code is identical to the script under "Describe the issue" above.
Urgency
Very urgent.
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
3.10
PyTorch Version
I'm using tensorflow and sklearn
Execution Provider
Default CPU
Execution Provider Library Version
No response