microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[Training] #16354

Open Marie-LineIbrahim opened 1 year ago

Marie-LineIbrahim commented 1 year ago

Describe the issue

I need to create a KMeans model whose output is reshaped to (-1, 1) and then convert it to ONNX.

I wrote the following code:

    import numpy as np
    import onnx
    from onnx import numpy_helper
    from onnx import helper
    from onnx import AttributeProto, TensorProto
    from sklearn.cluster import KMeans

    # Create input placeholder
    input_placeholder = helper.make_tensor_value_info('input_data', TensorProto.FLOAT, [None, 2])

    # Perform K-means clustering
    kmeans = KMeans(n_clusters=3)
    kmeans.fit(data)
    labels = kmeans.labels_

    # Create ONNX model
    model = helper.make_model(
        opset_imports=[helper.make_operatorsetid('', 12)],
        graph=helper.make_graph(
            nodes=[
                helper.make_node('Identity', ['input_data'], ['output']),
                helper.make_node('Constant', [], ['labels'], value=numpy_helper.from_array(labels.astype(np.int64))),
                helper.make_node('Constant', [], ['centroids'], value=numpy_helper.from_array(kmeans.cluster_centers_.astype(np.float32))),
                helper.make_node('Unsqueeze', ['output'], ['output_unsqueezed'], axes=[1]),
                helper.make_node('Constant', [], ['shape_value'], value=numpy_helper.from_array(np.array([-1, 1], dtype=np.int64))),
                helper.make_node('Reshape', ['output_unsqueezed', 'shape_value'], ['reshaped_output']),
                helper.make_node('Squeeze', ['reshaped_output'], ['reshaped_output_squeezed'], axes=[1]),
                helper.make_node('Identity', ['reshaped_output_squeezed'], ['cluster_number']),  # Output the cluster numbers directly
            ],
            name='kmeans_graph',
            inputs=[
                input_placeholder
            ],
            outputs=[
                helper.make_tensor_value_info('cluster_number', TensorProto.FLOAT, [None])  # Change output shape to [None]
            ],
        )
    )

    # Save the ONNX model to a file
    onnx.save_model(model, 'kmeans_model.onnx')

The response is:

[0.77988416 0.7326559 0.3233431 0.14046547 0.52645683 0.63215387 0.9591235 0.00784304 0.39958602 0.93583053 0.51616347 0.14006257 0.4178276 0.928936 0.9997206 0.19464517 0.28503856 0.9260069 0.7220202 0.41508904]

These are not cluster numbers. I need the response to be the cluster number for every input.

To reproduce

Same script as in the description above.

Urgency

Very urgent.

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

3.10

PyTorch Version

I'm using tensorflow and sklearn

Execution Provider

Default CPU

Execution Provider Library Version

No response

xadupre commented 1 year ago

Why not use sklearn-onnx to convert your model from scikit-learn to ONNX?

Marie-LineIbrahim commented 1 year ago

Because I haven't found a way to include my output transformation for the kmeans when using sklearn-onnx.

xadupre commented 1 year ago

You can use the function merge_models (https://onnx.ai/onnx/api/compose.html#merge-models) to merge your model with a simple model that reshapes the output.

Marie-LineIbrahim commented 1 year ago

I have tried the following code but got the following error:

    import onnx

    model1 = onnx.load('pipeline20.onnx')
    model2 = onnx.load('pipeline21.onnx')
    combined_model = onnx.compose.merge_models(model1, model2, io_map=[("label", "input1")])

    initial_type3 = [
        ('input1', DoubleTensorType([None, 1])), ('input2', DoubleTensorType([None, 1])),
        ('input3', DoubleTensorType([None, 1])), ('input4', DoubleTensorType([None, 1])),
        ('input5', DoubleTensorType([None, 1])), ('input6', DoubleTensorType([None, 1])),
        ('input7', DoubleTensorType([None, 1])), ('input8', DoubleTensorType([None, 1])),
        ('input9', DoubleTensorType([None, 1])),
    ]
    onx3 = skl2onnx.convert.convert_sklearn(combined_model, initial_types=initial_type3, target_opset={'': 15, 'ai.onnx.ml': 2})

    # Save the ONNX model to a file
    onnx_path3 = 'pipeline22.onnx'
    with open(onnx_path3, 'wb') as f:
        f.write(onx3.SerializeToString())

The error:

MissingShapeCalculator: Unable to find a shape calculator for type '<class 'onnx.onnx_ml_pb2.ModelProto'>'. It usually means the pipeline being converted contains a transformer or a predictor with no corresponding converter implemented in sklearn-onnx. If the converted is implemented in another library, you need to register the converted so that it can be used by sklearn-onnx (function update_registered_converter). If the model is not yet covered by sklearn-onnx, you may raise an issue to https://github.com/onnx/sklearn-onnx/issues to get the converter implemented or even contribute to the project. If the model is a custom model, a new converter must be implemented. Examples can be found in the gallery.