onnx / onnxmltools

ONNXMLTools enables conversion of models to ONNX
https://onnx.ai
Apache License 2.0
998 stars 181 forks

Missing converter for HashingTF #449

Open sansr opened 3 years ago

sansr commented 3 years ago

Hello everyone!

I am trying to convert an instance of the Spark ML HashingTF transformer. When I call the convert_sparkml function, I get an error saying that 'pyspark.ml.feature.HashingTF' is not supported.

I looked into the source code and found that 'pyspark.ml.feature.HashingTF' is registered in get_sparkml_operator_name (in ops_names.py), but it does not appear in the map created by the get_input_names function (in ops_input_output.py), which is where a transformer/estimator is checked for validity. Is there a reason why it is not supported, or is this a bug?

Any help is much appreciated. Thank you!

bipin2295 commented 3 years ago

Hi @sansr, did you find a solution? I have run into the same issue. I can see that HashingTF is present as a key in build_sparkml_operator_name_map() in ops_names.py, yet it still throws a KeyError.
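
The mismatch described in the two comments above can be reproduced in isolation. The sketch below is not onnxmltools code: the two dictionaries and the helper names (find_unconvertible, lookup_io_names) are hypothetical stand-ins for the name map in ops_names.py and the input/output map in ops_input_output.py, and the entries are illustrative only. It shows how a class that is registered in one map but missing from the other ends up raising a KeyError, which a caller could surface as "not supported".

```python
# Hypothetical stand-ins for the two lookup tables discussed in this thread.
# The entries are illustrative and are NOT copied from onnxmltools.
operator_names = {
    "pyspark.ml.feature.HashingTF": "pyspark.ml.feature.HashingTF",
    "pyspark.ml.feature.Binarizer": "pyspark.ml.feature.Binarizer",
}
io_name_getters = {
    # Only Binarizer has an input/output name getter in this toy example.
    "pyspark.ml.feature.Binarizer": lambda p: ([p["inputCol"]], [p["outputCol"]]),
}


def find_unconvertible(names, io_getters):
    """Return class names registered as operators but lacking an I/O getter."""
    return sorted(set(names) - set(io_getters))


def lookup_io_names(class_name):
    """Look up the I/O getter, turning the raw KeyError into a clearer error."""
    try:
        return io_name_getters[class_name]
    except KeyError:
        raise ValueError(f"{class_name!r} is not supported") from None


missing = find_unconvertible(operator_names, io_name_getters)
print(missing)
```

Comparing the key sets of both maps in this way is one quick check for operators that have a name but no conversion path.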

xadupre commented 3 years ago

There is no easy way to compute hashes with ONNX. There is no dedicated operator, and hashing with the current operators is not straightforward. This implementation may take some time.
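
For context on why a hash operator is needed, here is a minimal sketch of the hashing trick that a HashingTF-style transformer relies on: each term is hashed to a bucket index and the bucket count is incremented. This is an illustration only, under stated assumptions. The function name hashing_tf and its parameters are hypothetical, and MD5 is used purely because it ships with the Python standard library. To my knowledge Spark uses a MurmurHash3-based hash, so the indices produced here will not match Spark's output.

```python
import hashlib


def hashing_tf(tokens, num_features=16):
    """Map tokens to a fixed-length term-frequency vector via hashing.

    Assumption: MD5 stands in for the real hash function; it is NOT
    what Spark's HashingTF uses.
    """
    vec = [0.0] * num_features
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        # Interpret the first 4 bytes as an unsigned int, then pick a bucket.
        idx = int.from_bytes(digest[:4], "little") % num_features
        vec[idx] += 1.0
    return vec


counts = hashing_tf(["a", "b", "a"], num_features=8)
print(counts)
```

The step that has no direct ONNX counterpart is the string-to-integer hash inside the loop; the modulo and the counting are ordinary arithmetic.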