Open cppntn opened 4 years ago
Right now, I have no easy way to fix it. Before extracting the characters, scikit-learn preprocesses the strings and collapses repeated whitespace into a single space: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_extraction/text.py#L258. onnxruntime does not implement that behaviour, and the ONNX StringNormalizer operator only offers basic options: https://github.com/onnx/onnx/blob/master/docs/Operators.md.
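To make the mismatch concrete, here is a minimal standard-library sketch of the whitespace step described above. It is not scikit-learn's actual code: the helper name `char_ngrams` and its parameters are made up for illustration, and the pattern `\s\s+` is my reading of what the linked source uses, so treat it as an assumption.

```python
import re

# Assumed to mirror scikit-learn's pattern: two or more whitespace characters.
_WHITE_SPACES = re.compile(r"\s\s+")


def char_ngrams(text, n_min=1, n_max=1, normalize_whitespace=True):
    """Hypothetical helper: extract character n-grams from a string.

    With normalize_whitespace=True, runs of whitespace are collapsed to a
    single space first, which is the step a converted ONNX graph would
    have to reproduce to yield the same features.
    """
    if normalize_whitespace:
        text = _WHITE_SPACES.sub(" ", text)
    grams = []
    for n in range(n_min, min(n_max, len(text)) + 1):
        for i in range(len(text) - n + 1):
            grams.append(text[i:i + n])
    return grams


# Same input, with and without the normalization step.
with_norm = char_ngrams("a  b", 2, 2)
without_norm = char_ngrams("a  b", 2, 2, normalize_whitespace=False)
print(with_norm)
print(without_norm)
```

Without the normalization the input yields an extra two-space bigram, so a runtime that skips this step would produce different counts for any text containing repeated whitespace.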
I tried the conversion, but this error occurred:
NotImplementedError: CountVectorizer cannot be converted, only tokenizer='word' is supported. You may raise an issue at https://github.com/onnx/sklearn-onnx/issues.
which led me here to open this issue.
Thanks for your support.