Describe the bug
Calling predict on a converted scikit-learn pipeline raises an error when the pipeline's CountVectorizer was created with binary=True. The same pipeline works when binary=False.
To Reproduce
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from pure_sklearn.map import convert_estimator
vectorizer = CountVectorizer(binary=True)
model = LogisticRegression(random_state=0)
pipeline = Pipeline([
    ('vect', vectorizer),
    ('clf', model)
])
X_train = ['one text', 'two text', 'three text']
y_train = ['1', '2', '3']
pipeline.fit(X_train, y_train)
converted = convert_estimator(pipeline)
converted.predict(['four'])
The same script runs without error if the vectorizer is created with binary=False.
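For context, scikit-learn documents binary=True as setting all non-zero term counts to 1, so that is the only intended difference between the two configurations. Below is a minimal pure-Python sketch of that behaviour. It is not pure_sklearn's or scikit-learn's implementation: count_terms is a hypothetical helper, and whitespace splitting stands in for the real tokenizer. It also shows that the input 'four' contains no vocabulary terms, so its feature row is all zeros. Whether that matters for the error has not been established.

```python
from collections import Counter

def count_terms(doc, vocabulary, binary=False):
    """Hypothetical helper: build one feature row for a document.

    Assumption: tokens are lowercased and split on whitespace, which is
    a simplification of CountVectorizer's default tokenization.
    """
    counts = Counter(tok for tok in doc.lower().split() if tok in vocabulary)
    row = [counts[term] for term in vocabulary]
    if binary:
        # binary=True: every non-zero count becomes 1
        row = [1 if c else 0 for c in row]
    return row

# Vocabulary learned from the training texts above, in sorted order
vocabulary = ["one", "text", "three", "two"]

counts_row = count_terms("text text one", vocabulary)
binary_row = count_terms("text text one", vocabulary, binary=True)
unseen_row = count_terms("four", vocabulary, binary=True)
```

With this sketch, counts_row keeps the repeated term's count of 2, binary_row clips it to 1, and unseen_row contains only zeros.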
Expected behavior
converted.predict(['four']) should return a prediction without raising an error, as it does with binary=False.