Open AMR-KELEG opened 6 years ago
Hello @AMR-KELEG, thanks, can you provide some data or an official dataset for testing?
from sklearn import svm
from sklearn.feature_extraction.text import CountVectorizer
from sklearn_porter import Porter

# Vectorize the text samples; fit_transform returns a SciPy sparse matrix.
cv = CountVectorizer()
l = ['Pattern 1', 'Pattern 2', 'Pattern 3']
X = cv.fit_transform(l)
y = [1, 2, 3]

# Train a linear SVC directly on the sparse matrix.
clf = svm.SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
              decision_function_shape='ovr', degree=3, gamma=1/X.shape[1], kernel='linear',
              max_iter=-1, probability=False, random_state=None, shrinking=True,
              tol=0.001, verbose=False)
clf.fit(X, y)

# Export the trained classifier to Java; this is the step that fails.
porter = Porter(clf, language='java')
output = porter.export(embed_data=False, details=False)
with open('SVC.java', 'w') as f:
    f.write(output)
The problem is that CountVectorizer returns a SciPy sparse matrix, which does not support the built-in len() function (calling len() on it raises a TypeError).
I will try to fix this problem and create a pull request.
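For anyone hitting the same error, here is a minimal sketch of the underlying behavior, independent of sklearn-porter. It only shows that len() fails on the matrix returned by CountVectorizer, along with two possible workarounds (reading shape[0], or converting to a dense array with toarray()). The variable names len_supported, n_rows and dense are just for illustration, and this is not necessarily how the eventual fix in the library works.

```python
from scipy.sparse import issparse
from sklearn.feature_extraction.text import CountVectorizer

# Same toy input as in the report above.
X = CountVectorizer().fit_transform(['Pattern 1', 'Pattern 2', 'Pattern 3'])
assert issparse(X)  # CountVectorizer returns a SciPy sparse matrix

# len() is not defined for SciPy sparse matrices and raises TypeError.
try:
    len(X)
    len_supported = True
except TypeError:
    len_supported = False

# Workaround 1: read the number of rows from the shape instead of len().
n_rows = X.shape[0]

# Workaround 2: convert to a dense NumPy array, which does support len().
# This can use a lot of memory for large vocabularies.
dense = X.toarray()
```

Reading shape[0] keeps the data sparse, so it is probably the cheaper option whenever only the number of samples is needed.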
I have an SVM (SVC) classifier trained on a sparse matrix as follows:
The problem is that exporting fails with the errors shown below: