nok / sklearn-porter

Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
BSD 3-Clause "New" or "Revised" License

Errors when porting LinearSVC model #18

Closed by FakeNameSE 6 years ago

FakeNameSE commented 7 years ago

Sorry to bother you again, but when I run

python3 -m sklearn_porter -i model_notokenizer.pkl -l java

I get:

Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.5/dist-packages/sklearn_porter/__main__.py", line 71, in <module>
    main()
  File "/usr/local/lib/python3.5/dist-packages/sklearn_porter/__main__.py", line 49, in main
    porter = Porter(model, language=language)
  File "/usr/local/lib/python3.5/dist-packages/sklearn_porter/Porter.py", line 65, in __init__
    raise ValueError(error)
ValueError: The given model 'Pipeline(memory=None,
     steps=[('vect', TfidfVectorizer(analyzer='word', binary=False, decode_error='strict',
        dtype=<class 'numpy.int64'>, encoding='utf-8', input='content',
        lowercase=True, max_df=0.5, max_features=None, min_df=0.001,
        ngram_range=(1, 1), norm='l2', preprocessor=None, smooth_idf=True...ax_iter=1000,
     multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
     verbose=0))])' isn't supported.

I'm running Python 3.5.2, NumPy 1.13.1, and scikit-learn 0.19.0.

nok commented 7 years ago

Hello @FakeNameSE,

No problem! I appreciate every serious issue report and hint. It seems that I have to add support for the new pipeline feature of scikit-learn 0.19. I will fix it.

Happy coding, Darius

nok commented 7 years ago

Hello @FakeNameSE,

I pushed the fix (https://github.com/nok/sklearn-porter/commit/b444da7d8e1dd2570110f19784a75fa395fe3d15) to PyPI. You can now reinstall the package with:

pip install --no-cache-dir --force-reinstall --ignore-installed sklearn-porter
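
To confirm that the reinstall actually took effect, one option is to ask Python which version of the package it sees. This is only a minimal sketch: the helper name `installed_version` is made up for illustration, and `importlib.metadata` requires Python 3.8 or newer, which is later than the Python 3.5.2 used in this thread (on older interpreters, `pip show sklearn-porter` gives the same information).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing.

    `installed_version` is a hypothetical helper, not part of sklearn-porter.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version that the current interpreter resolves, or None.
print(installed_version("sklearn-porter"))
```

Running this with the same interpreter that runs `python3 -m sklearn_porter` matters, because a system with several Python installations can have different versions of the package in each.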

Thanks again for your attention and help.

Happy coding, Darius

FakeNameSE commented 7 years ago

Unfortunately, I still seem to be getting the error.

Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.5/dist-packages/sklearn_porter/__main__.py", line 71, in <module>
    main()
  File "/usr/local/lib/python3.5/dist-packages/sklearn_porter/__main__.py", line 49, in main
    porter = Porter(model, language=language)
  File "/usr/local/lib/python3.5/dist-packages/sklearn_porter/Porter.py", line 77, in __init__
    raise ValueError(error)
ValueError: The given model 'Pipeline(memory=None,
     steps=[('vect', TfidfVectorizer(analyzer='word', binary=False, decode_error='strict',
        dtype=<class 'numpy.int64'>, encoding='utf-8', input='content',
        lowercase=True, max_df=0.5, max_features=None, min_df=0.001,
        ngram_range=(1, 1), norm='l2', preprocessor=None, smooth_idf=True...ax_iter=1000,
     multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
     verbose=0))])' isn't supported.

nok commented 7 years ago

How did you create the estimator and fit the model? Can you share your code?

In addition, could you please run the following snippet on the estimator before you dump the model to a pickle file:

algorithm_name = str(type(your_classifier).__name__)
print(algorithm_name)
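
If the model turns out to be a wrapper, the top-level type name alone does not show which estimator sits inside it. The following is a rough sketch of how the nested types could be listed as well. The function name `estimator_type_names` is hypothetical, and it relies on duck typing only: it assumes pipeline-like objects expose `steps` as a list of `(name, estimator)` pairs and that meta-estimators such as `OneVsRestClassifier` expose the wrapped model as `estimator`.

```python
def estimator_type_names(model):
    """Return the class name of `model` followed by those of nested estimators.

    Hypothetical helper for illustration; it is not part of sklearn-porter.
    """
    names = [type(model).__name__]

    # Pipeline-like objects: descend into each (name, estimator) step in order.
    for _, step in getattr(model, "steps", None) or []:
        names.extend(estimator_type_names(step))

    # Meta-estimators: descend into the wrapped estimator, if there is one.
    inner = getattr(model, "estimator", None)
    if inner is not None:
        names.extend(estimator_type_names(inner))

    return names


# Usage, assuming `your_classifier` is the fitted model from your own code:
# print(estimator_type_names(your_classifier))
```

Listing the nested names makes it easier to tell whether the unsupported part is the wrapper itself or one of the estimators inside it.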

Phyks commented 6 years ago

Hi,

I seem to be having the same issue.

> algorithm_name = str(type(classifier).__name__)
Pipeline

My code is available at https://github.com/Phyks/OFFClassification/blob/master/notebook.ipynb. In particular,

classifier = Pipeline([
    ('vectorizer', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', OneVsRestClassifier(LinearSVC()))
])

classifier.fit(X_train, Y_train_transformed)

Thanks!

nok commented 6 years ago

Thanks @Phyks, the fix (#b92edff) is pushed to the master (dev) branch:

pip uninstall -y sklearn-porter
pip install --no-cache-dir https://github.com/nok/sklearn-porter/zipball/master

Happy coding, Darius

Phyks commented 6 years ago

Indeed, that partially fixed it. I am now getting:

ValueError: Currently the given model 'OneVsRestClassifier(estimator=LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
     intercept_scaling=1, loss='squared_hinge', max_iter=1000,
     multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
     verbose=0),
          n_jobs=1)' isn't supported.

which makes sense, since it seems OneVsRestClassifier is not yet supported. Are there any plans to support it? That would be awesome :)

nok commented 6 years ago

Thanks! Step by step ... the never-ending story 😄

Yes, support for OneVsRestClassifier would be awesome! I created a new issue for it (https://github.com/nok/sklearn-porter/issues/19). I will close this one right after posting.

Thank you all for your feedback and help, Darius

Phyks commented 6 years ago

Awesome! I'm following the other issue for updates on this then. Thanks a lot for this lib!