mandiant / stringsifter

A machine learning tool that ranks strings based on their relevance for malware analysis.
Apache License 2.0
675 stars, 124 forks

Python 3.10 compatibility #19

Closed coperni closed 1 year ago

coperni commented 2 years ago

Any chance of updating this for Python 3.10 support? The models are not compatible with scikit-learn 1.0.2 (which is compatible with Python 3.10).

Can the models be reserialized with the latest scikit-learn?

phtully commented 2 years ago

hi @coperni, this is a good idea - do you have a stack trace as well as a readout of the desired libs you'd like this to be compatible with?

coperni commented 2 years ago

Hi @phtully! Thank you for the response.

These are the warnings I get when running pytest under scikit-learn 1.0.2:

Pytest warning dump

```text
tests/test_stringsifter.py::test_default
tests/test_stringsifter.py::test_scores
tests/test_stringsifter.py::test_cutoff
tests/test_stringsifter.py::test_cutoff_score
tests/test_stringsifter.py::test_batch
  /home/ike/.local/lib/python3.9/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator TfidfTransformer from version 0.23.2 when using version 1.0.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
  https://scikit-learn.org/stable/modules/model_persistence.html#security-maintainability-limitations
    warnings.warn(

tests/test_stringsifter.py::test_default
tests/test_stringsifter.py::test_scores
tests/test_stringsifter.py::test_cutoff
tests/test_stringsifter.py::test_cutoff_score
tests/test_stringsifter.py::test_batch
  /home/ike/.local/lib/python3.9/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator TfidfVectorizer from version 0.23.2 when using version 1.0.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
  https://scikit-learn.org/stable/modules/model_persistence.html#security-maintainability-limitations
    warnings.warn(

tests/test_stringsifter.py::test_default
tests/test_stringsifter.py::test_scores
tests/test_stringsifter.py::test_cutoff
tests/test_stringsifter.py::test_cutoff_score
tests/test_stringsifter.py::test_batch
  /home/ike/.local/lib/python3.9/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator FeatureUnion from version 0.23.2 when using version 1.0.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
  https://scikit-learn.org/stable/modules/model_persistence.html#security-maintainability-limitations
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/warnings.html
============================================================================================= short test summary info =============================================================================================
FAILED tests/test_stringsifter.py::test_default - sklearn.exceptions.NotFittedError: Estimator not fitted, call fit before exploiting the model.
FAILED tests/test_stringsifter.py::test_scores - sklearn.exceptions.NotFittedError: Estimator not fitted, call fit before exploiting the model.
FAILED tests/test_stringsifter.py::test_cutoff - sklearn.exceptions.NotFittedError: Estimator not fitted, call fit before exploiting the model.
FAILED tests/test_stringsifter.py::test_cutoff_score - sklearn.exceptions.NotFittedError: Estimator not fitted, call fit before exploiting the model.
FAILED tests/test_stringsifter.py::test_batch - sklearn.exceptions.NotFittedError: Estimator not fitted, call fit before exploiting the model.
```
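In the dump above, unpickling only emits a `UserWarning`, and the tests fail later with `NotFittedError`. Here is a minimal sketch for surfacing the mismatch at load time. It assumes the model files are plain pickles. The helper name `load_pickle_strict` is hypothetical and not part of stringsifter.

```python
import pickle
import warnings


def load_pickle_strict(path):
    """Unpickle `path`, raising instead of warning on any UserWarning.

    Hypothetical helper: scikit-learn 1.0.2 reports a version mismatch
    as a UserWarning during unpickling, so promoting that category to an
    error makes the load fail immediately.
    Only use this on files you trust; unpickling can run arbitrary code.
    """
    with warnings.catch_warnings():
        # Inside this block, any UserWarning is raised as an exception.
        warnings.simplefilter("error", UserWarning)
        with open(path, "rb") as fh:
            return pickle.load(fh)
```

Wrapping the call in `try: ... except UserWarning as exc:` gives the caller a single place to report which file was serialized under a different version.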

The models I would need converted are:

- markov.pkl
- featurizer.pkl
- ranker.pkl
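Before converting, it can help to confirm which scikit-learn version each file records, without unpickling it. scikit-learn estimators store that version under a `_sklearn_version` key in their pickled state. The sketch below scans the pickle opcodes for that key using only the standard library. The function name `recorded_sklearn_versions` is hypothetical, and the scan is an approximation: it only sees literal occurrences of the key, and it assumes a plain pickle stream, so files written in another container format may not parse.

```python
import pickletools

# Opcodes that load a text string onto the pickle stack.
_STRING_OPS = {
    "SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE",
    "SHORT_BINSTRING", "BINSTRING", "STRING",
}
# Bookkeeping opcodes that may sit between a dict key and its value.
_SKIP_OPS = {"MEMOIZE", "PUT", "BINPUT", "LONG_BINPUT", "FRAME"}


def recorded_sklearn_versions(path, key="_sklearn_version"):
    """Return distinct strings stored right after `key` in a pickle file.

    Hypothetical helper: it walks the opcode stream without executing
    it, so no estimator class is imported or instantiated.
    """
    versions = []
    expect_value = False
    with open(path, "rb") as fh:
        for opcode, arg, _pos in pickletools.genops(fh):
            if opcode.name in _SKIP_OPS:
                continue
            if expect_value:
                # The opcode after the key should be the version string.
                if opcode.name in _STRING_OPS and arg not in versions:
                    versions.append(arg)
                expect_value = False
            elif opcode.name in _STRING_OPS and arg == key:
                expect_value = True
    return versions
```

Calling it on each of the three files listed above should, if the files are plain pickles, report the version the warnings mention for the scikit-learn estimators; an empty list means no such key was found.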

ewalshmndt commented 1 year ago

Fixed by #34