mandiant / stringsifter

A machine learning tool that ranks strings based on their relevance for malware analysis.
Apache License 2.0

Python 3.8 not supported #13

Closed (eshaan7 closed this 3 years ago)

eshaan7 commented 4 years ago

Stringsifter pins numpy==1.17.1 and scipy==1.3.1. These versions do not support Python 3.8, yet setup.py states that stringsifter supports python>=3.6.

ralphje commented 4 years ago

I feel that this hard requirement should be relaxed a little: rather than pinning to a patch release, pin only to the major release or define a minimum version.

For example, something like:

lightgbm~=2.1
numpy~=1.17
scikit-learn>=0.21.3,<0.24
joblib>=0.13.2,<0.17
pytest~=3.10
fasttext~=0.9.1

Note that pytest is probably not a requirement for installation.
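
For readers unfamiliar with the `~=` ("compatible release") specifiers in the list above, here is a rough sketch of what they accept. It is a simplified illustration using only the standard library and assumes purely numeric versions (no pre-releases, post-releases, or epochs); the helper names `parse_release` and `compatible_release` are made up for this example and are not part of pip or stringsifter.

```python
def parse_release(version):
    """Split a purely numeric version such as '1.17.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def compatible_release(candidate, spec):
    """Simplified check for 'candidate ~= spec'.

    '~=X.Y' means '>=X.Y' with the same X; '~=X.Y.Z' means '>=X.Y.Z' with
    the same X.Y. Only numeric release segments are handled here.
    """
    cand = parse_release(candidate)
    base = parse_release(spec)
    if len(base) < 2:
        raise ValueError("~= needs at least two release segments")
    prefix = base[:-1]  # everything but the last segment must match exactly

    # Pad with zeros so that '2.1' and '2.1.0' compare as equal.
    width = max(len(cand), len(base))
    cand_padded = cand + (0,) * (width - len(cand))
    base_padded = base + (0,) * (width - len(base))

    return cand_padded >= base_padded and cand_padded[:len(prefix)] == prefix


if __name__ == "__main__":
    for candidate, spec in [("1.19.4", "1.17"), ("2.0.0", "1.17"),
                            ("0.9.2", "0.9.1"), ("0.10.0", "0.9.1")]:
        print(candidate, "~=", spec, compatible_release(candidate, spec))
```

Under this reading, `numpy~=1.17` would allow any later 1.x release but not 2.0, while `fasttext~=0.9.1` would allow only later 0.9.x releases.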

ewalshmndt commented 3 years ago

Unfortunately, using a newer version of scikit-learn produces warnings about the pickled models:

/home/ewalsh/.local/lib/python3.8/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator TfidfTransformer from version 0.21.3 when using version 0.23.2. This might lead to breaking code or invalid results. Use at your own risk.
  warnings.warn(
/home/ewalsh/.local/lib/python3.8/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator TfidfVectorizer from version 0.21.3 when using version 0.23.2. This might lead to breaking code or invalid results. Use at your own risk.
  warnings.warn(
/home/ewalsh/.local/lib/python3.8/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator FeatureUnion from version 0.21.3 when using version 0.23.2. This might lead to breaking code or invalid results. Use at your own risk.
  warnings.warn(

The other relaxed constraints work in my testing with Python 3.8. The pytest dependency should not be there; this is issue #5.
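
If someone wants to fail fast on such a version mismatch rather than continue with a possibly invalid model, one option is to promote `UserWarning` to an exception while the model is loaded. This is a minimal sketch using only the standard `warnings` module; `load_strict` and `fake_loader` are hypothetical names for illustration, and `fake_loader` merely imitates the warning shown above instead of unpickling a real model. This is not how stringsifter itself loads its models.

```python
import warnings


def load_strict(loader):
    """Call loader() and raise if it emits any UserWarning."""
    with warnings.catch_warnings():
        # Inside this block, UserWarning (and subclasses) become exceptions.
        warnings.simplefilter("error", UserWarning)
        return loader()


def fake_loader():
    # Hypothetical stand-in for unpickling a model under a mismatched
    # scikit-learn version; it only reproduces the warning category.
    warnings.warn(
        "Trying to unpickle estimator TfidfTransformer from version 0.21.3 "
        "when using version 0.23.2.",
        UserWarning,
    )
    return "model"


if __name__ == "__main__":
    try:
        load_strict(fake_loader)
    except UserWarning as exc:
        print("refusing to load:", exc)
```

The filter change is scoped to the `with` block, so warnings elsewhere in the program keep their normal behaviour.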

ewalshmndt commented 3 years ago

I'm working on this issue; the plan is to create a separate branch for each major version of Python.

ewalshmndt commented 3 years ago

Fixed in version 2.20201202