Living-with-machines / DeezyMatch

A Flexible Deep Learning Approach to Fuzzy String Matching
https://living-with-machines.github.io/DeezyMatch/

Change requirements.txt from == to >= #96

Closed fedenanni closed 3 years ago

fedenanni commented 3 years ago

This addresses the fact that pinning strict library versions causes incompatibilities with other tools, e.g. sentence-transformers.

Closes #95
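
For readers unfamiliar with the change in the title, here is a minimal sketch of what relaxing `==` pins to `>=` lower bounds looks like. It is not the code used in this PR (the edit may well have been made by hand), and the package names `examplepkg` and `otherpkg` are hypothetical placeholders, not entries from DeezyMatch's requirements.txt.

```python
import re

# Matches "name==version" at the start of a requirement line. Extras such as
# "name[extra]==1.0" are allowed. "===" (arbitrary equality) is not matched.
_PIN = re.compile(r"^(\s*[A-Za-z0-9._-]+(?:\[[^\]]*\])?\s*)==(?!=)(\s*[^\s;#]+)")


def relax_pins(requirements_text: str) -> str:
    """Return the text with exact pins (==) turned into lower bounds (>=)."""
    relaxed = []
    for line in requirements_text.splitlines():
        match = _PIN.match(line)
        # Wildcard pins such as "==1.2.*" are left alone, because ">=1.2.*"
        # is not a valid version specifier.
        if match and "*" not in match.group(2):
            line = f"{match.group(1)}>={match.group(2)}{line[match.end():]}"
        relaxed.append(line)
    return "\n".join(relaxed)


# Hypothetical requirements content, for illustration only.
example = "examplepkg==1.2.3\notherpkg>=0.5\n# a comment"
print(relax_pins(example))
```

A lower bound lets the resolver pick a version that also satisfies other installed tools, at the cost of less reproducible installs.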

kasra-hosseini commented 3 years ago

@fedenanni Great, many thanks for this. Could you please take a look at the warnings (in the "Test with pytest" section):

https://github.com/Living-with-machines/DeezyMatch/runs/2793321336?check_suite_focus=true

fedenanni commented 3 years ago

@kasra-hosseini here is the summary:

=============================== warnings summary ===============================
DeezyMatch/tests/test_import.py::test_import
  /usr/share/miniconda/lib/python3.8/site-packages/pandas/_testing.py:24: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    import pandas._libs.testing as _testing

DeezyMatch/tests/test_pipeline.py: 15 warnings
  /usr/share/miniconda/lib/python3.8/site-packages/pandas/core/internals/construction.py:587: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    if dtype != object and dtype != np.object:

DeezyMatch/tests/test_pipeline.py: 607 warnings
  /usr/share/miniconda/lib/python3.8/site-packages/pandas/core/indexes/base.py:395: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    elif issubclass(data.dtype.type, np.bool) or is_bool_dtype(data):

DeezyMatch/tests/test_pipeline.py: 16 warnings
  /usr/share/miniconda/lib/python3.8/site-packages/pandas/core/internals/blocks.py:839: DeprecationWarning: `np.complex` is a deprecated alias for the builtin `complex`. To silence this warning, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    value = convert_scalar(values, value)

DeezyMatch/tests/test_pipeline.py::test_train
DeezyMatch/tests/test_pipeline.py::test_finetune
  /usr/share/miniconda/lib/python3.8/site-packages/pandas/core/algorithms.py:767: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    keys, counts = f(values, dropna)

DeezyMatch/tests/test_pipeline.py: 10 warnings
  /usr/share/miniconda/lib/python3.8/site-packages/tqdm/std.py:668: FutureWarning: The Panel class is removed from pandas. Accessing it from the top-level namespace will also be removed in the next version
    from pandas import Panel

DeezyMatch/tests/test_pipeline.py: 78005 warnings
  /usr/share/miniconda/lib/python3.8/site-packages/DeezyMatch/data_processing.py:300: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    y = self.df.label.iloc[idx].astype(np.int)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
================= 9 passed, 78656 warnings in 96.31s (0:01:36) =================

Which one are you referring to in particular?

kasra-hosseini commented 3 years ago

Following our discussion, I was thinking maybe we can resolve some of the DeprecationWarnings by updating/adapting the DeezyMatch source code, e.g.:

DeezyMatch/tests/test_pipeline.py: 78005 warnings
  /usr/share/miniconda/lib/python3.8/site-packages/DeezyMatch/data_processing.py:300: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    y = self.df.label.iloc[idx].astype(np.int)

fedenanni commented 3 years ago

@kasra-hosseini yes - I'll go through them this morning, no prob!

kasra-hosseini commented 3 years ago

Great, thanks @fedenanni! It seems that most of the warnings are fixed now (except for the warnings from pandas itself). I will test it later today and merge (unless you want to make any other changes?)
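
The `np.int` warning quoted above points at `.astype(np.int)` in `data_processing.py`. A minimal sketch of the replacement the warning message itself recommends is shown below. The variable `label` is a hypothetical stand-in for one value read from the DataFrame, and this is not necessarily the exact change that was committed. Note that NumPy 1.24 later removed the `np.int` alias entirely, so the old spelling raises an error there.

```python
import numpy as np

# Hypothetical stand-in for a single label value taken from a DataFrame column.
label = np.float64(1.0)

# Before (deprecated since NumPy 1.20): label.astype(np.int)
# After: use the builtin int, as the warning recommends.
y = label.astype(int)

# Alternative: name the precision explicitly, also suggested by the warning.
y64 = label.astype(np.int64)

print(int(y), y64.dtype)
```

Using the builtin `int` keeps the previous behaviour, since `np.int` was only an alias for it.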