GuyAllard / markov_clustering

markov clustering in python
MIT License

Error when using inflation argument with a value different from the pre-defined one #32

Open gema-sanz opened 3 years ago

gema-sanz commented 3 years ago

Hi, when I try to set different inflation values, I get the error below. My adjacency matrix does not contain NA values, but it does contain negative ones.

/Users/gema/opt/anaconda3/lib/python3.8/site-packages/markov_clustering/mcl.py:38: RuntimeWarning: invalid value encountered in power
  return normalize(np.power(matrix, power))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-134-d5960f70e34a> in <module>
      2 # for each clustering run, calculate the modularity
      3 for inflation in [i / 10 for i in range(15, 26)]:
----> 4     result = mc.run_mcl(df, inflation=inflation)
      5     clusters = mc.get_clusters(result)
      6     Q = mc.modularity(matrix=result, clusters=clusters)

~/opt/anaconda3/lib/python3.8/site-packages/markov_clustering/mcl.py in run_mcl(matrix, expansion, inflation, loop_value, iterations, pruning_threshold, pruning_frequency, convergence_check_frequency, verbose)
    226 
    227         # perform MCL expansion and inflation
--> 228         matrix = iterate(matrix, expansion, inflation)
    229 
    230         # prune

~/opt/anaconda3/lib/python3.8/site-packages/markov_clustering/mcl.py in iterate(matrix, expansion, inflation)
    133 
    134     # Inflation
--> 135     matrix = inflate(matrix, inflation)
    136 
    137     return matrix

~/opt/anaconda3/lib/python3.8/site-packages/markov_clustering/mcl.py in inflate(matrix, power)
     36         return normalize(matrix.power(power))
     37 
---> 38     return normalize(np.power(matrix, power))
     39 
     40 

~/opt/anaconda3/lib/python3.8/site-packages/markov_clustering/mcl.py in normalize(matrix)
     21     :returns: The normalized matrix
     22     """
---> 23     return sklearn.preprocessing.normalize(matrix, norm="l1", axis=0)
     24 
     25 

~/opt/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64 
     65             # extra_args > 0

~/opt/anaconda3/lib/python3.8/site-packages/sklearn/preprocessing/_data.py in normalize(X, norm, axis, copy, return_norm)
   1902         raise ValueError("'%d' is not a supported axis" % axis)
   1903 
-> 1904     X = check_array(X, accept_sparse=sparse_format, copy=copy,
   1905                     estimator='the normalize function', dtype=FLOAT_DTYPES)
   1906     if axis == 0:

~/opt/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64 
     65             # extra_args > 0

~/opt/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    718 
    719         if force_all_finite:
--> 720             _assert_all_finite(array,
    721                                allow_nan=force_all_finite == 'allow-nan')
    722 

~/opt/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
    101                 not allow_nan and not np.isfinite(X).all()):
    102             type_err = 'infinity' if allow_nan else 'NaN, infinity'
--> 103             raise ValueError(
    104                     msg_err.format
    105                     (type_err,

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

micans commented 2 years ago

Input to mcl should be non-negative similarities; the only solution is to change your input. If you have anti-correlations that you wish to take into account, I would suggest simply taking the absolute value. Other types of negative input may call for other treatments (a cut-off at zero being one of them). This is a late reply, and this is not my project (although I have worked a lot with mcl), but perhaps it is useful to someone.
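For anyone landing here with the same traceback, the following is a minimal sketch of why negative entries break non-integer inflation values, and of the two input treatments suggested above. The 3x3 `adjacency` matrix and the variable names are made up for illustration and are not taken from the original report. The sketch only uses NumPy and assumes the input is a dense array, matching the `np.power(matrix, power)` branch shown in the traceback.

```python
import numpy as np

# Hypothetical similarity matrix with one negative (anti-correlated) pair.
adjacency = np.array([
    [1.0, 0.8, -0.5],
    [0.8, 1.0, 0.2],
    [-0.5, 0.2, 1.0],
])

# A negative base raised to a non-integer power has no real-valued result,
# so NumPy returns NaN (and emits the "invalid value encountered in power"
# RuntimeWarning, silenced here).
with np.errstate(invalid="ignore"):
    inflated = np.power(adjacency, 1.5)
has_nan = bool(np.isnan(inflated).any())

# An integer power does not produce NaN for negative entries.
squared = np.power(adjacency, 2)

# Treatment 1: take the absolute value, so anti-correlations count as similarity.
abs_input = np.abs(adjacency)

# Treatment 2: apply a cut-off at zero, so negative entries are dropped.
clipped_input = np.clip(adjacency, 0, None)

print("NaN after fractional power on raw input:", has_nan)
print("NaN after fractional power on abs input:",
      bool(np.isnan(np.power(abs_input, 1.5)).any()))
```

Either cleaned matrix could then be passed to `mc.run_mcl(..., inflation=inflation)` in place of the raw input from the loop in the report. Which treatment is appropriate depends on what the negative values mean in your data.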