nanawei11 / Secuer

A clustering method for scRNA-seq data
MIT License

cdist_euclidean(): incompatible function arguments #2

Open markddesimone opened 2 years ago

markddesimone commented 2 years ago

Hi, I have installed Secuer:

pip install secuer==1.0.7

I am using Secuer directly within Python, together with scanpy:

import argparse
from asyncio.log import logger
from pathlib import Path
import sys
#! /usr/secuer_console/env python
#coding=gbk
import logging,os
import scanpy as sc
from nbformat import read
import sys
from secuer.secuer import (secuer, Read)
from secuer.secuerconsensus import secuerconsensus
version = '1.0.7'
import yaml
import numpy as np

My obsm['X_pca'] is:

type((ad_immune_hvg.obsm["X_pca"])),(ad_immune_hvg.obsm["X_pca"]).shape,type((ad_immune_hvg.obsm["X_pca"])[0][0])
(numpy.ndarray, (95508, 10), numpy.float32)

I am using all the default arguments from secuer_console.py.

I call:

res = secuer(fea=ad_immune_hvg.obsm['X_pca'],
            distance='euclidean', # the default Choose one from [cosine,euclidean,L1,sqeuclidean]
            p=1000, # the default
            Knn=7, # the default
            clusterMethod = "Kmeans", # the default
            mode='secuer',
            eskMethod = 'subGraph', # the default
            eskResolution= 0.8, # the default
            gapth= 4 # the default
            )

and get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [49], in <cell line: 1>()
----> 1 res = secuer(fea=ad_immune_hvg.obsm['X_pca'],
      2             distance='euclidean', # the default Choose one from [cosine,euclidean,L1,sqeuclidean]
      3             p=1000, # the default
      4             Knn=7, # the default
      5             clusterMethod = "Kmeans", # the default
      6             mode='secuer',
      7             eskMethod = 'subGraph', # the default
      8             eskResolution= 0.8, # the default
      9             gapth= 4 # the default
     10             )

File ~/miniconda3/envs/scvi/lib/python3.8/site-packages/secuer/secuer.py:272, in secuer(fea, Ks, distance, p, Knn, mode, eskMethod, eskResolution, addweights, seed, gapth, clusterMethod, maxTcutKmIters, cntTcutKmReps)
    269     p = N
    270 # print(p)
    271 # Get $p$ representatives by hybrid selection
--> 272 RpFea = getRepresentativesByHybridSelection(fea, p,seed=seed)
    273 # Approx.KNN
    274 # 1.partition RpFea into $cntRepCls$ rep - clusters
    275 cntRepCls = int(p ** 0.5)

File ~/miniconda3/envs/scvi/lib/python3.8/site-packages/secuer/secuer.py:134, in getRepresentativesByHybridSelection(fea, pSize, cntTimes, seed)
    132 selectIdxs = np.random.choice(N, bigPSize, replace=False)
    133 bigRpFea = fea[selectIdxs, :]
--> 134 label, RpFea = fast_kmeans_scipy(bigRpFea, pSize)  # max_iter=20
    135 return RpFea

File ~/miniconda3/envs/scvi/lib/python3.8/site-packages/secuer/secuer.py:105, in fast_kmeans_scipy(ds, k, max_iter)
    103     np.random.seed(2)
    104     cores = ds[np.random.choice(m, k, replace=False)]
--> 105 distance = pdist2_fast(ds, cores)
    106 index_min = np.argmin(distance, axis=1)
    107 if (index_min == result).all():

File ~/miniconda3/envs/scvi/lib/python3.8/site-packages/secuer/secuer.py:81, in pdist2_fast(A, B, metric)
     78     res = spatial.distance.cdist(A, B, metric='sqeuclidean', p=None, V=None,
     79                                  VI=None, w=None)
     80 elif metric == 'euclidean':
---> 81     res = spatial.distance.cdist(A, B, metric='euclidean', p=None, V=None,
     82                                  VI=None, w=None)
     83 elif metric == 'cosine':
     84     res = spatial.distance.cdist(A, B, metric='cosine', p=None, V=None,
     85                                  VI=None, w=None)

File ~/miniconda3/envs/scvi/lib/python3.8/site-packages/scipy/spatial/distance.py:2947, in cdist(XA, XB, metric, out, **kwargs)
   2945 if metric_info is not None:
   2946     cdist_fn = metric_info.cdist_func
-> 2947     return cdist_fn(XA, XB, out=out, **kwargs)
   2948 elif mstr.startswith("test_"):
   2949     metric_info = _TEST_METRICS.get(mstr, None)

TypeError: cdist_euclidean(): incompatible function arguments. The following argument types are supported:
    1. (x: object, y: object, w: object = None, out: object = None) -> numpy.ndarray

Invoked with: array([[  0.03155095,   4.260019  ,   2.0927727 , ...,   0.9246686 ,
         -0.5397982 ,  -0.6922387 ],
       [ -3.6273153 ,  10.502384  , -10.331585  , ...,  -1.1306865 ,
          0.7746241 ,   0.8274168 ],
       [ -5.904729  ,  -4.674787  ,  -9.735858  , ...,  -2.6395223 ,
         -1.7508732 ,  -0.1758273 ],
       ...,
       [ -3.0690632 ,  15.256313  ,  -3.7930858 , ...,  -0.557966  ,
         -0.5148894 ,  -0.37545112],
       [ -5.9273777 ,  -6.873936  ,  -6.7084894 , ...,   0.10121324,
         -2.7390177 ,   1.9388249 ],
       [ -4.7158155 ,  -3.6951177 ,   2.3196702 , ...,   1.076768  ,
         -0.7680586 ,  -0.67869115]], dtype=float32), array([[-6.3690352 , -4.0439634 , -1.1792709 , ...,  2.8127804 ,
        -0.41033325,  6.686249  ],
       [-4.9619193 , -4.2328386 ,  3.9679842 , ..., -0.9509096 ,
        -1.2732543 , -2.903853  ],
       [-5.1366343 , -5.346841  , -8.082489  , ...,  3.5610423 ,
        -0.79585856, -1.8859336 ],
       ...,
       [29.654089  ,  1.616171  ,  3.1877525 , ...,  1.9512532 ,
        -1.8307321 , -3.1559658 ],
       [-1.3134623 , 23.82341   ,  6.640727  , ...,  2.8947234 ,
        -1.6096454 , -0.28302294],
       [-5.373455  , -2.628137  ,  4.104837  , ..., -0.4534434 ,
         1.1465974 , -0.73796725]], dtype=float32); kwargs: out=None, p=None, V=None, VI=None, w=None

How can I resolve this? Thank you.

nanawei11 commented 2 years ago

Hello, thanks for your feedback. This error is caused by a change to the function scipy.spatial.distance.cdist in newer SciPy releases. We have uploaded the newest version of Secuer to https://github.com/nanawei11/Secuer/. You can also find it on PyPI at https://pypi.org/project/secuer/1.0.11/.
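For readers who want to see the failure mode in isolation: the traceback above shows `pdist2_fast` forwarding `p=None, V=None, VI=None, w=None` to `cdist`, while the supported signature printed in the error lists only `w` and `out`. The sketch below passes only the metric name, which avoids that mismatch. The helper name `pdist2_safe` is hypothetical, and this is not necessarily the change made in Secuer 1.0.11, which this thread does not show.

```python
import numpy as np
from scipy.spatial import distance


def pdist2_safe(A, B, metric="euclidean"):
    """Hypothetical stand-in for pdist2_fast (not Secuer's actual code).

    Forwards only the metric name to cdist, so no unused keyword
    arguments (p, V, VI) reach the metric-specific implementation.
    """
    allowed = ("euclidean", "sqeuclidean", "cosine")
    if metric not in allowed:
        raise ValueError(f"metric must be one of {allowed}, got {metric!r}")
    return distance.cdist(A, B, metric=metric)


# Tiny float32 inputs, matching the dtype reported for obsm['X_pca'].
A = np.array([[0.0, 0.0], [3.0, 4.0]], dtype=np.float32)
B = np.array([[0.0, 0.0]], dtype=np.float32)

D = pdist2_safe(A, B)  # pairwise distances, shape (len(A), len(B))
print(D)
```

The input dtype is left as float32 on purpose: the inputs in the error message are valid arrays, so the extra keyword arguments, not the data, appear to be what the call rejects.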

Here is an example with the updated version:

from secuer.secuer import secuer
from secuer.secuerconsensus import secuerconsensus
import scanpy as sc
data = sc.read('./data/Biase_pca.h5ad')
print(data)
# Assumed: fea is the PCA embedding listed under obsm in the printed output.
fea = data.obsm['X_pca']
res = secuer(fea=fea,
             p=1000)
res = secuerconsensus(fea=fea,
                      Knn=5,
                      p=1000,
                      M=5)
    obs: 'celltype', 'n_genes_by_counts', 'total_counts', 'total_counts_ercc', 'pct_counts_ercc'
    var: 'dropouts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std', 'ercc', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts'
    uns: 'hvg', 'log1p', 'pca'
    obsm: 'X_pca'
    varm: 'PCs'

[2022-11-04 19:05:15] [INFO] Selecting representatives...
[2022-11-04 19:05:15] [INFO] Approximate KNN...
[2022-11-04 19:05:15] [INFO] Estimating the number of clustering...
[2022-11-04 19:05:15] [INFO] Bipartite graph partitioning...

[2022-11-04 19:05:51] [INFO] Running secuer 1
[2022-11-04 19:05:51] [INFO] Running secuer 2
[2022-11-04 19:05:51] [INFO] Running secuer 4
[2022-11-04 19:05:51] [INFO] Running secuer 3
[2022-11-04 19:05:51] [INFO] Running secuer 5

markddesimone commented 2 years ago

Thank you for providing the update and for your prompt response. It works now.