Open rockdeme opened 1 year ago
I'm trying to run the scGCO pipeline and the execution seems stuck when I start running identify_spatial_genes.
I'm running Ubuntu 20.04 on WSL2. I also had to patch a few functions because some of the calls they rely on are deprecated.
Here is my script:
    import time
    import matplotlib
    import numpy as np
    import pandas as pd
    import scanpy as sc
    import networkx as nx
    import matplotlib.pyplot as plt
    from scGCO import *

    # to_scipy_sparse_matrix is deprecated
    from networkx.convert_matrix import to_scipy_sparse_array
    nx.to_scipy_sparse_matrix = to_scipy_sparse_array

    # override the default scGCO function as multi-dimensional indexing in pandas is not supported anymore
    def normalize_count_cellranger(data, Log=True):
        '''
        normalize count as in cellranger

        :param file: data: A dataframe of shape (m, n);
        :rtype: data shape (m, n);
        '''
        normalizing_factor = np.sum(data, axis=1) / np.median(np.sum(data, axis=1))
        data = pd.DataFrame(data.values, columns=data.columns, index=data.index)
        data = data / normalizing_factor[0]
        if Log == True:
            data = log1p(data)
        else:
            data = data
        return data

    data_path = '/my/folder/'

    adata = sc.datasets.visium_sge(sample_id='V1_Human_Lymph_Node')
    adata.var_names_make_unique()
    sc.pp.calculate_qc_metrics(adata, inplace=True)
    sc.pp.filter_cells(adata, min_counts=6000)
    sc.pp.filter_genes(adata, min_cells=10)

    j = 11
    unary_scale_factor = 100
    label_cost = 10
    algorithm = 'expansion'

    data = adata.to_df().astype(int)
    locs = adata.obsm['spatial']
    data_norm = normalize_count_cellranger(data)

    # plot the cell graph built from the first gene
    fig, ax = plt.subplots(1, 1, figsize=(5, 5))
    ax.set_aspect('equal')
    exp = data_norm.iloc[:, 0].values
    cellGraph = create_graph_with_weight(locs, exp)
    ax.scatter(locs[:, 0], locs[:, 1], s=1, color='black')
    for i in np.arange(cellGraph.shape[0]):
        x = (locs[int(cellGraph[i, 0]), 0], locs[int(cellGraph[i, 1]), 0])
        y = (locs[int(cellGraph[i, 0]), 1], locs[int(cellGraph[i, 1]), 1])
        ax.plot(x, y, color='black', linewidth=0.5)
    plt.title('CellGraph')
    plt.show()

    t0 = time.time()
    gmmDict = gmm_model(data_norm)
    print('GMM time(s): ', time.time() - t0)

    t0 = time.time()
    result_df = identify_spatial_genes(locs, data_norm, cellGraph, gmmDict)
    print('Running time: {} seconds'.format(time.time() - t0))
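A side note on the normalization override in the script: it divides every row by the first spot's factor only. If the intent is per-spot scaling (an assumption, not something confirmed in this thread), a minimal sketch could look like the following. The function name `normalize_per_spot` is made up for illustration and is not part of scGCO.

```python
import numpy as np
import pandas as pd


def normalize_per_spot(data: pd.DataFrame, log: bool = True) -> pd.DataFrame:
    """Scale each row by (row total / median row total), then optionally log1p.

    Hypothetical helper, assuming rows are spots and columns are genes.
    """
    totals = data.sum(axis=1)
    factors = totals / totals.median()
    # axis=0 aligns the factors with the row index, so each row gets its own divisor
    scaled = data.div(factors, axis=0)
    return np.log1p(scaled) if log else scaled
```

Using `DataFrame.div(..., axis=0)` avoids the multi-dimensional indexing on a pandas Series that newer pandas versions reject.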
According to the tutorial, the last step should take approximately as long as gmm_model, but it seems to sit idle for hours.
Output:
> GMM time(s): 183.92914414405823
> scGCO used 8 out of 16 cores
> 0%| | 0/8 [00:00<?, ?it/s]
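When a call sits at 0% like this, one generic way to see where the main process is waiting is to have Python dump its stack traces periodically. The sketch below uses only the standard library's `faulthandler`; the helper name `dump_stacks_if_slow` is made up for illustration. Note the assumption: it reports the threads of the calling process only, so if the work is stuck inside worker processes it will only show that the parent is waiting on them.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def dump_stacks_if_slow(timeout_s, file=None):
    """While the block runs, dump all thread stacks every `timeout_s` seconds."""
    target = sys.stderr if file is None else file
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=target)
    try:
        yield
    finally:
        # Stop the watchdog once the block finishes or raises
        faulthandler.cancel_dump_traceback_later()


# Usage sketch (names taken from the script above):
# with dump_stacks_if_slow(300):
#     result_df = identify_spatial_genes(locs, data_norm, cellGraph, gmmDict)
```

If the dump keeps showing the same frame, that frame is a reasonable place to start looking.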
I am also getting exactly this issue! Not sure what's going on. Thank you for opening this issue in any case.
Thanks for opening this issue, and sorry for the trouble. Please reinstall scGCO: pip install -U scGCO
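To confirm that the reinstall actually picked up a new release, the installed version can be read from Python. This is a minimal sketch using the standard library; the helper name `installed_version` is made up for illustration, and it assumes the distribution is registered under the name scGCO, as in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string for `dist_name`, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print("scGCO version:", installed_version("scGCO"))
```

Comparing the printed value before and after the upgrade shows whether pip changed anything in the active environment.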