ventolab / CellphoneDB

CellPhoneDB can be used to search for a particular ligand/receptor, or interrogate your own HUMAN single-cell transcriptomics data.
https://www.cellphonedb.org/
MIT License

Invalid Counts data #165

Closed: ncedi12 closed this issue 5 months ago

ncedi12 commented 6 months ago

Hi there, I'm trying to use CellPhoneDB following the notebook vignettes. The data come from a Seurat object, and I used your vignette to extract the relevant data to pass on to cpdb. I'm relatively new to Python, so forgive me if this is a basic error that I can't see. I tried to follow the vignette closely for this reason.

Input file paths were set as follows:

cpdb_file_path = 'v5.0.0/cellphoned'  # we chose the built and renamed database as opposed to the .zip; let's see
meta_file_path = 'seuratdata/MB_meta.tsv'
counts_file_path = 'seuratdata/MB_counts_mtx.mtx'
degs_file_path = 'seuratdata/MB_DEGs.tsv'
out_path = 'results/deg_method'

Checked that the metadata and counts correspond:

## check whether barcodes in metadata are the same as counts
list(adata.obs.index).sort() == list(metadata['Cell']) \
.sort()

> True
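
Note that `list.sort()` sorts in place and returns `None`, so the comparison above evaluates `None == None` and prints True whatever the barcodes are. A minimal sketch of a check that does compare the two sets of barcodes, using the built-in `sorted()`, might look like the following. The helper name `barcodes_match` and the toy barcodes are hypothetical; in this thread's setting the arguments would be `adata.obs.index` and `metadata['Cell']`.

```python
def barcodes_match(counts_barcodes, meta_barcodes):
    """Return True when both collections hold the same barcodes, ignoring order."""
    # sorted() returns a new list, unlike list.sort(), which returns None
    return sorted(counts_barcodes) == sorted(meta_barcodes)


# Toy barcodes for illustration only (not taken from the issue's data)
same = barcodes_match(["AAAC-1", "TTTG-1"], ["TTTG-1", "AAAC-1"])
different = barcodes_match(["AAAC-1", "TTTG-1"], ["AAAC-1", "CCCA-1"])

# The original pattern: both .sort() calls return None, so this is None == None
vacuous = ["AAAC-1"].sort() == ["CCCA-1"].sort()

print(same, different, vacuous)
```

The last value shows why the original check could not have caught a mismatch: it is True even though the two lists share no barcode.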

# deg file inspect
pd.read_csv(degs_file_path, sep = '\t') \
.head(3)

   cluster        gene    p_val_adj  p_val  avg_log2FC  pct.1  pct.2
0  Neuronal_like  STMN2   0.0        0.0    1.424504    0.959  0.762
1  Neuronal_like  GAP43   0.0        0.0    1.354367    0.745  0.309
2  Neuronal_like  MLLT11  0.0        0.0    1.015482    0.884  0.624

## it all looked good, I have no microenvironment or tf_usage files
## ran method 3

from cellphonedb.src.core.methods import cpdb_degs_analysis_method

cpdb_results = cpdb_degs_analysis_method.call(
    cpdb_file_path = cpdb_file_path,
    meta_file_path  = meta_file_path,
    counts_file_path  = counts_file_path,
    degs_file_path = degs_file_path,
    counts_data = 'hgnc_symbol',
    score_interactions = True,
    threshold = 0.1,
    result_precision = 3,
    separator = '|',
    debug = False,
    output_path = out_path,
    output_suffix = None,
    threads = 25
)

## error

File ~/anaconda3/envs/immune/lib/python3.11/site-packages/cellphonedb/src/core/methods/cpdb_degs_analysis_method.py:97, in call(cpdb_file_path, meta_file_path, counts_file_path, degs_file_path, counts_data, output_path, microenvs_file_path, active_tfs_file_path, separator, threshold, result_precision, debug, output_suffix, score_interactions, threads)
     93 interactions, genes, complex_compositions, complexes, gene_synonym2gene_name, receptor2tfs = \
     94     db_utils.get_interactions_genes_complex(cpdb_file_path)
     96 # Load user files into memory
---> 97 counts, meta, microenvs, degs, active_tf2cell_types = file_utils.get_user_files(
     98     counts_fp=counts_file_path, meta_fp=meta_file_path, microenvs_fp=microenvs_file_path, degs_fp=degs_file_path,
     99     active_tfs_fp=active_tfs_file_path,
    100     gene_synonym2gene_name=gene_synonym2gene_name, counts_data=counts_data)
    102 # get reduced interactions (drop duplicates)
    103 interactions_reduced = interactions[['multidata_1_id', 'multidata_2_id']].drop_duplicates()

File ~/anaconda3/envs/immune/lib/python3.11/site-packages/cellphonedb/utils/file_utils.py:429, in get_user_files(counts_fp, meta_fp, microenvs_fp, degs_fp, active_tfs_fp, gene_synonym2gene_name, counts_data)
    427 loaded_user_files.append(meta_fp)
    428 # Ensure that counts values are of type float32, and that all cells in meta exist in counts
--> 429 counts = counts_preprocessors.counts_preprocessor(counts, meta)
    430 if microenvs_fp:
    431     microenvs = _load_microenvs(microenvs_fp, meta)

File ~/anaconda3/envs/immune/lib/python3.11/site-packages/cellphonedb/src/core/preprocessors/counts_preprocessors.py:24, in counts_preprocessor(counts, meta)
      7 """
      8 Ensure that counts values are of type float32, and that all cells in meta exist in counts
      9 
   (...)
     21 
     22 """
     23 if not len(counts.columns):
---> 24     raise ParseCountsException('Counts values are not decimal values', 'Incorrect file format')
     25 try:
     26     if np.any(counts.dtypes.values != np.dtype('float32')):

ParseCountsException: Invalid Counts data

Anyone who might help?

datasome commented 6 months ago

Hi ncedi12,

Thank you for using CellphoneDB.

Regarding counts_file_path = 'seuratdata/MB_counts_mtx.mtx', please see the definition of the function _read_mtx() in https://github.com/ventolab/CellphoneDB/blob/master/cellphonedb/utils/file_utils.py and https://cellphonedb.readthedocs.io/en/latest/RESULTS-DOCUMENTATION.html#counts-file. The value you need to pass as the counts_file_path argument is the name of the directory that contains the mtx/barcode/features files, not the path to the .mtx file itself.
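
As a rough illustration of the point above, the sketch below checks whether a path is a directory holding the three files before it is passed as counts_file_path. The file names in `ASSUMED_FILES` and the helper `missing_counts_files` are assumptions made for this example (a conventional 10x-style layout), not names confirmed in this thread; the names CellphoneDB actually expects should be taken from the `_read_mtx()` definition linked above.

```python
from pathlib import Path
import tempfile

# Assumed file names (conventional 10x-style layout); check _read_mtx() for
# the names CellphoneDB actually expects.
ASSUMED_FILES = ("matrix.mtx", "barcodes.tsv", "features.tsv")


def missing_counts_files(counts_dir, expected=ASSUMED_FILES):
    """Return the expected file names that are absent from counts_dir."""
    counts_dir = Path(counts_dir)
    if not counts_dir.is_dir():
        # A path to a single .mtx file, as in the original call, ends up here
        return list(expected)
    return [name for name in expected if not (counts_dir / name).is_file()]


# Demonstration on a temporary directory with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    counts_dir = Path(tmp) / "MB_counts_mtx"
    counts_dir.mkdir()
    for name in ("matrix.mtx", "barcodes.tsv"):
        (counts_dir / name).touch()
    missing_in_dir = missing_counts_files(counts_dir)
    missing_for_file = missing_counts_files(counts_dir / "matrix.mtx")

print(missing_in_dir)    # files absent from the directory
print(missing_for_file)  # everything is reported when the path is a file
```

Running a check like this before calling the analysis method makes the failure explicit, instead of surfacing later as a ParseCountsException.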

Hope that helps.

Best,

Robert.

ncedi12 commented 5 months ago

Thank you kindly for correcting this oversight. Problem solved.

👍