Teichlab / cellphonedb

MIT License
The count file for cellphonedb #321

Open chengyuye opened 3 years ago

chengyuye commented 3 years ago

Hi there,

I have a large single-cell dataset with more than 100,000 cells. Whenever I try to generate the input count file for CellPhoneDB, I always get the same error: Error in asMethod(object) : Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105. Do you have any suggestions on how to generate the count file for CellPhoneDB? Thank you so much!
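
For context, this Cholmod error typically appears when a sparse matrix is converted to a dense one with more elements than a 32-bit signed integer can index. A rough back-of-the-envelope check is sketched below. The gene count of 25,000 is an assumed, illustrative value (it is not stated in this thread), and the function names are hypothetical.

```python
# Rough check of whether a dense genes x cells matrix exceeds the 32-bit
# signed integer limit. The gene count is an assumed, illustrative value.
INT32_MAX = 2**31 - 1


def dense_elements(n_genes: int, n_cells: int) -> int:
    """Number of entries a dense genes x cells matrix would hold."""
    return n_genes * n_cells


def fits_in_int32(n_genes: int, n_cells: int) -> bool:
    """True if the dense matrix could be indexed with a 32-bit signed int."""
    return dense_elements(n_genes, n_cells) <= INT32_MAX


n_genes = 25_000   # assumption, not from the thread
n_cells = 100_000  # "more than 100,000 cells"
print(dense_elements(n_genes, n_cells), fits_in_int32(n_genes, n_cells))
# prints: 2500000000 False
```

Under these assumptions the dense matrix would need 2.5 billion entries, which is over the limit, so keeping the counts sparse avoids the conversion that fails.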

prete commented 3 years ago

Hi @henryeahhh, what's the original format of your data? Can you convert it to h5ad using something like sceasy and work with that as an input count file?

chengyuye commented 3 years ago

Hi @prete, thanks for your reply. My data is originally a Seurat object. I made an initial attempt to convert it from Seurat to scanpy. However, the instructions on how to extract the count file from scanpy did not seem to work in my case.

chengyuye commented 3 years ago

To be more specific, I followed the instructions on the CellPhoneDB website. However, the h5ad object generated by 'sceasy' has no attribute called 'norm' or 'set_index', so it seems I cannot generate the input files for CellPhoneDB this way. Sorry, I'm not very familiar with scanpy.

prete commented 3 years ago

I see, you're trying to extract the txt/tsv counts file from the generated h5ad as explained here.

Starting with v2.1.7, CellPhoneDB accepts h5ad files as input on the command line. If you've got an h5ad file generated with sceasy, and your original Seurat data was normalised but not log-transformed, you can simply run:

cellphonedb method statistical_analysis meta.txt counts.h5ad

chengyuye commented 3 years ago

Thank you so much! I'll have a try following your suggestions! Thank you again!

chengyuye commented 3 years ago

Hi @prete, thank you very much for the suggestion; it worked. I used the subsampling option for my data, and the command I used was:

cellphonedb method statistical_analysis liver_meta.txt liver_annotated.h5ad --counts-data=gene_name --subsampling --subsampling-log false --subsampling-num-cells 3000 --threads=64

However, I still want to report two issues I encountered.

  1. It seems cpdb can be run successfully even if the original Seurat object is not normalised non-log transformed. I tried one with RC normalisation and another with the default log normalisation. Both worked. I'm not sure whether this affects the output results.
  2. Although the cpdb analysis runs and the output files are generated, "cellphonedb plot heatmap_plot" fails on all my datasets, always with the following error messages:

"R[write to console]: Error in [.data.frame(all_intr, , pairs1[i]) : undefined columns selected

[ ][APP][25/06/21-12:28:14][ERROR] R Runtime Exception: Error in [.data.frame(all_intr, , pairs1[i]) : undefined columns selected"

Could you please give me any information on these issues? Thank you so much!
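
On the first issue, one way to tell which kind of data ended up in the h5ad file is a quick heuristic check, sketched below. It assumes the normalisation scaled every cell to the same total before any log1p step, so undoing the log should give near-constant per-cell sums. The function name, tolerance and toy values are hypothetical. The check only indicates whether values look log-transformed; it does not say how the results are affected.

```python
import math


def looks_log_transformed(cells, tolerance=0.01):
    """Heuristic: after undoing log1p, do all cells sum to about the same total?

    `cells` is a list of per-cell expression vectors (lists of floats).
    Returns True only if expm1(values) gives near-constant per-cell totals
    while the values themselves do not.
    """
    def spread(totals):
        # Relative range of the per-cell totals (0.0 means identical totals).
        mean = sum(totals) / len(totals)
        return (max(totals) - min(totals)) / mean if mean else 0.0

    try:
        unlogged_totals = [sum(math.expm1(v) for v in c) for c in cells]
    except OverflowError:
        return False  # values far too large to be log1p output
    raw_totals = [sum(c) for c in cells]
    return spread(unlogged_totals) <= tolerance < spread(raw_totals)


# Toy data: three cells, four genes (illustrative values only).
counts = [[1, 3, 0, 6], [10, 0, 5, 5], [2, 2, 2, 14]]
rc = [[v / sum(c) * 10000 for v in c] for c in counts]  # relative counts
logn = [[math.log1p(v) for v in c] for c in rc]         # log1p of the above

print(looks_log_transformed(rc), looks_log_transformed(logn))
# prints: False True
```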

prete commented 3 years ago

Hi @henryeahhh, I think that error pops up when some cluster interaction is filtered out or differs from the ones in the meta file. You can try this notebook to get the same heatmap using Python (seaborn package). Maybe that's a friendlier approach for you.
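
A minimal sketch of the counting step behind such a heatmap follows. It assumes the p-values table has one column per cluster pair named `clusterA|clusterB` and that an interaction counts as significant below 0.05. The column layout, threshold, function name and sample data are assumptions for illustration, not taken from the linked notebook. Cluster pairs missing from the table are counted as zero instead of raising an "undefined columns" style error.

```python
import csv
import io
from itertools import product


def count_significant(pvalues_tsv, clusters, alpha=0.05):
    """Return {(a, b): n}, the number of interactions with p < alpha,
    summing both directions (a|b and b|a) so the result is symmetric."""
    reader = csv.DictReader(io.StringIO(pvalues_tsv), delimiter="\t")
    rows = list(reader)
    columns = reader.fieldnames or []

    directed = {}
    for a, b in product(clusters, repeat=2):
        col = f"{a}|{b}"
        if col not in columns:
            directed[(a, b)] = 0  # pair filtered out or absent from the output
            continue
        directed[(a, b)] = sum(1 for r in rows if float(r[col]) < alpha)

    return {
        (a, b): directed[(a, b)] + (directed[(b, a)] if a != b else 0)
        for (a, b) in directed
    }


# Toy table (illustrative values only); note that the B|B column is missing.
sample = "\n".join([
    "interacting_pair\tT|T\tT|B\tB|T",
    "L1_R1\t0.01\t0.20\t0.03",
    "L2_R2\t0.50\t0.04\t0.01",
])
counts = count_significant(sample, ["T", "B"])
print(counts)
# prints: {('T', 'T'): 1, ('T', 'B'): 3, ('B', 'T'): 3, ('B', 'B'): 0}
```

The resulting dictionary can be reshaped into a square matrix and passed to any heatmap function.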