Teichlab / cellphonedb

Invalid Count Data #347

Closed mfrisoli126 closed 3 years ago

mfrisoli126 commented 3 years ago

Hi, the cellphonedb statistical analysis works with the test data (test_meta.txt and test_counts.txt) on my laptop, but it does not work with any of my own data. What am I doing wrong?

Here is a glimpse of my experimental metadata.txt:

[Image: Screen Shot 2021-09-14 at 5 49 38 PM]

Here's a glimpse of my experimental counts_by_entrezid.txt:

[Image: Screen Shot 2021-09-14 at 5 50 22 PM]

And here is the error message that I keep getting:

(cpdb) (base) michaelfrisoli@Michaels-MacBook-Pro-2 Single_Cell_Analysis % cellphonedb method statistical_analysis metadata.txt counts_by_entrezid.txt 
/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.cluster.k_means_ module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.cluster. Anything that cannot be imported from sklearn.cluster is now part of the private API.
  warnings.warn(message, FutureWarning)
[ ][APP][14/09/21-17:52:03][WARNING] Latest local available version is `v2.0.0`, using it
[ ][APP][14/09/21-17:52:03][WARNING] User selected downloaded database `v2.0.0` is available, using it
[ ][CORE][14/09/21-17:52:03][INFO] Initializing SqlAlchemy CellPhoneDB Core
[ ][CORE][14/09/21-17:52:03][INFO] Using custom database at /Users/michaelfrisoli/.cpdb/releases/v2.0.0/cellphone.db
[ ][APP][14/09/21-17:52:03][INFO] Launching Method cpdb_statistical_analysis_local_method_launcher
[ ][APP][14/09/21-17:52:03][INFO] Launching Method _set_paths
[ ][APP][14/09/21-17:52:03][WARNING] Output directory (/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/out) exist and is not empty. Result can overwrite old results
[ ][APP][14/09/21-17:52:03][INFO] Launching Method _load_meta_counts
[ ][CORE][14/09/21-17:52:44][INFO] Launching Method cpdb_statistical_analysis_launcher
[ ][CORE][14/09/21-17:52:44][INFO] Launching Method _counts_validations
[ ][CORE][14/09/21-17:52:59][INFO] [Cluster Statistical Analysis] Threshold:0.1 Iterations:1000 Debug-seed:-1 Threads:4 Precision:3
[ ][APP][14/09/21-17:53:00][ERROR] Unexpected error
Traceback (most recent call last):
  File "/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages/cellphonedb/src/api_endpoints/terminal_api/method_terminal_api_endpoints/method_terminal_commands.py", line 127, in statistical_analysis
    LocalMethodLauncher(cpdb_app.create_app(verbose, database)). \
  File "/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages/cellphonedb/src/local_launchers/local_method_launcher.py", line 54, in cpdb_statistical_analysis_local_method_launcher
    self.cellphonedb_app.method.cpdb_statistical_analysis_launcher(
  File "/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages/cellphonedb/src/core/methods/method_launcher.py", line 63, in cpdb_statistical_analysis_launcher
    cpdb_statistical_analysis_method.call(meta,
  File "/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages/cellphonedb/src/core/methods/cpdb_statistical_analysis_method.py", line 23, in call
    cpdb_statistical_analysis_complex_method.call(meta.copy(),
  File "/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages/cellphonedb/src/core/methods/cpdb_statistical_analysis_complex_method.py", line 64, in call
    raise NoInteractionsFound()
cellphonedb.src.core.exceptions.NoInteractionsFound.NoInteractionsFound: No CellPhoneDB interacions found in this input.

Any help would be very much appreciated!

Thank you!

mfrisoli126 commented 3 years ago

Based on the threads from other people with similar problems, I have made sure that the cell ID names are identical between both input txt files, and I am running the cellphonedb command from the same folder. I put my experimental data txt files in the same folder as the test txt files.
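
For anyone wanting to run the same check programmatically, here is a minimal sketch. It assumes both files are tab-separated, that the meta file has a header row with cell IDs in its first column, and that the counts file lists the cell IDs in its header row after a leading label for the gene column. The function names are hypothetical and are not part of CellPhoneDB.

```python
import csv
import io


def read_meta_cells(meta_file):
    """Return the cell IDs found in the first column of a tab-separated meta file."""
    reader = csv.reader(meta_file, delimiter="\t")
    next(reader)  # skip the header row (assumed to be present)
    return {row[0] for row in reader if row}


def read_counts_cells(counts_file):
    """Return the cell IDs found in the header row of a tab-separated counts file."""
    reader = csv.reader(counts_file, delimiter="\t")
    header = next(reader)
    return set(header[1:])  # assumption: the first header cell labels the gene column


def compare_cells(meta_file, counts_file):
    """Report cell IDs that appear in one file but not in the other."""
    meta_cells = read_meta_cells(meta_file)
    counts_cells = read_counts_cells(counts_file)
    return {
        "only_in_meta": sorted(meta_cells - counts_cells),
        "only_in_counts": sorted(counts_cells - meta_cells),
    }


# Tiny in-memory example; with real data, pass open("metadata.txt") and
# open("counts_by_entrezid.txt") instead of the StringIO objects.
meta_demo = io.StringIO("Cell\tcell_type\ncell_1\tT\ncell_2\tB\n")
counts_demo = io.StringIO("Gene\tcell_1\tcell_3\nENSG00000006638\t1\t0\n")
mismatches = compare_cells(meta_demo, counts_demo)
print(mismatches)
```

Two empty lists would mean the cell IDs agree exactly, in which case the problem lies elsewhere.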

prete commented 3 years ago

Hi @mfrisoli126, thank you for using CellPhoneDB. One quick test you can try is to drop the second column, Lesion, from your meta file and leave only two columns: Cells and cell_type.

When reading the meta file, CellPhoneDB does not look at the column names; it uses positional indexing. The first column is expected to hold the cells (matching the column headers in counts) and the second column is expected to hold the clustering ("cell_type").

If you want to keep your extra column, move it to the right of cell_type. The final order should be something like: Cells|cell_type|Lesion.
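
As a rough illustration of that reordering, the sketch below rewrites a tab-separated meta file so that the cell and cluster columns come first and any other columns follow. It assumes the column names Cells and cell_type mentioned above; `reorder_meta` is a hypothetical helper, not something CellPhoneDB provides.

```python
import csv
import io


def reorder_meta(src, dst, first="Cells", second="cell_type"):
    """Copy a tab-separated meta file, placing `first` and `second` as columns 1 and 2."""
    reader = csv.DictReader(src, delimiter="\t")
    for required in (first, second):
        if required not in reader.fieldnames:
            raise KeyError(f"column {required!r} not found in meta header")
    # Every remaining column keeps its relative order, to the right of the cluster column.
    others = [name for name in reader.fieldnames if name not in (first, second)]
    fieldnames = [first, second] + others
    writer = csv.DictWriter(dst, fieldnames=fieldnames, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return fieldnames


# In-memory example; with real data, use open("metadata.txt") for src and
# open("metadata_reordered.txt", "w", newline="") for dst (hypothetical file name).
src_demo = io.StringIO("Cells\tLesion\tcell_type\ncell_1\tL1\tT\n")
dst_demo = io.StringIO()
new_order = reorder_meta(src_demo, dst_demo)
reordered_text = dst_demo.getvalue()
print(new_order)
```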

Let us know if that works for you.

mfrisoli126 commented 3 years ago

Thanks so much @prete for responding so quickly! I really appreciate it!

I removed my Lesion column, but I still get the same error:

(cpdb) (base) michaelfrisoli@Michaels-MacBook-Pro-2 Single_Cell_Analysis % cellphonedb method statistical_analysis metadata.txt counts_by_entrezid.txt 
/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.cluster.k_means_ module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.cluster. Anything that cannot be imported from sklearn.cluster is now part of the private API.
  warnings.warn(message, FutureWarning)
[ ][APP][14/09/21-20:04:11][WARNING] Latest local available version is `v2.0.0`, using it
[ ][APP][14/09/21-20:04:11][WARNING] User selected downloaded database `v2.0.0` is available, using it
[ ][CORE][14/09/21-20:04:11][INFO] Initializing SqlAlchemy CellPhoneDB Core
[ ][CORE][14/09/21-20:04:11][INFO] Using custom database at /Users/michaelfrisoli/.cpdb/releases/v2.0.0/cellphone.db
[ ][APP][14/09/21-20:04:11][INFO] Launching Method cpdb_statistical_analysis_local_method_launcher
[ ][APP][14/09/21-20:04:11][INFO] Launching Method _set_paths
[ ][APP][14/09/21-20:04:11][WARNING] Output directory (/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/out) exist and is not empty. Result can overwrite old results
[ ][APP][14/09/21-20:04:11][INFO] Launching Method _load_meta_counts
[ ][CORE][14/09/21-20:04:53][INFO] Launching Method cpdb_statistical_analysis_launcher
[ ][CORE][14/09/21-20:04:53][INFO] Launching Method _counts_validations
[ ][CORE][14/09/21-20:05:10][INFO] [Cluster Statistical Analysis] Threshold:0.1 Iterations:1000 Debug-seed:-1 Threads:4 Precision:3
[ ][APP][14/09/21-20:05:11][ERROR] Unexpected error
Traceback (most recent call last):
  File "/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages/cellphonedb/src/api_endpoints/terminal_api/method_terminal_api_endpoints/method_terminal_commands.py", line 127, in statistical_analysis
    LocalMethodLauncher(cpdb_app.create_app(verbose, database)). \
  File "/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages/cellphonedb/src/local_launchers/local_method_launcher.py", line 54, in cpdb_statistical_analysis_local_method_launcher
    self.cellphonedb_app.method.cpdb_statistical_analysis_launcher(
  File "/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages/cellphonedb/src/core/methods/method_launcher.py", line 63, in cpdb_statistical_analysis_launcher
    cpdb_statistical_analysis_method.call(meta,
  File "/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages/cellphonedb/src/core/methods/cpdb_statistical_analysis_method.py", line 23, in call
    cpdb_statistical_analysis_complex_method.call(meta.copy(),
  File "/Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages/cellphonedb/src/core/methods/cpdb_statistical_analysis_complex_method.py", line 64, in call
    raise NoInteractionsFound()
cellphonedb.src.core.exceptions.NoInteractionsFound.NoInteractionsFound: No CellPhoneDB interacions found in this input.

Any other ideas? I'm fairly experienced coding in R, but I am completely new to Python, so I'm not sure how to debug this on my own.

prete commented 3 years ago

I assume you're using a recent version of CellPhoneDB, but just in case, could you run pip show cellphonedb and check whether it shows v2.1.7?

Maybe none of your genes are part of CellPhoneDB. That would be most unusual, but possible I guess. Is there any chance you could share your meta and counts files with us?
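
One way to test that hypothesis is to measure how many gene IDs in the counts file also appear in a reference list. The sketch below is only illustrative: `reference_ids` stands in for a set of gene IDs you have exported from the CellPhoneDB database by some other means, and `overlap_report` is a hypothetical helper.

```python
def overlap_report(input_ids, reference_ids):
    """Summarize how many of the input gene IDs are present in the reference set."""
    input_set = set(input_ids)
    matched = input_set & set(reference_ids)
    # Guard against an empty input so the fraction is always defined.
    fraction = len(matched) / len(input_set) if input_set else 0.0
    return {
        "n_input": len(input_set),
        "n_matched": len(matched),
        "fraction": fraction,
        "matched": sorted(matched),
    }


# Made-up example: two Ensembl-style IDs and two numeric IDs in the counts file.
counts_ids = ["ENSG00000006638", "ENSG00000283951", "1234", "5678"]
reference_ids = {"ENSG00000006638", "ENSG00000283951", "ENSG00000000003"}
report = overlap_report(counts_ids, reference_ids)
print(report["n_matched"], "of", report["n_input"], "input IDs found")
```

A very low matched fraction suggests the IDs are in a different naming scheme than the reference, rather than the genes being truly absent.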

mfrisoli126 commented 3 years ago

Yup, running v2.1.7 of CellPhoneDB:

(cpdb) (base) michaelfrisoli@Michaels-MacBook-Pro-2 Single_Cell_Analysis % pip show cellphonedb
Name: CellPhoneDB
Version: 2.1.7
Summary: Inferring cell-cell communication
Home-page: https://cellphonedb.org
Author: TeichLab
Author-email: contact@cellphonedb.org
License: MIT
Location: /Users/michaelfrisoli/PycharmProjects/Single_Cell_Analysis/cpdb/lib/python3.8/site-packages
Requires: Flask-RESTful, geosketch, scikit-learn, Flask-Testing, h5py, numpy, pandas, pika, SQLAlchemy, requests, rpy2, anndata, tqdm, flask, click, PyYAML, boto3
Required-by: 

I suppose I may have made an error in the process of converting gene symbols to ENTREZID numbers too... Let me know what you think.

Thank you again for your help with this! I'm a PhD student and I greatly appreciate it. I loved everything that I read about CellPhoneDB, so I'm determined to find a way to use it for my analysis.

prete commented 3 years ago

I had a quick look at the data: of the ~1500 genes in your counts file, only 2 (ENSG00000283951 and ENSG00000006638) are in the CellPhoneDB database. As far as I know, Ensembl and Entrez are both gene databases, but they use different IDs, and CellPhoneDB expects Ensembl stable IDs.
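
A quick way to spot this kind of mix-up is to check what the gene IDs look like before running the analysis. The sketch below assumes human data, where Ensembl stable gene IDs have the form ENSG followed by 11 digits, and treats purely numeric IDs as Entrez-like. The helper names are hypothetical, and the pattern check is only a heuristic, not a lookup against either database.

```python
import re
from collections import Counter

# Human Ensembl stable gene IDs: "ENSG" followed by 11 digits (no version suffix).
ENSEMBL_HUMAN_GENE = re.compile(r"^ENSG\d{11}$")
# Entrez Gene IDs are plain integers, so any all-digit string is flagged as Entrez-like.
ENTREZ_LIKE = re.compile(r"^\d+$")


def classify_gene_id(gene_id):
    """Guess the naming scheme of a single gene identifier from its shape."""
    gene_id = gene_id.strip()
    if ENSEMBL_HUMAN_GENE.match(gene_id):
        return "ensembl"
    if ENTREZ_LIKE.match(gene_id):
        return "entrez_like"
    return "other"  # e.g. gene names or HGNC symbols


def summarize_gene_ids(gene_ids):
    """Count how many identifiers fall into each guessed naming scheme."""
    return Counter(classify_gene_id(g) for g in gene_ids)


summary = summarize_gene_ids(["ENSG00000006638", "7157", "TP53", "ENSG00000283951"])
print(dict(summary))
```

If most IDs come back as "entrez_like" or "other", the counts file is probably not keyed by Ensembl IDs.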

You may want to have a second look at that. Alternatively, keep your original gene names or HGNC symbols and rerun using the option --counts-data=gene_name or --counts-data=hgnc_symbol, respectively.

mfrisoli126 commented 3 years ago

Oh, that was my mistake! I thought Ensembl IDs were the same as Entrez IDs. That's why I even labeled the dataframe as "counts_by_entrezid". I'll convert my symbols to Ensembl IDs and hopefully that will fix everything.

Thanks very much for taking a look at this and helping me!

mfrisoli126 commented 3 years ago

Yup, CellPhoneDB is working for me now that I converted to Ensembl IDs. Thanks for helping me with that silly mistake of mine. I appreciate it!