Teichlab / cellphonedb

MIT License

Invalid Counts data #306

Closed YuliaInn closed 3 years ago

YuliaInn commented 3 years ago

I have normalized my data in Seurat using the "RC" method. Here is a subset of my count data. I also converted all the gene names to Ensembl IDs.

My command is cellphonedb method statistical_analysis metadata_LEFT.txt countsLEFT.txt --threads=6

I get the error "Invalid Counts data". (Attachment: sample_1.txt)

prete commented 3 years ago

Hi @YuliaInn your sample looks OK. Questions that come to mind are:

YuliaInn commented 3 years ago

I used the cellphone environment that somebody posted here in the issues.

The content of environment.yml is:

name: cellphone dependencies:

My meta file is attached, and its cell names look like they match the cell names in the counts file.

I am running this on an HPC cluster, where I usually get an out-of-memory error when memory is the problem. With cellphonedb, however, I only get the "Invalid Counts data" error (a screenshot of the error is attached).

Thank you!

Attachments: Screen Shot 2021-05-05 at 11 38 35 AM, cellphonedb_metaLEFT.txt

prete commented 3 years ago

Both meta and counts look fine. Actually, meta has some extra columns (it should only have two) but that shouldn't be an issue.

I can suggest that you re-try this with only a fraction of your counts file (about 30%) to see if it works properly. Should that go well, then it's probably a memory issue and you need to request more when you submit a job to your HPC (don't know how much really, but start with something like 30GB RAM?)
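One way to take such a fraction could look like the sketch below. It is only an illustration, not part of CellPhoneDB: the helper name `subset_cells` is hypothetical, and it assumes the counts table has genes as rows and cells as columns, while the meta table is indexed by cell name.

```python
import pandas as pd


def subset_cells(counts, meta, frac=0.3, seed=0):
    """Keep the same random fraction of cells in both tables.

    Assumes (not confirmed by the thread): counts has one column per cell,
    and meta has one row per cell with the cell name as its index.
    """
    # Sample cells from the meta table, then keep the matching counts columns.
    sampled_meta = meta.sample(frac=frac, random_state=seed)
    sampled_counts = counts.loc[:, sampled_meta.index]
    return sampled_counts, sampled_meta


# Hypothetical usage with tab-separated files whose first column holds the
# gene IDs (counts) or the cell names (meta):
# counts = pd.read_csv("countsLEFT.txt", sep="\t", index_col=0)
# meta = pd.read_csv("metadata_LEFT.txt", sep="\t", index_col=0)
# small_counts, small_meta = subset_cells(counts, meta, frac=0.3)
# small_counts.to_csv("counts_subset.txt", sep="\t")
# small_meta.to_csv("meta_subset.txt", sep="\t")
```

Sampling from the meta table first keeps the two files consistent, so the subset run exercises the same cell-name matching as the full run.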

Also, could you please check the versions of both cellphonedb and pandas in your environment (pip show cellphonedb pandas)?

YuliaInn commented 3 years ago

I tried to run cellphonedb on just 500 cells and still got the same error. Here are the cellphonedb and pandas info:

Name: CellPhoneDB
Version: 2.1.4
Summary: UNKNOWN
Home-page: https://cellphonedb.org
Author: TeichLab
Author-email: contact@cellphonedb.org
License: MIT
Location: /storage/hpc/data/iid49/miniconda/envs/cellphone/lib/python3.7/site-packages
Requires: click, flask, requests, Flask-Testing, boto3, geosketch, PyYAML, Flask-RESTful, rpy2, cython, tqdm, pika, SQLAlchemy, pandas
Required-by:

Name: pandas
Version: 0.23.4
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: http://pandas.pydata.org
Author: None
Author-email: None
License: BSD
Location: /storage/hpc/data/iid49/miniconda/envs/cellphone/lib/python3.7/site-packages
Requires: python-dateutil, pytz, numpy
Required-by: CellPhoneDB

prete commented 3 years ago

Looks like you're using an old CellPhoneDB version. Could you update to v2.1.7 (pip install -U CellPhoneDB) and try again?

YuliaInn commented 3 years ago

Thank you for your prompt responses. I found an extra column in my normalized counts data and deleted it. Now it works.
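For anyone hitting the same error, a quick comparison of the counts columns against the meta cell names may surface a stray column like this one. The sketch below is not CellPhoneDB's own validation: the helper name `compare_cells` is hypothetical, and it assumes the counts table has one column per cell while the meta table is indexed by cell name.

```python
import pandas as pd


def compare_cells(counts, meta):
    """Return (columns only in counts, cell names only in meta), both sorted.

    Assumes (not confirmed by the thread): counts has one column per cell,
    and meta has one row per cell with the cell name as its index.
    """
    counts_cells = set(counts.columns)
    meta_cells = set(meta.index)
    only_in_counts = sorted(counts_cells - meta_cells)
    only_in_meta = sorted(meta_cells - counts_cells)
    return only_in_counts, only_in_meta


# Hypothetical usage with tab-separated files whose first column holds the
# gene IDs (counts) or the cell names (meta):
# counts = pd.read_csv("countsLEFT.txt", sep="\t", index_col=0)
# meta = pd.read_csv("metadata_LEFT.txt", sep="\t", index_col=0)
# extra, missing = compare_cells(counts, meta)
# print("Columns in counts with no meta entry:", extra)
# print("Cells in meta with no counts column:", missing)
```

An extra column in the counts file would show up in the first list, since it has no matching entry in the meta file.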