Teichlab / cellphonedb

MIT License

'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte #273

Open rachel662 opened 3 years ago

rachel662 commented 3 years ago

Hi there,

I've just tried this with an .h5ad counts file and I get this error. Is this something to do with my version of pandas? With a .txt file it works fine.

thanks Rachel

prete commented 3 years ago

Hi @rachel662, could you please provide the command you used to launch cellphonedb? Additionally, it would be useful to know what versions of anndata/pandas/cellphonedb you're using. You can check that with: pip show cellphonedb pandas anndata

rachel662 commented 3 years ago

hi there, thanks so much for your help - here are the versions i'm running

CellPhoneDB Version: 2.1.6

Name: pandas Version: 1.2.2

Name: anndata Version: 0.7.5

cheers Rachel

prete commented 3 years ago

CellPhoneDB 2.1.6 pandas 1.2.2 anndata 0.7.5

That looks about right, could you please provide the command you used to launch cellphonedb?

rachel662 commented 3 years ago

I did:

python -m venv cpdb
source cpdb/bin/activate
pip install cellphonedb
cellphonedb method analysis meta.txt adata.h5ad

(adata.h5ad is what I've called the counts file)

again, thanks for your help Rachel

prete commented 3 years ago

It may be related to the encoding of your file. Could you try with this test meta and h5ad and see if you get the same error? test_meta_and_count_h5ad.zip

That will help us rule out dependency errors and similar issues.

rachel662 commented 3 years ago

Hi there, I have tried this with the test counts h5ad and test meta file and I still get the same error

thanks Rachel

zktuong commented 3 years ago

Hi @rachel662, what is your h5py version?

rachel662 commented 3 years ago

Hi there, sorry it's taken me so long to get back to you, here's my h5py version

Name: h5py Version: 2.10.0

thanks so much! Rachel

prete commented 3 years ago

Hi @rachel662 could you try upgrading to the beta version (pip install -U CellPhoneDB==2.1.8b3) and see if you're still facing this issue?

rachel662 commented 3 years ago

Hi there I tried this, however it didn't work, I think this might be because I haven't written the metadata file as a .h5ad file though?

cheers Rachel

prete commented 3 years ago

Hi @rachel662, meta should still be a .txt/.csv/.tsv file. Is there any chance you could share your meta and counts files with us so we can have a better look at this issue?

prete commented 3 years ago

I tried this, however it didn't work, I think this might be because I haven't written the metadata file as a .h5ad file though?

Hi @rachel662 not sure what you meant by that. Did you eventually manage to get it working?

mibo1996 commented 3 years ago

Hi @prete, I am also getting the same error when I try to read in my h5 or h5ad files.

CellPhoneDB works on the command line when I use the test meta and counts data. However, when I try to use my own data, I get the error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

The command I try running when I get this error is: cellphonedb method statistical_analysis s_8944_meta_tab.txt s_8944_counts.h5ad

I get the same error when I also try running: cellphonedb method statistical_analysis s_8944_meta_tab.txt S_8944_filtered_feature_bc_matrix.h5

The meta data accounts for only a subset of the barcodes that are in the entire counts data (I am only trying to observe interactions between two clusters, so in the meta data there are only the barcodes in these two annotated clusters, but the counts data has all barcodes. I'm not sure if this matters)
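Whether a meta file that covers only a subset of the counts barcodes is a problem is not settled in this thread. A quick set comparison at least shows how the two barcode lists relate. The sketch below is illustrative only: the helper names read_meta_barcodes and compare_barcodes are hypothetical, and it assumes the meta file is tab-separated with a header row and the barcode in the first column.

```python
import csv


def read_meta_barcodes(meta_path, delimiter="\t"):
    """Return the first-column values of a delimited meta file.

    Assumes (hypothetically) one header row and barcodes in column 0.
    """
    with open(meta_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        next(reader, None)  # skip the header row
        return [row[0] for row in reader if row]


def compare_barcodes(meta_barcodes, counts_barcodes):
    """Summarise the overlap between meta and counts barcodes."""
    meta_set = set(meta_barcodes)
    counts_set = set(counts_barcodes)
    return {
        "shared": len(meta_set & counts_set),
        # Barcodes annotated in meta but absent from the counts matrix.
        "meta_only": sorted(meta_set - counts_set),
        # Barcodes present in counts but not annotated in meta.
        "counts_only": sorted(counts_set - meta_set),
    }
```

For an .h5ad counts file, the counts barcodes could be taken from anndata.read_h5ad(path).obs_names, assuming cells are stored as observations. A non-empty "meta_only" list would point at a naming mismatch (for example a differing barcode suffix) rather than at a simple subset.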

I've also tried inputting a folder ("raw_feature_bc_matrix") that contains my barcodes.tsv, features.tsv, and matrix.mtx data, however when I try running: cellphonedb method statistical_analysis s_8944_meta_tab.txt /raw_feature_bc_matrix ...

... I get the error: [ ][APP][22/07/21-14:51:18][ERROR] Can not read /raw_feature_bc_matrix

Please help with any of these issues if possible.

Thank you

prete commented 3 years ago

Hi @mibo1996, could you please confirm which versions of cellphonedb and anndata you are using (pip show cellphonedb anndata)?

I've tried reproducing this error but failed. Would you be able to share any of your input files with us for debugging?

mibo1996 commented 3 years ago

hi @prete here is the output:

WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
Name: CellPhoneDB
Version: 2.1.1
Summary: UNKNOWN
Home-page: https://cellphonedb.org
Author: TeichLab
Author-email: contact@cellphonedb.org
License: MIT
Location: /usr/local/lib/python3.7/site-packages
Requires: Flask-RESTful, Flask-Testing, SQLAlchemy, pandas, PyYAML, pika, flask, tqdm, boto3, geosketch, rpy2, click, requests
Required-by:

Yes I could share my input file with you

prete commented 3 years ago

Thank you for the fast reply. I can see you're using v2.1.1, and h5ad support was introduced in version 2.1.6. I'd first recommend you update CellPhoneDB to the latest version (2.1.7) using pip install -U cellphonedb and try running your command once again. If that also fails, we can have a look at your input file.
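As background on the error message itself: 0x89 is the first byte of the HDF5 file signature, so "can't decode byte 0x89 in position 0" is what one would expect when an HDF5 file (.h5 or .h5ad) is decoded as UTF-8 text. Whether that is what a given CellPhoneDB version does internally is an assumption here, not something confirmed in this thread. The sketch below shows a way to check a counts file independently of CellPhoneDB; the helper names looks_like_hdf5 and utf8_error_for are hypothetical.

```python
import codecs

# The 8-byte signature that begins an HDF5 file (checked at offset 0 only;
# files with a user block place it later, which this sketch ignores).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file starts with the HDF5 signature."""
    with open(path, "rb") as handle:
        return handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


def utf8_error_for(path, n_bytes=64):
    """Try to decode the first bytes of a file as UTF-8.

    Returns the decoder's error message, or None if the bytes decode.
    An incremental decoder is used so that a multi-byte character cut
    off at the end of the sample does not count as an error.
    """
    with open(path, "rb") as handle:
        head = handle.read(n_bytes)
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(head, final=False)
    except UnicodeDecodeError as exc:
        return str(exc)
    return None
```

If looks_like_hdf5 returns True for the counts file and the error still appears, the file itself is probably fine and the installed CellPhoneDB version is the more likely place to look.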