Open brookmay opened 2 years ago
Hi @brookmay - thank you for the bug report.
I have a couple of questions to help diagnose next steps:
Also, as you have already determined, the --backed flag is very unlikely to influence this behavior, as that flag affects the application behavior after the file is available (i.e., already downloaded and available for reading).
I have not been able to locally reproduce (tested this with a 3GB H5AD on S3, running cellxgene on my laptop). Will need a bit more info to help diagnose.
Side note: looking at the code that is likely involved (DataLocator.local_handle()), we could definitely do a better job of reporting errors if they occur; that might help diagnose this type of failure.
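To make the side note concrete, here is a minimal sketch of what more explicit error reporting around a "copy remote file to a local temp file" step could look like. This is not the actual DataLocator.local_handle() code: the function name copy_to_local_tmp, the RemoteCopyError class, and the open_remote callable are all hypothetical, and only the standard library is used.

```python
import os
import shutil
import tempfile


class RemoteCopyError(Exception):
    """Raised when a remote file cannot be fully copied to local disk (hypothetical)."""


def copy_to_local_tmp(open_remote, expected_size=None, suffix=".h5ad", tmp_dir=None):
    """Copy a remote file object into a temp file and report failures explicitly.

    open_remote is an assumed zero-argument callable returning a binary
    file-like object (for example, a thin wrapper around an S3 open call).
    """
    fd, path = tempfile.mkstemp(prefix="cellxgene_", suffix=suffix, dir=tmp_dir)
    try:
        with os.fdopen(fd, "wb") as dst, open_remote() as src:
            # Stream in 16 MiB chunks so large files are never held in memory.
            shutil.copyfileobj(src, dst, length=16 * 1024 * 1024)
        copied = os.path.getsize(path)
        if expected_size is not None and copied != expected_size:
            raise RemoteCopyError(f"short copy: got {copied} of {expected_size} bytes")
        return path
    except Exception as exc:
        # Remove the partial file so zero-byte leftovers do not pile up in the tmp dir.
        if os.path.exists(path):
            os.remove(path)
        if isinstance(exc, RemoteCopyError):
            raise
        # Preserve the underlying cause instead of a generic "file not found" message.
        raise RemoteCopyError(f"copy to {path} failed: {type(exc).__name__}: {exc}") from exc
```

The point of the sketch is only that the original exception type and message survive to the user, and that a truncated download is detected rather than handed to the H5AD reader.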
Hi @bkmartinjr,
It's weird cause I do have space on my local machine -
~$ df -h /tmp
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk1s1 466Gi 377Gi 61Gi 87% 11040905 634988320 2% /System/Volumes/Data
We're also running cellxgene on docker container via cellxgene-gateway (https://github.com/Novartis/cellxgene-gateway) which is also using cellxgene version 1.0.1.
On the container, I can see some h5ad files in /tmp.
[ec2-user@i-xyzxzyzyz ~]$ docker exec -it et9573bd1 /bin/bash
(base) [docker@et9573bd1 ~]$ ls -l /tmp/
total 12
-rw-------. 1 docker docker 0 Jul 6 21:35 cellxgene__1eukuuu.h5ad
-rw-------. 1 docker docker 0 Jul 6 21:35 cellxgene_kp9_uhqn.h5ad
-rw-------. 1 docker docker 0 Jul 7 12:20 cellxgene_nt405nm_.h5ad
-rw-------. 1 docker docker 0 Jul 6 21:35 cellxgene_rs_9h_jb.h5ad
-rwx------. 1 root root 701 Sep 15 2021 ks-script-4luisyla
-rwx------. 1 root root 671 Sep 15 2021 ks-script-o23i7rc2
-rwx------. 1 root root 291 Sep 15 2021 ks-script-x6ei4wuu
And the /tmp directory has space too -
(base) [docker@et9573bd1 ~]$ df -h /tmp/
Filesystem Size Used Avail Use% Mounted on
overlay 42G 13G 27G 33% /
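For anyone repeating this check, the two observations above (free space in the tmp directory, and leftover h5ad files that are all 0 bytes) can be gathered in one go from Python. This is just a sketch using the standard library; the tmp_report name is made up, and the cellxgene_*.h5ad pattern is an assumption inferred from the file names in the listing above.

```python
import glob
import os
import shutil
import tempfile


def tmp_report(tmp_dir=None, pattern="cellxgene_*.h5ad"):
    """Summarize free space and leftover temp files in tmp_dir.

    The default pattern is an assumption based on the names seen in /tmp.
    """
    tmp_dir = tmp_dir or tempfile.gettempdir()
    usage = shutil.disk_usage(tmp_dir)
    files = sorted(glob.glob(os.path.join(tmp_dir, pattern)))
    # Zero-byte files suggest the copy started but no data was ever written.
    empty = [f for f in files if os.path.getsize(f) == 0]
    return {
        "free_bytes": usage.free,
        "matching_files": files,
        "zero_byte_files": empty,
    }
```

Calling tmp_report("/tmp") inside the container would list the same zero-byte files as the ls output, alongside the free space in bytes.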
@bkmartinjr Do you have any publicly readable s3 files that I can try out?
Hi @brookmay, we're actively investigating this and we have two questions that could help us:
cellxgene in Docker?
Hi @ebezzi, to answer your questions -
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O
1abc container1 0.01% 58.27MiB / 61.98GiB 0.09% 2.49GB / 416MB 0B / 0B
2xyz container2 0.01% 61.74MiB / 10GiB 0.60% 6.82GB / 471MB 0B / 0B
I tried to launch cellxgene using an s3 url on both containers, and both failed.
Do you have any publicly readable s3 files that I can try out?
I have temporarily put a very large (4.8GB, 1M+ cell) H5AD here: s3://czi.bruce-public/tmp/be48f323-749f-4ac4-b95e-51831778eca1.h5ad
Please let me know the results of your test, so that I can delete it when you are finished.
I have confirmed it works fine when launched from my laptop (albeit slowly, as it had to download):
$ python --version
Python 3.9.7
$ cellxgene --version
[cellxgene] Version 1.0.1
$ cellxgene launch --verbose s3://czi.bruce-public/tmp/be48f323-749f-4ac4-b95e-51831778eca1.h5ad
[cellxgene] Starting the CLI...
[cellxgene] Loading data from be48f323-749f-4ac4-b95e-51831778eca1.h5ad.
[cellxgene] Warning: Anndata data matrix is sparse, but not a CSC (columnar) matrix. Performance may be improved by using CSC.
[cellxgene] Warning: Obs annotation 'sample' has 1001288 categories, this may be cumbersome or slow to display. We recommend setting the --max-category-items option to 500, this will hide categorical annotations with more than 500 categories in the UI
[cellxgene] Warning: Var annotation 'feature_name' has 46483 categories, this may be cumbersome or slow to display. We recommend setting the --max-category-items option to 500, this will hide categorical annotations with more than 500 categories in the UI
WARNING:root:Type float64 will be converted to 32 bit float and may lose precision.
WARNING:root:Type float64 will be converted to 32 bit float and may lose precision.
WARNING:root:Type float64 will be converted to 32 bit float and may lose precision.
[cellxgene] CAUTION: due to the size of your dataset, running differential expression may take longer or fail.
[cellxgene] Launching! Please go to http://localhost:5005 in your browser.
[cellxgene] Type CTRL-C at any time to exit.
If this doesn't work, I suspect @ebezzi will need to provide you with an instrumented version to test. Based on the info above, it appears to be failing during the S3 download (before cellxgene tries to open the file, it copies it to the tmp directory).
Could you also provide us with the output of pip list and pip --version so that we can see the package versions you use to run cellxgene?
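If running pip inside the container is awkward, a rough equivalent for a handful of packages can be collected from the same Python interpreter that runs cellxgene. This is a sketch using the standard library's importlib.metadata; the package_versions helper is hypothetical, and which packages are relevant here is a guess.

```python
from importlib import metadata


def package_versions(names):
    """Return {distribution name: installed version, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of packages that might matter for S3 access; adjust as needed.
    for pkg, ver in package_versions(["cellxgene", "anndata", "h5py", "fsspec", "s3fs"]).items():
        print(f"{pkg}: {ver}")
```

The full pip list output is still more useful, since it shows every installed package rather than a guessed subset.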
@brookmay I have prepared a version of cellxgene with additional logging that will hopefully help debug your issue. If you can reach out to me at ebezzi@chanzuckerberg.com, I will send you the package.
Describe the bug
Cellxgene fails to launch s3 datasets (.h5ad files) that are larger than 1 GB.
To Reproduce
Steps to reproduce the behavior: I'm using cellxgene version 1.0.1. We have .h5ad files ranging from 300 MB to 6-7 GB each. I experience no issues launching the files that are < 1 GB in size from s3, but for some reason, all files > 1 GB throw "Error: File not found or is inaccessible. File must be an .h5ad object. Please check your input and try again." I have tried the parameter --backed, but it still fails.
For example, below is the cellxgene command and output I get for a file about 4.5 GB in size -
Note: These files launch fine if they're first locally downloaded, but we want to be able to launch cellxgene using s3 urls for our project.
Version (please complete the following information):