GregorySchwartz / too-many-cells

Cluster single cells and analyze cell clade relationships with colorful visualizations.
https://gregoryschwartz.github.io/too-many-cells/
GNU General Public License v3.0

Error when using matrix-path with cellranger .gz files #22

Closed mcfefa closed 4 years ago

mcfefa commented 4 years ago

I'm running this on a cluster, where our cluster admins created a Singularity image from the Docker version of too-many-cells. I was able to start a run with a .csv file as input, but ran into memory issues, so I wanted to try a small subset of just 2 samples. I tried loading the samples as shown in the workshop tutorial:

singularity run /share/data2/applications/singularity_images/too-many-cells.sif make-tree \
    --matrix-path /share/lab/me/scRNAseq/sample1/outs/filtered_feature_bc_matrix/ \
    --matrix-path /share/lab/me/scRNAseq/sample2/outs/filtered_feature_bc_matrix/ \
    --output outTest20200420 \
    > clustersTest.csv

and got the following error message:

Error in load(name, envir = .GlobalEnv) : 
  bad restore file magic number (file may be corrupted) -- no data loaded
Calls: sys.load.image -> load
In addition: Warning message:
file ‘.RData’ has magic number 'RDX3'
  Use of save versions prior to 2 is deprecated 
Execution halted
too-many-cells: readCreateProcess: R "-e" "cat(R.home())" "--quiet" "--slave" (exit 1): failed

The Cell Ranger data files are .gz files, but the documentation indicates that too-many-cells can read these. Please advise on what I should do to remedy this issue. I'm not trying to use the tool in R, but this seems like an R error.
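The trace above (`sys.load.image -> load` on the file `.RData`) is what R prints when it tries to restore a saved workspace at startup. One possible explanation, which this thread does not confirm, is that a `.RData` file written by a newer R sits in the directory the container starts in, and the older R inside the image cannot read it. A minimal sketch for checking this from the shell follows; `rdata_magic` is a hypothetical helper name, and the sketch only handles uncompressed and gzip-compressed workspace files.

```shell
# Hypothetical helper: print the 4-character magic tag of an R workspace
# file (for example RDX2 or RDX3). Not part of too-many-cells or R.
rdata_magic() {
  file="$1"
  # R's save() gzip-compresses by default; fall back to the raw bytes
  # when the file is not gzip data.
  if gzip -t "$file" 2>/dev/null; then
    gzip -dc "$file" 2>/dev/null | head -c 4
  else
    head -c 4 "$file"
  fi
}

# Only inspect the current directory if a workspace file is actually there.
if [ -f .RData ]; then
  echo "Found .RData with magic: $(rdata_magic .RData)"
fi
```

If this reports `RDX3` and the R in the image predates 3.5.0, moving the file aside (for example `mv .RData .RData.bak`) before rerunning would be one way to test the assumption.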

GregorySchwartz commented 4 years ago

Yes, too-many-cells can read matrix.mtx.gz, features.tsv.gz, and barcodes.tsv.gz files. too-many-cells uses R (in the docker image) for plotting some statistics and differential analysis after the tree is made. I suspect it has to do with how you made your singularity image. You would probably run into the same error with the csv file with that singularity image. What happens if you run the docker too-many-cells on the same data?
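To rule out the input itself, a quick pre-flight check can confirm that each `--matrix-path` directory holds the three files named above. This is only a sketch: `check_matrix_dir` is a hypothetical helper, not a too-many-cells command, and it checks for file presence only, not content.

```shell
# Hypothetical helper: succeed only if the directory contains the three
# gzipped Cell Ranger files mentioned in the comment above.
check_matrix_dir() {
  dir="$1"
  missing=0
  for f in matrix.mtx.gz features.tsv.gz barcodes.tsv.gz; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_matrix_dir /share/lab/me/scRNAseq/sample1/outs/filtered_feature_bc_matrix` returns a non-zero status and lists any file that is absent.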

mcfefa commented 4 years ago

I first tried with a csv and was able to load the data and presumably start computation but ran into memory errors, which I'm troubleshooting with the HPC people at my institution.

Docker is not permitted on HPC systems due to security concerns, so I cannot test that specifically. I could, however, use pre-built Haskell binaries. See #14.

GregorySchwartz commented 4 years ago

Can you test on a different computer?

mcfefa commented 4 years ago

No.

Can we use a newer version of R (3.5.0 or greater)? If so, we can rebuild the Singularity image with it and see if that works.

GregorySchwartz commented 4 years ago

The Dockerfile is available in the repo if you wish to test different R versions.

GregorySchwartz commented 4 years ago

@mcfefa I have packaged too-many-cells for nix; I recommend trying that out (see the documentation). It's a reproducible derivation that should take care of all dependencies, and it only requires root once, when installing nix.