mojaveazure / seurat-disk

Interfaces for HDF5-based Single Cell File Formats
https://mojaveazure.github.io/seurat-disk
GNU General Public License v3.0

h5ad->Seurat:Not a sparse matrix #32

Open AlexStewart25 opened 3 years ago

AlexStewart25 commented 3 years ago

When using the Convert function on the Villani dataset from https://www.covid19cellatlas.org/, I get the following error:

Error: Not a sparse matrix

Convert still generates an output file, but LoadH5Seurat can't read it:

Validating h5Seurat file
Error in h5attr(x = self[["reductions"]][[reduc]], which = "active.assay") : Attribute does not exist

Any help much appreciated

flde commented 2 years ago

I have the same problem (R 4.1). I want to convert the output from scvi to Seurat. Convert gets through everything else but fails with the error above when it reaches this step:

Adding _scvi_extra_categoricals as cell embeddings for _scvi_extra_categoricals

Hindrance commented 2 years ago

Hi there,

I found a workaround. I think the obsm key "_scvi_extra_categoricals" is analogous to my "_scvi_extra_continuous". As far as I'm aware, these are just the extracted metadata that were used in the scVI integration.

My error:

Adding _scvi_extra_continuous as cell embeddings for _scvi_extra_continuous
Error: Not a sparse matrix

I solved this (partially) by first making the matrix sparse in Python:

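import scvi
import scanpy as sc
import scipy.sparse  # import the submodule explicitly so scipy.sparse is available

# adata is the AnnData object from the scVI workflow, already in memory
# Replace the dense obsm entry with a SciPy CSR sparse matrix
adata.obsm['_scvi_extra_continuous'] = scipy.sparse.csr_matrix(adata.obsm['_scvi_extra_continuous'])

adata.write_h5ad("..test.h5ad")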

I then converted the file in R with Convert(), but a new error appeared when reading in the h5Seurat file:

seuratObject <- LoadH5Seurat("..test.h5seurat")
...
Adding cell embeddings for _scvi_extra_continuous
Warning: Keys should be one or more alphanumeric characters followed by an underscore, setting key from _scvi_extra_continuous_ to scviextracontinuous_
Warning: All keys should be one or more alphanumeric characters followed by an underscore '_', setting key to scviextracontinuous_
Error in validObject(.Object) : 
  invalid class “DimReduc” object: invalid object for slot "cell.embeddings" in class "DimReduc": got class "dgCMatrix", should be or extend class "matrix"

So I decided to remove the pesky obsm key in Python instead:

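import scvi
import scanpy as sc

# adata is the AnnData object from the scVI workflow, already in memory
# Drop the problematic obsm entry entirely before writing the file
del adata.obsm['_scvi_extra_continuous']
adata.write_h5ad("..test.h5ad")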

I converted again in R with Convert(), which resulted in a Seurat object that could be read successfully. There were a couple of warnings about key relabelling after that, but that was normal.

Assuming that the "_scvi_extra_continuous" or "_scvi_extra_categoricals" data frames are redundant for downstream calculations, it's probably OK to remove them... right?
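
If you have several of these keys, the removal step can be generalized. This is only a minimal sketch and not part of seurat-disk or scvi: drop_obsm_keys is a hypothetical helper, and the "X_scVI" key and the demo dictionary are illustrative stand-ins for adata.obsm so that the snippet runs without AnnData installed. It assumes adata.obsm behaves like an ordinary mutable mapping (it supports keys() and del), which is how it is used in the snippets above.

```python
from collections.abc import MutableMapping
from typing import List


def drop_obsm_keys(obsm: MutableMapping, prefix: str = "_scvi_extra_") -> List[str]:
    """Delete every entry whose key starts with `prefix` and return the removed keys.

    Hypothetical helper: `obsm` is expected to be adata.obsm or any dict-like object.
    """
    # Collect the matching keys first so the mapping is not modified while iterating
    removed = [key for key in list(obsm.keys()) if key.startswith(prefix)]
    for key in removed:
        del obsm[key]
    return removed


# Stand-in for adata.obsm (assumption: the real object exposes the same mapping API)
demo_obsm = {
    "X_scVI": [[0.1, 0.2]],              # illustrative embedding that should be kept
    "_scvi_extra_continuous": [[1.0]],   # key from the error above
    "_scvi_extra_categoricals": [[0]],   # key from the error above
}

removed = drop_obsm_keys(demo_obsm)
print("removed:", removed)
print("kept:", list(demo_obsm))
```

With a real object the call would be drop_obsm_keys(adata.obsm), followed by adata.write_h5ad(...) and Convert() in R as before. The helper discards the data from the written file, so keep the original h5ad if those values might be needed later.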

Best of luck,