cellgeni / sceasy

A package to help convert different single-cell data formats to each other
GNU General Public License v3.0

Can't save object (assay = SCT, slot = scale.data) because of meta.features #8

Open mariafiruleva opened 4 years ago

mariafiruleva commented 4 years ago

Dear sceasy team,

I have a Seurat object that I want to convert to h5ad using assay = SCT and slot = scale.data.

sceasy:::seurat2anndata(object, outFile="object_scaled.h5ad",
                                 assay="SCT", main_layer="scale.data")

This call fails with the following error:

Error in py_call_impl(callable, dots$args, dots$keywords) : 
  ValueError: Variables annot. `var` must have number of columns of `X` (3000), but has 18224 rows.

Detailed traceback: 
  File "/nfs/home/mfiruleva/anaconda3/lib/python3.7/site-packages/anndata/base.py", line 672, in __init__
    filename=filename, filemode=filemode)
  File "/nfs/home/mfiruleva/anaconda3/lib/python3.7/site-packages/anndata/base.py", line 874, in _init_as_actual
    self._check_dimensions()
  File "/nfs/home/mfiruleva/anaconda3/lib/python3.7/site-packages/anndata/base.py", line 1944, in _check_dimensions
    .format(self._n_vars, self._var.shape[0]))
Calls: <Anonymous> -> <Anonymous> -> py_call_impl

The scale.data slot stores only the 3000 variable genes (by default):

> str(object@assays$SCT@scale.data)
 num [1:3000, 1:3011] -0.513 -0.151 2.367 -0.237 -0.108 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:3000] "B4galt6" "Robo2" "Emid1" "Dock4" ...
  ..$ : chr [1:3011] "AAACCTGAGGCCATAG" "AAACCTGAGGCGATAC" "AAACCTGCATCGGGTC" "AAACGGGGTCCCTACT"

I found that the cause is the meta.features data frame: it contains information for all 18224 genes, not just the 3000 in scale.data:

> str(object@assays$SCT@meta.features)
'data.frame':   18224 obs. of  5 variables:
 $ sct.detection_rate   : num  0.0727 0.0877 0.1388 0.0143 0.284 ...
 $ sct.gmean            : num  0.0577 0.0674 0.1161 0.0105 0.2668 ...
 $ sct.variance         : num  0.113 0.118 0.26 0.018 0.535 ...
 $ sct.residual_variance: num  0.919 0.885 1.59 1.623 1.013 ...
 $ sct.variable         : logi  FALSE FALSE TRUE TRUE FALSE FALSE ...
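
The mismatch can be checked directly before calling the converter. Below is a minimal sketch in base R using toy stand-ins: `scale_data` and `meta_features` are hypothetical small objects that mimic the shape of `object@assays$SCT@scale.data` and `object@assays$SCT@meta.features`, not the real data.

```r
# Toy stand-ins (hypothetical data, not the real Seurat object):
# scale_data mimics object@assays$SCT@scale.data (genes x cells),
# meta_features mimics object@assays$SCT@meta.features (one row per gene).
scale_data <- matrix(
  c(-0.5, 0.2, 1.1, -0.3, 0.7, 0.0),
  nrow = 3,
  dimnames = list(c("B4galt6", "Robo2", "Emid1"), c("cell1", "cell2"))
)
meta_features <- data.frame(
  sct.variable = c(FALSE, TRUE, TRUE, TRUE, FALSE),
  row.names = c("GeneA", "Emid1", "B4galt6", "Robo2", "GeneB")
)

# The error above arises when the gene annotation has a different
# number of rows than the matrix being exported.
dims_match <- nrow(meta_features) == nrow(scale_data)

# Genes annotated in meta_features but absent from scale_data.
extra_genes <- setdiff(rownames(meta_features), rownames(scale_data))

print(dims_match)
print(extra_genes)
```

On the real object, the same two checks would use the slot accessors shown above in place of the toy objects.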

If I first restrict meta.features to the genes present in scale.data, like this:

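library(dplyr)  # provides %>% and filter()

# Keep only the meta.features rows for genes present in scale.data
object@assays$SCT@meta.features <- object@assays$SCT@meta.features %>%
  tibble::rownames_to_column('gene') %>%
  filter(gene %in% rownames(object@assays$SCT@scale.data)) %>%
  tibble::column_to_rownames('gene')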

then the conversion works fine for me.
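
The same workaround can also be written in base R without dplyr or tibble. This is only a sketch under assumptions: `subset_meta_to_matrix` is a hypothetical helper name, and the toy `scale_data` and `meta_features` objects stand in for the real slots. Unlike the filter-based version, it returns the rows in the same order as the matrix; whether sceasy aligns annotations by name or by position is not something I have verified, so matching the order is simply the cautious choice.

```r
# Hypothetical helper: keep only the annotation rows whose gene names
# appear in the matrix, ordered to match the matrix rows.
subset_meta_to_matrix <- function(meta_features, mat) {
  genes <- rownames(mat)
  idx <- match(genes, rownames(meta_features))
  if (anyNA(idx)) {
    stop("Genes missing from meta.features: ",
         paste(genes[is.na(idx)], collapse = ", "))
  }
  meta_features[idx, , drop = FALSE]
}

# Toy stand-ins (hypothetical data, not the real Seurat object).
scale_data <- matrix(
  c(-0.5, 0.2, 1.1, -0.3, 0.7, 0.0),
  nrow = 3,
  dimnames = list(c("B4galt6", "Robo2", "Emid1"), c("cell1", "cell2"))
)
meta_features <- data.frame(
  sct.gmean = c(0.1, 0.2, 0.3, 0.4, 0.5),
  sct.variable = c(FALSE, TRUE, TRUE, TRUE, FALSE),
  row.names = c("GeneA", "Emid1", "B4galt6", "Robo2", "GeneB")
)

meta_subset <- subset_meta_to_matrix(meta_features, scale_data)
print(meta_subset)
```

With the real object, the call would take `object@assays$SCT@meta.features` and `object@assays$SCT@scale.data` as arguments, and the result would be assigned back to `object@assays$SCT@meta.features` before running the conversion.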

Best wishes, Maria