mojaveazure / loomR

An R-based interface for loom files

Seurat to LoomR: Error in attributes[[i]] : subscript out of bounds #36

Open RobertAlpin opened 5 years ago

RobertAlpin commented 5 years ago

I'm not sure whether this is a Seurat problem or a loomR problem, but when I try to convert my Seurat object to a loom file, I get the following output:


Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

Loading required package: R6
Loading required package: hdf5r

Attaching package: 'loomR'

The following object is masked from 'package:dplyr':

    combine

Read matrix
added names
turned sparse
             used    (Mb) gc trigger    (Mb)   max used    (Mb)
Ncells   11256948   601.2   17353204   926.8   11269653   601.9
Vcells 1961739302 14966.9 5414633012 41310.4 5008710673 38213.5
removed the old matrix
An object of class Seurat 
26183 features across 2058652 samples within 1 assay 
Active assay: RNA (26183 features)
made object
Performing log-normalization
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
normalized data
Transposing input data: loom file will show input columns (cells) as rows and input rows (features) as columns
This is to maintain compatibility with other loom tools

[Lots of loading]                  
  |======================================================================| 100%
Adding: CellID
Adding: Gene
Adding a layer to norm_data (layer 1 of 1)

[Lots of loading]         
  |======================================================================| 100%
Error in attributes[[i]] : subscript out of bounds
Calls: as.loom ... as.loom.Seurat -> <Anonymous> -> <Anonymous> -> <Anonymous>

I suspected at first this might have been caused by create.names putting underscores in gene names (which Seurat should automatically be able to compensate for), but using gsub to remove those before creating a Seurat object didn't help. Does anyone know why I might be getting this error?

R is version 3.5.3. loomR and Seurat are both the latest development versions, and hdf5r is the most recent full release.
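For reference, the underscore cleanup described above can be sketched in base R as follows. The gene names are made up for illustration, and replacing underscores with dashes mirrors what Seurat is understood to do itself when it meets underscores in feature names; this did not resolve the error in this case.

```r
# Hypothetical gene names; real ones would come from rownames() of the matrix.
genes <- c("ABC_1", "XYZ", "MT_CO1")

# Replace every underscore with a dash. fixed = TRUE treats "_" literally.
clean <- gsub("_", "-", genes, fixed = TRUE)

# Renaming can create duplicate names, so check before building the object.
stopifnot(anyDuplicated(clean) == 0)

print(clean)
```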

cakirb commented 5 years ago

I get the same error when I try to save a Seurat object as a loom file, but I don't get any message saying "Execution halted". I also checked whether it is caused by non-unique gene symbols or cell names, but I get the same error after fixing those too. Additionally, with a different dataset, I get an error when reading a Seurat-created loom file in scanpy: https://github.com/theislab/scanpy/issues/598

RobertAlpin commented 5 years ago

Apologies, the "Execution halted" line is an artifact of the server I'm running this on. I'll edit the post to remove it to avoid future confusion.

cakirb commented 5 years ago

@RobertAlpin I was able to save the loom file successfully after running the Seurat workflow.

mojaveazure commented 5 years ago

Hi @RobertAlpin,

Sorry for the delay; could you provide the object, or a downsampled version of it, that I could test this on? I haven't seen this before, so I don't know where to begin.

RobertAlpin commented 5 years ago

I've cut down my expression matrix to the first 20 cells (or at least I hope I did; removing columns in R has been acting funny on me). The Seurat object was made from the larger version of this file.

cut_down_matrix.zip
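For anyone preparing a similar downsampled test case, here is a minimal base R sketch of keeping the first 20 cells. It assumes genes are in rows and cells in columns; the toy matrix and the names expr and small are placeholders, not the attached data.

```r
# Toy stand-in for an expression matrix: 5 genes (rows) by 30 cells (columns).
set.seed(1)
expr <- matrix(
  rpois(5 * 30, lambda = 2),
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:30))
)

# Select the first 20 columns rather than removing the rest by negative index.
# drop = FALSE keeps the result a matrix even if only one column is selected.
n_keep <- min(20, ncol(expr))
small <- expr[, seq_len(n_keep), drop = FALSE]

# Confirm the shape before saving or sharing the file.
print(dim(small))
```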

BenjaminDoran commented 5 years ago

I did some troubleshooting, and it seems the srobj$RNA@meta.features table needs to have columns before the conversion will work. I tested converting to loom at each stage of the standard workflow:

pbmc <- NormalizeData(object = pbmc)
pbmc <- FindVariableFeatures(object = pbmc) # works with only this step done
pbmc <- ScaleData(object = pbmc)
pbmc <- RunPCA(object = pbmc)
pbmc <- FindNeighbors(object = pbmc)
pbmc <- FindClusters(object = pbmc)
pbmc <- RunTSNE(object = pbmc)
DimPlot(object = pbmc, reduction = "tsne")

And it works as soon as the variable features are selected.

The output then reads:

Transposing input data: loom file will show input columns (cells) as rows and input rows (features) as columns
This is to maintain compatibility with other loom tools
  |======================================================================| 100%
Adding: CellID
Adding: Gene
Adding a layer to norm_data (layer 1 of 1)
  |======================================================================| 100%
Adding: vst_mean
Adding: vst_variance
Adding: vst_variance_expected
Adding: vst_variance_standardized
Adding: vst_variable
Adding: Selected
Adding: orig_ident
Adding: nCount_RNA
Adding: nFeature_RNA
Adding: ClusterID
Adding: ClusterName
No scaled data present, not adding scaled data, dimensional reduction information, or neighbor graphs

MichaelPeibo commented 5 years ago

Thanks! It works! @BenjaminDoran

hmassalha commented 5 years ago

Hi @BenjaminDoran, would you please give an example of how to change srobj$RNA@meta.features to be a table? What kind of information do you add? (I am using Seurat 3.)

Thanks, HM

BenjaminDoran commented 5 years ago

Just do:

srobj <- FindVariableFeatures(object = srobj)

Seurat's function adds the data
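As a rough sketch of that advice, a small guard could run FindVariableFeatures only when the feature-level table has no columns. This assumes Seurat v3, where the table lives in the assay's meta.features slot; the helper name ensure_feature_metadata is hypothetical and not part of Seurat or loomR.

```r
# Hypothetical helper, assuming Seurat v3 is installed. Calls are namespaced,
# so nothing is attached here.
ensure_feature_metadata <- function(srobj, assay = "RNA") {
  # Feature-level metadata table of the chosen assay (a data frame in Seurat v3).
  meta <- methods::slot(Seurat::GetAssay(object = srobj, assay = assay), "meta.features")

  # With no columns present, let Seurat populate the table via its own function.
  if (ncol(meta) == 0) {
    srobj <- Seurat::FindVariableFeatures(object = srobj, assay = assay)
  }
  srobj
}

# Usage (the file name is a placeholder):
# srobj <- ensure_feature_metadata(srobj)
# lfile <- Seurat::as.loom(x = srobj, filename = "srobj.loom")
# lfile$close_all()
```

Objects that already have feature metadata pass through unchanged, so the guard does not recompute anything for them.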

hmassalha commented 5 years ago

Thanks, it's working for me. HM

ArcusGears commented 4 years ago

Out of curiosity, is there a way to get the as.loom() function to run without normalizing and finding the variable features in Seurat? I was hoping to export the .loom file to test it out in Scanpy, and I would prefer to normalize the data with Scanpy's own function so that differences between the two tools don't cause errors.

If not I can try doing so in Seurat and seeing what happens.

swokybio commented 4 years ago

@BenjaminDoran @cakirb

Can you please provide more information on how you fixed the issue?

I am having the same problem: I get an error when trying to convert an integrated Seurat object into a loom file.

I am using the code below to create the loom file:

singlecell.combined.loom <- as.loom(singlecell.combined, filename = "/PATH/singlecell.combined.loom", verbose = FALSE)

The output reads:

Transposing input data: loom file will show input columns (cells) as rows and input rows (features) as columns
This is to maintain compatibility with other loom tools
Adding: CellID
Adding: Gene
Error in attributes[[i]] : subscript out of bounds

dagarfield commented 4 years ago

@JoaoGabrielMoraes, it seems to be as simple as needing something in the VariableFeatures slot. When as.loom does the conversion, it crashes if nothing is found in that slot. So just run FindVariableFeatures and you're good to go. You could probably also just put a single gene name in there if you wanted.
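To show where a message like this comes from in general, here is a base R illustration. It is not loomR's code; the empty list attrs merely stands in for a set of feature attributes with nothing in it.

```r
# Illustration only: an empty list standing in for missing feature attributes.
attrs <- list()

# Asking for the first element of an empty list raises an error;
# tryCatch captures its message instead of stopping the script.
msg <- tryCatch(
  attrs[[1]],
  error = function(e) conditionMessage(e)
)
print(msg)

# Once the list holds at least one entry, the same lookup succeeds.
attrs <- list(vst_mean = c(0.1, 0.2))
first <- attrs[[1]]
print(first)
```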

micdonato commented 4 years ago

I have the same concern as @ArcusGears. I would like to export the raw data and have it normalized by Scanpy. However, I suspect that Scanpy will use the raw data from the loom file, and that "finding the variable features" is just a way to put something in there so that the loom conversion does not crash.

Am I understanding correctly?