satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

CreateSeuratObject function for scRNA-seq data doesn't give nCount_RNA and nFeature_RNA in the returned object #8479

Closed Aavdili closed 4 months ago

Aavdili commented 4 months ago

I am using publicly available scRNA-seq data (a folder with the 3 required files: barcodes.tsv.gz, features.tsv.gz and matrix.mtx.gz).

Code:

df <- Read10X(data.dir = "../path_to_folder/")

obj <- CreateSeuratObject(counts = df, project = "x", min.cells = 3, min.features = 200)

This returns a Seurat object that does not contain nCount_RNA and nFeature_RNA. The metadata has only the barcodes as row names and a single column, orig.ident.

str(obj)

Formal class 'Seurat' [package "SeuratObject"] with 13 slots
  ..@ assays :List of 1
  .. ..$ RNA:Formal class 'Assay5' [package "SeuratObject"] with 8 slots
  .. .. .. ..@ layers :List of 1
  .. .. .. .. ..$ counts:Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. .. .. .. .. ..@ i : int [1:3155750] 1 13 15 21 24 27 29 33 34 36 ...
  .. .. .. .. .. .. ..@ p : int [1:1071] 0 4303 7386 10028 12328 14067 18223 21687 24001 26820 ...
  .. .. .. .. .. .. ..@ Dim : int [1:2] 18057 1070
  .. .. .. .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. .. ..@ x : num [1:3155750] 1 3 1 2 1 1 2 1 2 1 ...
  .. .. .. .. .. .. ..@ factors : list()
  .. .. .. ..@ cells :Formal class 'LogMap' [package "SeuratObject"] with 1 slot
  .. .. .. .. .. ..@ .Data: logi [1:1070, 1] TRUE TRUE TRUE TRUE TRUE TRUE ...
  .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. .. .. ..$ : chr [1:1070] "AAACAAGTATCTCCCA-1" "AAACAGAGCGACTCCT-1" "AAACATTTCCCGGATT-1" "AAACGGGCGTACGGGT-1" ...
  .. .. .. .. .. .. .. ..$ : chr "counts"
  .. .. .. .. .. ..$ dim : int [1:2] 1070 1
  .. .. .. .. .. ..$ dimnames:List of 2
  .. .. .. .. .. .. ..$ : chr [1:1070] "AAACAAGTATCTCCCA-1" "AAACAGAGCGACTCCT-1" "AAACATTTCCCGGATT-1" "AAACGGGCGTACGGGT-1" ...
  .. .. .. .. .. .. ..$ : chr "counts"
  .. .. .. ..@ features :Formal class 'LogMap' [package "SeuratObject"] with 1 slot
  .. .. .. .. .. ..@ .Data: logi [1:18057, 1] TRUE TRUE TRUE TRUE TRUE TRUE ...
  .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. .. .. ..$ : chr [1:18057] "RP11-34P13.7" "FO538757.2" "AP006222.2" "RP11-206L10.9" ...
  .. .. .. .. .. .. .. ..$ : chr "counts"
  .. .. .. .. .. ..$ dim : int [1:2] 18057 1
  .. .. .. .. .. ..$ dimnames:List of 2
  .. .. .. .. .. .. ..$ : chr [1:18057] "RP11-34P13.7" "FO538757.2" "AP006222.2" "RP11-206L10.9" ...
  .. .. .. .. .. .. ..$ : chr "counts"
  .. .. .. ..@ default : int 1
  .. .. .. ..@ assay.orig: chr(0)
  .. .. .. ..@ meta.data :'data.frame': 18057 obs. of 0 variables
  .. .. .. ..@ misc :List of 1
  .. .. .. .. ..$ calcN: logi FALSE
  .. .. .. ..@ key : chr "rna_"
  ..@ meta.data :'data.frame': 1070 obs. of 1 variable:
  .. ..$ orig.ident: Factor w/ 1 level "yolk_sac": 1 1 1 1 1 1 1 1 1 1 ...
  ..@ active.assay: chr "RNA"
  ..@ active.ident: Factor w/ 1 level "yolk_sac": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..- attr(*, "names")= chr [1:1070] "AAACAAGTATCTCCCA-1" "AAACAGAGCGACTCCT-1" "AAACATTTCCCGGATT-1" "AAACGGGCGTACGGGT-1" ...
  ..@ graphs : list()
  ..@ neighbors : list()
  ..@ reductions : list()
  ..@ images : list()
  ..@ project.name: chr "yolk_sac"
  ..@ misc : list()
  ..@ version :Classes 'package_version', 'numeric_version' hidden list of 1
  .. ..$ : int [1:3] 5 0 1
  ..@ commands : list()
  ..@ tools : list()
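
For reference, the two missing columns are per-cell summaries of the counts matrix: nCount_RNA is the total count per cell, and nFeature_RNA is the number of features with a non-zero count in that cell. The following is a minimal sketch of that calculation on a made-up toy matrix, using only the Matrix package. The matrix values and the names `counts` and `qc` are illustrative assumptions, not data from this issue.

```r
library(Matrix)

# Toy counts matrix (genes x cells); the values are invented for illustration.
counts <- Matrix(
  c(1, 0, 3,
    0, 0, 2,
    5, 1, 0,
    0, 0, 0),
  nrow = 4, ncol = 3, byrow = TRUE, sparse = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3))
)

# nCount_RNA: total counts per cell (column sums).
n_count <- Matrix::colSums(counts)

# nFeature_RNA: number of genes with at least one count in each cell.
n_feature <- Matrix::colSums(counts > 0)

qc <- data.frame(nCount_RNA = n_count, nFeature_RNA = n_feature)
print(qc)
```

Comparing values computed this way against `obj@meta.data` is one way to check whether the columns are really absent or only hidden by how the metadata is being viewed.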

samuel-marsh commented 4 months ago

Hi,

I'm not a member of the dev team, but hopefully I can be helpful. Can you provide a link to the public data you are using, or a reproducible example with a matrix extracted from a SeuratData object? Could you also provide the output of sessionInfo()?
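
A minimal sketch of collecting that information, assuming the packages of interest are the ones listed in `pkgs` (this list is an illustrative choice, so adjust it to your own session). Packages that are not installed are reported as NA rather than raising an error.

```r
# Full session details: R version, platform, attached and loaded packages.
print(sessionInfo())

# Versions of a few packages relevant to this issue (illustrative list).
pkgs <- c("Seurat", "SeuratObject", "Matrix", "tidyverse")
versions <- vapply(pkgs, function(p) {
  # packageVersion() errors if the package is missing, so fall back to NA.
  tryCatch(as.character(utils::packageVersion(p)),
           error = function(e) NA_character_)
}, character(1))
print(versions)
```

Pasting both outputs into the issue makes it easier for others to spot a version mismatch.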

Thanks, Sam

Aavdili commented 4 months ago

Hi, here is the link: https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-11673

I used F158_YS_barcodes.tsv.gz, F158_YS_features.tsv.gz and F158_YS_matrix.mtx.gz, and renamed the files to remove the F158_YS prefix.

I managed to create the SeuratObject, and no error appears.

But when I check the metadata with view(obj@meta.data), I see only the barcodes as row names and one column with the orig.ident.

rownames(obj[])

[1] "MIR1302-2HG" "AL627309.1" "AL627309.5" "AP006222.2" "LINC01409" "FAM87B" "LINC01128" "LINC00115"
[9] "FAM41C" "AL645608.2" "LINC02593" "SAMD11" "NOC2L" "KLHL17" "PLEKHN1" "AL645608.7"
[17] "HES4" "ISG15" "AL645608.1" "AGRN" "AL645608.8" "RNF223" "C1orf159" "AL390719.3"
[25] "AL390719.2" "TTLL10" "TNFRSF18" "TNFRSF4" "SDF4" "B3GALT6" "C1QTNF12" "AL162741.1"
[33] "UBE2J2" "LINC01786" "SCNN1D" "ACAP3" "PUSL1" "INTS11" "CPTP" "TAS1R3"
[41] "DVL1" "MXRA8" "AURKAIP1" "CCNL2" "MRPL20-AS1" "MRPL20" "AL391244.2" "ANKRD65"
[49] "AL391244.1" "TMEM88B" "LINC01770" "VWA1" "ATAD3C" "ATAD3B" "ATAD3A" "TMEM240"
[57] "SSU72" "AL645728.1" "FNDC10" "AL691432.4" "AL691432.2" "MIB2" "MMP23B" "CDK11B"
[65] "FO704657.1" "SLC35E2B" "CDK11A" "SLC35E2A" "NADK" "GNB1" "AL109917.1" "CALML6"

colnames(obj[])

 [1] "AAACCTGAGAAACCGC-WS_wEMB12142156" "AAACCTGAGAGTCTGG-WS_wEMB12142156" "AAACCTGAGATATGCA-WS_wEMB12142156"
 [4] "AAACCTGAGCAAATCA-WS_wEMB12142156" "AAACCTGAGCGCTCCA-WS_wEMB12142156" "AAACCTGAGCGTCAAG-WS_wEMB12142156"
 [7] "AAACCTGAGCTGAAAT-WS_wEMB12142156" "AAACCTGAGTACTTGC-WS_wEMB12142156" "AAACCTGAGTAGATGT-WS_wEMB12142156"
[10] "AAACCTGAGTGACTCT-WS_wEMB12142156" "AAACCTGAGTGGGATC-WS_wEMB12142156" "AAACCTGCAAGTTCTG-WS_wEMB12142156"
[13] "AAACCTGCAATGTAAG-WS_wEMB12142156" "AAACCTGCACATGGGA-WS_wEMB12142156" "AAACCTGCACCACCAG-WS_wEMB12142156"
[16] "AAACCTGCACTTCTGC-WS_wEMB12142156" "AAACCTGCAGATCCAT-WS_wEMB12142156" "AAACCTGCAGCTCGCA-WS_wEMB12142156"
[19] "AAACCTGCATGCCTAA-WS_wEMB12142156" "AAACCTGCATTCCTGC-WS_wEMB12142156" "AAACCTGGTAAATGTG-WS_wEMB12142156"
[22] "AAACCTGGTACGAAAT-WS_wEMB12142156" "AAACCTGGTAGAGGAA-WS_wEMB12142156" "AAACCTGGTAGCTCCG-WS_wEMB12142156"

samuel-marsh commented 4 months ago

Hi,

Thanks for the link. So I'm unable to replicate your issue either with or without renaming the files:

library(tidyverse)
library(Seurat)
library(scCustomize)

> list.files("~/Downloads/E-MTAB-11673/")
[1] "F158_YS_barcodes.tsv.gz" "F158_YS_features.tsv.gz" "F158_YS_matrix.mtx.gz"  

> test <- Read10X_GEO("~/Downloads/E-MTAB-11673/")
Reading 10X files from directory
  |==================================================| 100% elapsed=23s  

> list.files("~/Downloads/testing_rename/")
[1] "barcodes.tsv.gz" "features.tsv.gz" "matrix.mtx.gz"  

> test2 <- Read10X("~/Downloads/testing_rename/")

> seurat <- CreateSeuratObject(counts = test, min.cells = 3, min.features = 200)
Warning: Data is of class dgTMatrix. Coercing to dgCMatrix.
> seurat2 <- CreateSeuratObject(counts = test2, min.cells = 3, min.features = 200)

> head(seurat@meta.data)
                                    orig.ident nCount_RNA nFeature_RNA
AAACCTGAGAAACCGC-WS_wEMB12142156 SeuratProject       1647          777
AAACCTGAGAGTCTGG-WS_wEMB12142156 SeuratProject       1246          490
AAACCTGAGATATGCA-WS_wEMB12142156 SeuratProject        989          557
AAACCTGAGCAAATCA-WS_wEMB12142156 SeuratProject        993          362
AAACCTGAGCGCTCCA-WS_wEMB12142156 SeuratProject       1799         1106
AAACCTGAGCGTCAAG-WS_wEMB12142156 SeuratProject       2120         1295

> head(seurat2@meta.data)
                                    orig.ident nCount_RNA nFeature_RNA
AAACCTGAGAAACCGC-WS_wEMB12142156 SeuratProject       1647          777
AAACCTGAGAGTCTGG-WS_wEMB12142156 SeuratProject       1246          490
AAACCTGAGATATGCA-WS_wEMB12142156 SeuratProject        989          557
AAACCTGAGCAAATCA-WS_wEMB12142156 SeuratProject        993          362
AAACCTGAGCGCTCCA-WS_wEMB12142156 SeuratProject       1799         1106
AAACCTGAGCGTCAAG-WS_wEMB12142156 SeuratProject       2120         1295

I would suggest making sure Seurat and your other packages are fully updated and trying to read data in again.

Best, Sam

Aavdili commented 4 months ago

Thank you for getting back to me so quickly.

I rechecked the packages, and I think the problem was due to tidyverse (1.3.2). I updated it to 2.0.0 and the problem was solved.

I never suspected that, because our cluster is maintained by the bioinformatics core facility (according to them, the packages are always up to date).

Nevertheless, thank you again.
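
On a shared cluster, one way to verify this yourself is to compare the installed version of a package against the version you expect. Below is a minimal sketch using base R version comparison. `is_at_least` is a hypothetical helper name, and the version strings are the tidyverse versions mentioned above, used only as example inputs.

```r
# Hypothetical helper: TRUE if `installed` is at least `required`.
is_at_least <- function(installed, required) {
  package_version(installed) >= package_version(required)
}

# Example inputs taken from the versions mentioned in this comment.
old_ok <- is_at_least("1.3.2", "2.0.0")  # FALSE: 1.3.2 is older than 2.0.0
new_ok <- is_at_least("2.0.0", "2.0.0")  # TRUE: versions are equal
print(c(old = old_ok, new = new_ok))

# For a real check, pass the installed version; NA if the package is missing.
installed <- tryCatch(as.character(utils::packageVersion("tidyverse")),
                      error = function(e) NA_character_)
print(installed)
```

Comparing with `package_version()` rather than plain strings avoids mistakes such as treating "1.10.0" as older than "1.9.0".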

Best