satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat
Other
2.25k stars 904 forks source link

Problem with RunPCA when specify `features = VariableFeatures(obj)` #8946

Closed bellenger-l closed 2 months ago

bellenger-l commented 3 months ago

Hello !

When I look closer to RunPCA, I found a weird behavior. In order to scale all genes to be able to plot a Heatmap on any genes, I run the following code :

obj <- ScaleData(obj, features = rownames(obj))
obj <- RunPCA(obj, features = VariableFeatures(obj)

But it doesn't return the same result as if I run with default parameters. I have less genes that are used for the PCA (in the feature loadings). I see this issue with multiple datasets. For instance with the pbmc dataset :

## Import packages
library(Seurat)

## Import expression matrix
system("wget -P ./test-data/ https://cf.10xgenomics.com/samples/cell/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz")
system("tar -zxvf ./test-data/pbmc3k_filtered_gene_bc_matrices.tar.gz -C ./test-data/")
tenX_matrix <- Read10X(data.dir = "./test-data/filtered_gene_bc_matrices/hg19", #Path to the directory containing the three 10X files (matrix.mtx, genes.tsv (or features.tsv depending on the CellRanger version), barcodes.tsv)
                       gene.column = 1) #Choice of the genes.tsv column that will determine the names of the genes in the expression matrix

## Creation of the Seurat Object
pbmc_small <- CreateSeuratObject(tenX_matrix,                    #Expression matrix
                                 project = "PBMC analysis",      #Name of the project : something meaningful for your dataset
                                 assay = "RNA",                  #Name of the initial assay, (others can be created during the analysis), default is RNA
                                 names.field = 1,                #Associated with the names.delim, it allows to separate the name of the barcode when this one is composed of several information ex: BARCODE_CLUSTER_CELLTYPE and to choose after split on the names.delim which part we choose as the barcode name
                                 names.delim = "_",              #Character that allows to dissociate the different information contained in the barcode names
                                 meta.data = NULL,               #We can add the metadata on the transcriptomes with a dataframe where the barcodes are in line and the different information in column
                                 min.cells = 0,                  #Filtering genes that are not detected in less than min.cells
                                 min.features = 1)               #Filtering cells that do not detect at least min.features, here we filter all barcodes that detect no gene

## Inter-cell normalization
pbmc_small <- NormalizeData(pbmc_small,                                   #SeuratObject
                            assay = "RNA",                                #Assay to use
                            normalization.method = "LogNormalize",        #Normalization method
                            scale.factor = median(pbmc_small$nCount_RNA), #Scale factor
                            verbose = TRUE)

###########################################################################
##                         HIGHLY VARIABLE GENES                             ##
###########################################################################

pbmc_small <- FindVariableFeatures(pbmc_small,                 #SeuratObject
                                   selection.method = "vst",   #Method
                                   nfeatures = 2000)           #Top HVG (Highly Variable Gene), default value
pbmc_small

## Plot
VariableFeaturePlot(pbmc_small)

###############################################################################
##                         REDUCTION OF DIMENSIONALITY                       ##
###############################################################################

## Scaling data
pbmc_small <- ScaleData(pbmc_small, features = rownames(pbmc_small))

## PCA (Principal Component Analysis)
pbmc_small <- RunPCA(pbmc_small,                 #SeuratObject
                     reduction.name = "pca",     #Name of the reduction stored in the reduction slot
                     npcs = 50,                  #Total Number of PCs to compute and store (50 by default)
                     seed.use = 42,              #Set a random seed. By default, sets the seed to 42.
                     verbose = FALSE, features = VariableFeatures(pbmc_small))

Here is the result :

> pbmc_small
An object of class Seurat 
32738 features across 2700 samples within 1 assay 
Active assay: RNA (32738 features, 2000 variable features)
 3 layers present: counts, data, scale.data
 1 dimensional reduction calculated: pca
> str(pbmc_small)
Formal class 'Seurat' [package "SeuratObject"] with 13 slots
  ..@ assays      :List of 1
  .. ..$ RNA:Formal class 'Assay5' [package "SeuratObject"] with 8 slots
  .. .. .. ..@ layers    :List of 3
  .. .. .. .. ..$ counts    :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. .. .. .. .. ..@ i       : int [1:2286884] 70 166 178 326 363 410 412 492 494 495 ...
  .. .. .. .. .. .. ..@ p       : int [1:2701] 0 781 2133 3264 4224 4746 5528 6311 7101 7634 ...
  .. .. .. .. .. .. ..@ Dim     : int [1:2] 32738 2700
  .. .. .. .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. .. ..@ x       : num [1:2286884] 1 1 2 1 1 1 1 41 1 1 ...
  .. .. .. .. .. .. ..@ factors : list()
  .. .. .. .. ..$ data      :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. .. .. .. .. ..@ i       : int [1:2286884] 70 166 178 326 363 410 412 492 494 495 ...
  .. .. .. .. .. .. ..@ p       : int [1:2701] 0 781 2133 3264 4224 4746 5528 6311 7101 7634 ...
  .. .. .. .. .. .. ..@ Dim     : int [1:2] 32738 2700
  .. .. .. .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. .. ..@ x       : num [1:2286884] 0.646 0.646 1.035 0.646 0.646 ...
  .. .. .. .. .. .. ..@ factors : list()
  .. .. .. .. ..$ scale.data: num [1:32738, 1:2700] 0 0 0 0 0 ...
  .. .. .. ..@ cells     :Formal class 'LogMap' [package "SeuratObject"] with 1 slot
  .. .. .. .. .. ..@ .Data: logi [1:2700, 1:3] TRUE TRUE TRUE TRUE TRUE TRUE ...
  .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. .. .. ..$ : chr [1:2700] "AAACATACAACCAC-1" "AAACATTGAGCTAC-1" "AAACATTGATCAGC-1" "AAACCGTGCTTCCG-1" ...
  .. .. .. .. .. .. .. ..$ : chr [1:3] "counts" "data" "scale.data"
  .. .. .. .. .. ..$ dim     : int [1:2] 2700 3
  .. .. .. .. .. ..$ dimnames:List of 2
  .. .. .. .. .. .. ..$ : chr [1:2700] "AAACATACAACCAC-1" "AAACATTGAGCTAC-1" "AAACATTGATCAGC-1" "AAACCGTGCTTCCG-1" ...
  .. .. .. .. .. .. ..$ : chr [1:3] "counts" "data" "scale.data"
  .. .. .. ..@ features  :Formal class 'LogMap' [package "SeuratObject"] with 1 slot
  .. .. .. .. .. ..@ .Data: logi [1:32738, 1:3] TRUE TRUE TRUE TRUE TRUE TRUE ...
  .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. .. .. ..$ : chr [1:32738] "ENSG00000243485" "ENSG00000237613" "ENSG00000186092" "ENSG00000238009" ...
  .. .. .. .. .. .. .. ..$ : chr [1:3] "counts" "data" "scale.data"
  .. .. .. .. .. ..$ dim     : int [1:2] 32738 3
  .. .. .. .. .. ..$ dimnames:List of 2
  .. .. .. .. .. .. ..$ : chr [1:32738] "ENSG00000243485" "ENSG00000237613" "ENSG00000186092" "ENSG00000238009" ...
  .. .. .. .. .. .. ..$ : chr [1:3] "counts" "data" "scale.data"
  .. .. .. ..@ default   : int 1
  .. .. .. ..@ assay.orig: chr(0) 
  .. .. .. ..@ meta.data :'data.frame': 32738 obs. of  8 variables:
  .. .. .. .. ..$ vf_vst_counts_mean                 : num [1:32738] 0 0 0 0 0 ...
  .. .. .. .. ..$ vf_vst_counts_variance             : num [1:32738] 0 0 0 0 0 ...
  .. .. .. .. ..$ vf_vst_counts_variance.expected    : num [1:32738] 0 0 0 0 0 ...
  .. .. .. .. ..$ vf_vst_counts_variance.standardized: num [1:32738] 0 0 0 0 0 ...
  .. .. .. .. ..$ vf_vst_counts_variable             : logi [1:32738] FALSE FALSE FALSE FALSE FALSE FALSE ...
  .. .. .. .. ..$ vf_vst_counts_rank                 : int [1:32738] NA NA NA NA NA NA NA NA NA NA ...
  .. .. .. .. ..$ var.features                       : chr [1:32738] NA NA NA NA ...
  .. .. .. .. ..$ var.features.rank                  : int [1:32738] NA NA NA NA NA NA NA NA NA NA ...
  .. .. .. ..@ misc      : Named list()
  .. .. .. ..@ key       : chr "rna_"
  ..@ meta.data   :'data.frame':    2700 obs. of  3 variables:
  .. ..$ orig.ident  : Factor w/ 1 level "PBMC analysis": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ nCount_RNA  : num [1:2700] 2421 4903 3149 2639 981 ...
  .. ..$ nFeature_RNA: int [1:2700] 781 1352 1131 960 522 782 783 790 533 550 ...
  ..@ active.assay: chr "RNA"
  ..@ active.ident: Factor w/ 1 level "PBMC analysis": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..- attr(*, "names")= chr [1:2700] "AAACATACAACCAC-1" "AAACATTGAGCTAC-1" "AAACATTGATCAGC-1" "AAACCGTGCTTCCG-1" ...
  ..@ graphs      : list()
  ..@ neighbors   : list()
  ..@ reductions  :List of 1
  .. ..$ pca:Formal class 'DimReduc' [package "SeuratObject"] with 9 slots
  .. .. .. ..@ cell.embeddings           : num [1:2700, 1:50] 2.87 0.89 1.58 -8.36 2.25 ...
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:2700] "AAACATACAACCAC-1" "AAACATTGAGCTAC-1" "AAACATTGATCAGC-1" "AAACCGTGCTTCCG-1" ...
  .. .. .. .. .. ..$ : chr [1:50] "PC_1" "PC_2" "PC_3" "PC_4" ...
  .. .. .. ..@ feature.loadings          : num [1:1128, 1:50] -0.179165 -0.00896 0.002358 -0.000301 -0.000295 ...
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:1128] "ENSG00000087086" "ENSG00000127920" "ENSG00000253701" "ENSG00000176890" ...
  .. .. .. .. .. ..$ : chr [1:50] "PC_1" "PC_2" "PC_3" "PC_4" ...
  .. .. .. ..@ feature.loadings.projected: num[0 , 0 ] 
  .. .. .. ..@ assay.used                : chr "RNA"
  .. .. .. ..@ global                    : logi FALSE
  .. .. .. ..@ stdev                     : num [1:50] 4.88 3.41 2.9 2.49 2.11 ...
  .. .. .. ..@ jackstraw                 :Formal class 'JackStrawData' [package "SeuratObject"] with 4 slots
  .. .. .. .. .. ..@ empirical.p.values     : num[0 , 0 ] 
  .. .. .. .. .. ..@ fake.reduction.scores  : num[0 , 0 ] 
  .. .. .. .. .. ..@ empirical.p.values.full: num[0 , 0 ] 
  .. .. .. .. .. ..@ overall.p.values       : num[0 , 0 ] 
  .. .. .. ..@ misc                      :List of 1
  .. .. .. .. ..$ total.variance: num 876
  .. .. .. ..@ key                       : chr "PC_"
  ..@ images      : list()
  ..@ project.name: chr "PBMC analysis"
  ..@ misc        : list()
  ..@ version     :Classes 'package_version', 'numeric_version'  hidden list of 1
  .. ..$ : int [1:3] 5 0 1
  ..@ commands    :List of 4
  .. ..$ NormalizeData.RNA       :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name       : chr "NormalizeData.RNA"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2024-05-28 10:53:56"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr [1:2] "NormalizeData(pbmc_small, assay = \"RNA\", normalization.method = \"LogNormalize\", " "    scale.factor = median(pbmc_small$nCount_RNA), verbose = TRUE)"
  .. .. .. ..@ params     :List of 5
  .. .. .. .. ..$ assay               : chr "RNA"
  .. .. .. .. ..$ normalization.method: chr "LogNormalize"
  .. .. .. .. ..$ scale.factor        : num 2197
  .. .. .. .. ..$ margin              : num 1
  .. .. .. .. ..$ verbose             : logi TRUE
  .. ..$ FindVariableFeatures.RNA:Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name       : chr "FindVariableFeatures.RNA"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2024-05-28 10:53:58"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr [1:2] "FindVariableFeatures(pbmc_small, selection.method = \"vst\", " "    nfeatures = 2000)"
  .. .. .. ..@ params     :List of 12
  .. .. .. .. ..$ assay              : chr "RNA"
  .. .. .. .. ..$ selection.method   : chr "vst"
  .. .. .. .. ..$ loess.span         : num 0.3
  .. .. .. .. ..$ clip.max           : chr "auto"
  .. .. .. .. ..$ mean.function      :function (mat, display_progress)  
  .. .. .. .. ..$ dispersion.function:function (mat, display_progress)  
  .. .. .. .. ..$ num.bin            : num 20
  .. .. .. .. ..$ binning.method     : chr "equal_width"
  .. .. .. .. ..$ nfeatures          : num 2000
  .. .. .. .. ..$ mean.cutoff        : num [1:2] 0.1 8
  .. .. .. .. ..$ dispersion.cutoff  : num [1:2] 1 Inf
  .. .. .. .. ..$ verbose            : logi TRUE
  .. ..$ ScaleData.RNA           :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name       : chr "ScaleData.RNA"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2024-05-28 10:54:03"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr "ScaleData(pbmc_small, features = rownames(pbmc_small))"
  .. .. .. ..@ params     :List of 10
  .. .. .. .. ..$ features          : chr [1:32738] "ENSG00000243485" "ENSG00000237613" "ENSG00000186092" "ENSG00000238009" ...
  .. .. .. .. ..$ assay             : chr "RNA"
  .. .. .. .. ..$ model.use         : chr "linear"
  .. .. .. .. ..$ use.umi           : logi FALSE
  .. .. .. .. ..$ do.scale          : logi TRUE
  .. .. .. .. ..$ do.center         : logi TRUE
  .. .. .. .. ..$ scale.max         : num 10
  .. .. .. .. ..$ block.size        : num 1000
  .. .. .. .. ..$ min.cells.to.block: num 3000
  .. .. .. .. ..$ verbose           : logi TRUE
  .. ..$ RunPCA.RNA              :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name       : chr "RunPCA.RNA"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2024-05-28 10:54:12"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr [1:2] "RunPCA(pbmc_small, reduction.name = \"pca\", npcs = 50, " "    seed.use = 42, verbose = FALSE, features = VariableFeatures(pbmc_small))"
  .. .. .. ..@ params     :List of 11
  .. .. .. .. ..$ assay          : chr "RNA"
  .. .. .. .. ..$ features       : chr [1:2000] "ENSG00000163736" "ENSG00000090382" "ENSG00000163220" "ENSG00000254709" ...
  .. .. .. .. ..$ npcs           : num 50
  .. .. .. .. ..$ rev.pca        : logi FALSE
  .. .. .. .. ..$ weight.by.var  : logi TRUE
  .. .. .. .. ..$ verbose        : logi FALSE
  .. .. .. .. ..$ ndims.print    : int [1:5] 1 2 3 4 5
  .. .. .. .. ..$ nfeatures.print: num 30
  .. .. .. .. ..$ reduction.name : chr "pca"
  .. .. .. .. ..$ reduction.key  : chr "PC_"
  .. .. .. .. ..$ seed.use       : num 42
  ..@ tools       : list()

But if I run with default parameters

## Scaling data
pbmc_small <- ScaleData(pbmc_small)

## PCA (Principal Component Analysis)
pbmc_small <- RunPCA(pbmc_small,                 #SeuratObject
                     reduction.name = "pca",     #Name of the reduction stored in the reduction slot
                     npcs = 50,                  #Total Number of PCs to compute and store (50 by default)
                     seed.use = 42,              #Set a random seed. By default, sets the seed to 42.
                     verbose = FALSE)

Then I have the expected result with the right number of genes in the feature loadings :

> str(pbmc_small)
Formal class 'Seurat' [package "SeuratObject"] with 13 slots
  ..@ assays      :List of 1
  .. ..$ RNA:Formal class 'Assay5' [package "SeuratObject"] with 8 slots
  .. .. .. ..@ layers    :List of 3
  .. .. .. .. ..$ counts    :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. .. .. .. .. ..@ i       : int [1:2286884] 70 166 178 326 363 410 412 492 494 495 ...
  .. .. .. .. .. .. ..@ p       : int [1:2701] 0 781 2133 3264 4224 4746 5528 6311 7101 7634 ...
  .. .. .. .. .. .. ..@ Dim     : int [1:2] 32738 2700
  .. .. .. .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. .. ..@ x       : num [1:2286884] 1 1 2 1 1 1 1 41 1 1 ...
  .. .. .. .. .. .. ..@ factors : list()
  .. .. .. .. ..$ data      :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. .. .. .. .. ..@ i       : int [1:2286884] 70 166 178 326 363 410 412 492 494 495 ...
  .. .. .. .. .. .. ..@ p       : int [1:2701] 0 781 2133 3264 4224 4746 5528 6311 7101 7634 ...
  .. .. .. .. .. .. ..@ Dim     : int [1:2] 32738 2700
  .. .. .. .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. .. ..@ x       : num [1:2286884] 0.646 0.646 1.035 0.646 0.646 ...
  .. .. .. .. .. .. ..@ factors : list()
  .. .. .. .. ..$ scale.data: num [1:2000, 1:2700] -0.7581 -0.253 1.3258 -0.0435 -0.44 ...
  .. .. .. ..@ cells     :Formal class 'LogMap' [package "SeuratObject"] with 1 slot
  .. .. .. .. .. ..@ .Data: logi [1:2700, 1:3] TRUE TRUE TRUE TRUE TRUE TRUE ...
  .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. .. .. ..$ : chr [1:2700] "AAACATACAACCAC-1" "AAACATTGAGCTAC-1" "AAACATTGATCAGC-1" "AAACCGTGCTTCCG-1" ...
  .. .. .. .. .. .. .. ..$ : chr [1:3] "counts" "data" "scale.data"
  .. .. .. .. .. ..$ dim     : int [1:2] 2700 3
  .. .. .. .. .. ..$ dimnames:List of 2
  .. .. .. .. .. .. ..$ : chr [1:2700] "AAACATACAACCAC-1" "AAACATTGAGCTAC-1" "AAACATTGATCAGC-1" "AAACCGTGCTTCCG-1" ...
  .. .. .. .. .. .. ..$ : chr [1:3] "counts" "data" "scale.data"
  .. .. .. ..@ features  :Formal class 'LogMap' [package "SeuratObject"] with 1 slot
  .. .. .. .. .. ..@ .Data: logi [1:32738, 1:3] TRUE TRUE TRUE TRUE TRUE TRUE ...
  .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. .. .. ..$ : chr [1:32738] "ENSG00000243485" "ENSG00000237613" "ENSG00000186092" "ENSG00000238009" ...
  .. .. .. .. .. .. .. ..$ : chr [1:3] "counts" "data" "scale.data"
  .. .. .. .. .. ..$ dim     : int [1:2] 32738 3
  .. .. .. .. .. ..$ dimnames:List of 2
  .. .. .. .. .. .. ..$ : chr [1:32738] "ENSG00000243485" "ENSG00000237613" "ENSG00000186092" "ENSG00000238009" ...
  .. .. .. .. .. .. ..$ : chr [1:3] "counts" "data" "scale.data"
  .. .. .. ..@ default   : int 1
  .. .. .. ..@ assay.orig: chr(0) 
  .. .. .. ..@ meta.data :'data.frame': 32738 obs. of  8 variables:
  .. .. .. .. ..$ vf_vst_counts_mean                 : num [1:32738] 0 0 0 0 0 ...
  .. .. .. .. ..$ vf_vst_counts_variance             : num [1:32738] 0 0 0 0 0 ...
  .. .. .. .. ..$ vf_vst_counts_variance.expected    : num [1:32738] 0 0 0 0 0 ...
  .. .. .. .. ..$ vf_vst_counts_variance.standardized: num [1:32738] 0 0 0 0 0 ...
  .. .. .. .. ..$ vf_vst_counts_variable             : logi [1:32738] FALSE FALSE FALSE FALSE FALSE FALSE ...
  .. .. .. .. ..$ vf_vst_counts_rank                 : int [1:32738] NA NA NA NA NA NA NA NA NA NA ...
  .. .. .. .. ..$ var.features                       : chr [1:32738] NA NA NA NA ...
  .. .. .. .. ..$ var.features.rank                  : int [1:32738] NA NA NA NA NA NA NA NA NA NA ...
  .. .. .. ..@ misc      : Named list()
  .. .. .. ..@ key       : chr "rna_"
  ..@ meta.data   :'data.frame':    2700 obs. of  3 variables:
  .. ..$ orig.ident  : Factor w/ 1 level "PBMC analysis": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ nCount_RNA  : num [1:2700] 2421 4903 3149 2639 981 ...
  .. ..$ nFeature_RNA: int [1:2700] 781 1352 1131 960 522 782 783 790 533 550 ...
  ..@ active.assay: chr "RNA"
  ..@ active.ident: Factor w/ 1 level "PBMC analysis": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..- attr(*, "names")= chr [1:2700] "AAACATACAACCAC-1" "AAACATTGAGCTAC-1" "AAACATTGATCAGC-1" "AAACCGTGCTTCCG-1" ...
  ..@ graphs      : list()
  ..@ neighbors   : list()
  ..@ reductions  :List of 1
  .. ..$ pca:Formal class 'DimReduc' [package "SeuratObject"] with 9 slots
  .. .. .. ..@ cell.embeddings           : num [1:2700, 1:50] -4.57 -1.16 -3.16 11.57 -2.9 ...
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:2700] "AAACATACAACCAC-1" "AAACATTGAGCTAC-1" "AAACATTGATCAGC-1" "AAACCGTGCTTCCG-1" ...
  .. .. .. .. .. ..$ : chr [1:50] "PC_1" "PC_2" "PC_3" "PC_4" ...
  .. .. .. ..@ feature.loadings          : num [1:2000, 1:50] 0.00971 0.13072 0.12319 -0.00847 -0.02023 ...
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:2000] "ENSG00000163736" "ENSG00000090382" "ENSG00000163220" "ENSG00000254709" ...
  .. .. .. .. .. ..$ : chr [1:50] "PC_1" "PC_2" "PC_3" "PC_4" ...
  .. .. .. ..@ feature.loadings.projected: num[0 , 0 ] 
  .. .. .. ..@ assay.used                : chr "RNA"
  .. .. .. ..@ global                    : logi FALSE
  .. .. .. ..@ stdev                     : num [1:50] 6.68 4.35 3.83 3.65 2.79 ...
  .. .. .. ..@ jackstraw                 :Formal class 'JackStrawData' [package "SeuratObject"] with 4 slots
  .. .. .. .. .. ..@ empirical.p.values     : num[0 , 0 ] 
  .. .. .. .. .. ..@ fake.reduction.scores  : num[0 , 0 ] 
  .. .. .. .. .. ..@ empirical.p.values.full: num[0 , 0 ] 
  .. .. .. .. .. ..@ overall.p.values       : num[0 , 0 ] 
  .. .. .. ..@ misc                      :List of 1
  .. .. .. .. ..$ total.variance: num 1571
  .. .. .. ..@ key                       : chr "PC_"
  ..@ images      : list()
  ..@ project.name: chr "PBMC analysis"
  ..@ misc        : list()
  ..@ version     :Classes 'package_version', 'numeric_version'  hidden list of 1
  .. ..$ : int [1:3] 5 0 1
  ..@ commands    :List of 4
  .. ..$ NormalizeData.RNA       :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name       : chr "NormalizeData.RNA"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2024-05-28 10:53:56"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr [1:2] "NormalizeData(pbmc_small, assay = \"RNA\", normalization.method = \"LogNormalize\", " "    scale.factor = median(pbmc_small$nCount_RNA), verbose = TRUE)"
  .. .. .. ..@ params     :List of 5
  .. .. .. .. ..$ assay               : chr "RNA"
  .. .. .. .. ..$ normalization.method: chr "LogNormalize"
  .. .. .. .. ..$ scale.factor        : num 2197
  .. .. .. .. ..$ margin              : num 1
  .. .. .. .. ..$ verbose             : logi TRUE
  .. ..$ FindVariableFeatures.RNA:Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name       : chr "FindVariableFeatures.RNA"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2024-05-28 10:53:58"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr [1:2] "FindVariableFeatures(pbmc_small, selection.method = \"vst\", " "    nfeatures = 2000)"
  .. .. .. ..@ params     :List of 12
  .. .. .. .. ..$ assay              : chr "RNA"
  .. .. .. .. ..$ selection.method   : chr "vst"
  .. .. .. .. ..$ loess.span         : num 0.3
  .. .. .. .. ..$ clip.max           : chr "auto"
  .. .. .. .. ..$ mean.function      :function (mat, display_progress)  
  .. .. .. .. ..$ dispersion.function:function (mat, display_progress)  
  .. .. .. .. ..$ num.bin            : num 20
  .. .. .. .. ..$ binning.method     : chr "equal_width"
  .. .. .. .. ..$ nfeatures          : num 2000
  .. .. .. .. ..$ mean.cutoff        : num [1:2] 0.1 8
  .. .. .. .. ..$ dispersion.cutoff  : num [1:2] 1 Inf
  .. .. .. .. ..$ verbose            : logi TRUE
  .. ..$ ScaleData.RNA           :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name       : chr "ScaleData.RNA"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2024-05-28 10:56:22"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr "ScaleData(pbmc_small)"
  .. .. .. ..@ params     :List of 9
  .. .. .. .. ..$ assay             : chr "RNA"
  .. .. .. .. ..$ model.use         : chr "linear"
  .. .. .. .. ..$ use.umi           : logi FALSE
  .. .. .. .. ..$ do.scale          : logi TRUE
  .. .. .. .. ..$ do.center         : logi TRUE
  .. .. .. .. ..$ scale.max         : num 10
  .. .. .. .. ..$ block.size        : num 1000
  .. .. .. .. ..$ min.cells.to.block: num 3000
  .. .. .. .. ..$ verbose           : logi TRUE
  .. ..$ RunPCA.RNA              :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
  .. .. .. ..@ name       : chr "RunPCA.RNA"
  .. .. .. ..@ time.stamp : POSIXct[1:1], format: "2024-05-28 10:56:24"
  .. .. .. ..@ assay.used : chr "RNA"
  .. .. .. ..@ call.string: chr [1:2] "RunPCA(pbmc_small, reduction.name = \"pca\", npcs = 50, " "    seed.use = 42, verbose = FALSE)"
  .. .. .. ..@ params     :List of 10
  .. .. .. .. ..$ assay          : chr "RNA"
  .. .. .. .. ..$ npcs           : num 50
  .. .. .. .. ..$ rev.pca        : logi FALSE
  .. .. .. .. ..$ weight.by.var  : logi TRUE
  .. .. .. .. ..$ verbose        : logi FALSE
  .. .. .. .. ..$ ndims.print    : int [1:5] 1 2 3 4 5
  .. .. .. .. ..$ nfeatures.print: num 30
  .. .. .. .. ..$ reduction.name : chr "pca"
  .. .. .. .. ..$ reduction.key  : chr "PC_"
  .. .. .. .. ..$ seed.use       : num 42
  ..@ tools       : list()

Here is my sessionInfo :

> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS/LAPACK: /opt/rstudio-server_conda/conda/envs/rstudio-server_4.3.1/lib/libopenblasp-r0.3.24.so;  LAPACK version 3.11.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Paris
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] vroom_1.6.5            msigdbr_7.5.1          rmarkdown_2.26         knitr_1.46            
 [5] enrichplot_1.22.0      org.Hs.eg.db_3.18.0    AnnotationDbi_1.64.1   IRanges_2.36.0        
 [9] S4Vectors_0.40.2       Biobase_2.62.0         BiocGenerics_0.48.1    clusterProfiler_4.10.0
[13] clustree_0.5.1         magrittr_2.0.3         dplyr_1.1.4            plyr_1.8.9            
[17] RColorBrewer_1.1-3     gridExtra_2.3          ggraph_2.1.0           ggplot2_3.5.1         
[21] lava_1.8.0             Seurat_5.0.3           SeuratObject_5.0.1     sp_2.1-3              
[25] biomaRt_2.58.0        

loaded via a namespace (and not attached):
  [1] RcppAnnoy_0.0.22        splines_4.3.1           later_1.3.2             ggplotify_0.1.2        
  [5] bitops_1.0-7            filelock_1.0.3          R.oo_1.26.0             tibble_3.2.1           
  [9] polyclip_1.10-6         XML_3.99-0.16.1         fastDummies_1.7.3       lifecycle_1.0.4        
 [13] globals_0.16.3          lattice_0.22-6          MASS_7.3-60.0.1         plotly_4.10.4          
 [17] httpuv_1.6.15           sctransform_0.4.1       spam_2.10-0             spatstat.sparse_3.0-3  
 [21] reticulate_1.35.0       cowplot_1.1.3           pbapply_1.7-2           DBI_1.2.2              
 [25] abind_1.4-5             zlibbioc_1.48.0         Rtsne_0.17              R.utils_2.12.3         
 [29] purrr_1.0.2             RCurl_1.98-1.14         yulab.utils_0.1.4       tweenr_2.0.2           
 [33] rappdirs_0.3.3          GenomeInfoDbData_1.2.11 ggrepel_0.9.5           irlba_2.3.5.1          
 [37] listenv_0.9.1           spatstat.utils_3.0-4    tidytree_0.4.6          goftest_1.2-3          
 [41] RSpectra_0.16-1         spatstat.random_3.2-3   fitdistrplus_1.1-11     parallelly_1.37.1      
 [45] leiden_0.4.3.1          codetools_0.2-20        DOSE_3.28.2             xml2_1.3.6             
 [49] ggforce_0.4.1           tidyselect_1.2.1        aplot_0.2.2             farver_2.1.1           
 [53] viridis_0.6.5           matrixStats_1.3.0       BiocFileCache_2.10.1    spatstat.explore_3.2-7 
 [57] jsonlite_1.8.8          tidygraph_1.2.3         progressr_0.14.0        ggridges_0.5.6         
 [61] survival_3.5-8          tools_4.3.1             progress_1.2.3          treeio_1.26.0          
 [65] ica_1.0-3               Rcpp_1.0.12             glue_1.7.0              xfun_0.43              
 [69] qvalue_2.34.0           GenomeInfoDb_1.38.5     withr_3.0.0             fastmap_1.1.1          
 [73] fansi_1.0.6             digest_0.6.35           gridGraphics_0.5-1      R6_2.5.1               
 [77] mime_0.12               colorspace_2.1-0        GO.db_3.18.0            scattermore_1.2        
 [81] tensor_1.5              spatstat.data_3.0-4     RSQLite_2.3.6           R.methodsS3_1.8.2      
 [85] utf8_1.2.4              tidyr_1.3.1             generics_0.1.3          data.table_1.15.4      
 [89] prettyunits_1.2.0       graphlayouts_1.0.1      httr_1.4.7              htmlwidgets_1.6.4      
 [93] scatterpie_0.2.2        uwot_0.2.2              pkgconfig_2.0.3         gtable_0.3.5           
 [97] blob_1.2.4              lmtest_0.9-40           XVector_0.42.0          shadowtext_0.1.3       
[101] htmltools_0.5.8.1       fgsea_1.28.0            dotCall64_1.1-1         scales_1.3.0           
[105] png_0.1-8               ggfun_0.1.4             rstudioapi_0.16.0       tzdb_0.4.0             
[109] reshape2_1.4.4          nlme_3.1-164            curl_5.0.2              cachem_1.0.8           
[113] zoo_1.8-12              stringr_1.5.1           KernSmooth_2.23-22      HDO.db_0.99.1          
[117] parallel_4.3.1          miniUI_0.1.1.1          pillar_1.9.0            grid_4.3.1             
[121] vctrs_0.6.5             RANN_2.6.1              promises_1.3.0          dbplyr_2.5.0           
[125] xtable_1.8-4            cluster_2.1.6           evaluate_0.23           cli_3.6.2              
[129] compiler_4.3.1          rlang_1.1.3             crayon_1.5.2            future.apply_1.11.2    
[133] labeling_0.4.3          fs_1.6.4                stringi_1.8.3           viridisLite_0.4.2      
[137] deldir_2.0-4            BiocParallel_1.36.0     babelgene_22.9          munsell_0.5.1          
[141] Biostrings_2.70.1       lazyeval_0.2.2          spatstat.geom_3.2-9     GOSemSim_2.28.1        
[145] Matrix_1.6-5            RcppHNSW_0.6.0          hms_1.1.3               patchwork_1.2.0        
[149] bit64_4.0.5             future_1.33.2           KEGGREST_1.42.0         shiny_1.8.1.1          
[153] ROCR_1.0-11             igraph_2.0.3            memoise_2.0.1           ggtree_3.10.0          
[157] fastmatch_1.1-4         bit_4.0.5               gson_0.1.0              ape_5.8  

Can you help me with this ? Thank you for your time Lea

samuel-marsh commented 3 months ago

Hi Lea,

Not member of dev team but hopefully can be helpful. I'm unable to replicate your issue so I'm wondering if there is something else different between your two runs here. According to the commands log at least the ScaleData commands are different between the two runs you show. Here is reprex showing my attempt to replicate issue.

Best, Sam

library(tidyverse)
library(Seurat)
#> Loading required package: SeuratObject
#> Loading required package: sp
#> 
#> Attaching package: 'SeuratObject'
#> The following objects are masked from 'package:base':
#> 
#>     intersect, t

# Load Data
pbmc <- pbmc3k.SeuratData::pbmc3k
pbmc <- UpdateSeuratObject(pbmc)
#> Validating object structure
#> Updating object slots
#> Ensuring keys are in the proper structure
#> Warning: Assay RNA changing from Assay to Assay
#> Ensuring keys are in the proper structure
#> Ensuring feature names don't have underscores or pipes
#> Updating slots in RNA
#> Validating object structure for Assay 'RNA'
#> Object representation is consistent with the most current Seurat version

# Process until PCA
pbmc <- NormalizeData(pbmc)
pbmc <- FindVariableFeatures(pbmc)
pbmc <- ScaleData(pbmc)
#> Centering and scaling data matrix

# Run PCA two ways
pbmc_var <- RunPCA(object = pbmc, features = VariableFeatures(pbmc), seed.use = 42)
#> PC_ 1 
#> Positive:  MALAT1, LTB, IL32, CD2, ACAP1, STK17A, CTSW, CD247, CCL5, GIMAP5 
#>     AQP3, GZMA, CST7, TRAF3IP3, MAL, HOPX, ITM2A, GZMK, MYC, BEX2 
#>     GIMAP7, ETS1, LDLRAP1, ZAP70, LYAR, RIC3, TNFAIP8, KLRG1, SAMD3, NKG7 
#> Negative:  CST3, TYROBP, LST1, AIF1, FTL, FCN1, LYZ, FTH1, S100A9, FCER1G 
#>     TYMP, CFD, LGALS1, CTSS, S100A8, SERPINA1, LGALS2, SPI1, IFITM3, PSAP 
#>     CFP, SAT1, IFI30, COTL1, S100A11, NPC2, LGALS3, GSTP1, PYCARD, NCF2 
#> PC_ 2 
#> Positive:  CD79A, MS4A1, TCL1A, HLA-DQA1, HLA-DRA, HLA-DQB1, LINC00926, CD79B, HLA-DRB1, CD74 
#>     HLA-DPB1, HLA-DMA, HLA-DQA2, HLA-DRB5, HLA-DPA1, HLA-DMB, FCRLA, HVCN1, LTB, BLNK 
#>     KIAA0125, P2RX5, IRF8, IGLL5, SWAP70, ARHGAP24, SMIM14, PPP1R14A, FCRL2, C16orf74 
#> Negative:  NKG7, PRF1, CST7, GZMA, GZMB, FGFBP2, CTSW, GNLY, GZMH, SPON2 
#>     CCL4, FCGR3A, CCL5, CD247, XCL2, CLIC3, AKR1C3, SRGN, HOPX, CTSC 
#>     TTC38, S100A4, ANXA1, IL32, IGFBP7, ID2, ACTB, XCL1, APOBEC3G, SAMD3 
#> PC_ 3 
#> Positive:  HLA-DQA1, CD79A, CD79B, HLA-DQB1, HLA-DPB1, CD74, HLA-DPA1, MS4A1, HLA-DRB1, HLA-DRB5 
#>     HLA-DRA, HLA-DQA2, TCL1A, LINC00926, HLA-DMB, HLA-DMA, HVCN1, FCRLA, IRF8, BLNK 
#>     KIAA0125, SMIM14, PLD4, IGLL5, P2RX5, TMSB10, SWAP70, LAT2, MALAT1, IGJ 
#> Negative:  PPBP, PF4, SDPR, SPARC, GNG11, NRGN, GP9, RGS18, TUBB1, CLU 
#>     HIST1H2AC, AP001189.4, ITGA2B, CD9, TMEM40, CA2, PTCRA, ACRBP, MMD, TREML1 
#>     NGFRAP1, F13A1, RUFY1, SEPT5, MPP1, CMTM5, TSC22D1, MYL9, RP11-367G6.3, GP1BA 
#> PC_ 4 
#> Positive:  HLA-DQA1, CD79A, CD79B, HIST1H2AC, HLA-DQB1, PF4, MS4A1, SDPR, CD74, PPBP 
#>     HLA-DPB1, GNG11, HLA-DQA2, SPARC, HLA-DRB1, HLA-DPA1, GP9, TCL1A, HLA-DRA, LINC00926 
#>     NRGN, RGS18, HLA-DRB5, PTCRA, CD9, AP001189.4, CA2, CLU, TUBB1, ITGA2B 
#> Negative:  VIM, S100A8, S100A6, S100A4, S100A9, TMSB10, IL32, GIMAP7, LGALS2, S100A10 
#>     RBP7, FCN1, MAL, LYZ, S100A12, MS4A6A, CD2, FYB, S100A11, FOLR3 
#>     GIMAP4, AQP3, ANXA1, AIF1, MALAT1, GIMAP5, IL8, IFI6, TRABD2A, TMSB4X 
#> PC_ 5 
#> Positive:  GZMB, FGFBP2, NKG7, GNLY, PRF1, CCL4, CST7, SPON2, GZMA, CLIC3 
#>     GZMH, XCL2, CTSW, TTC38, AKR1C3, CCL5, IGFBP7, XCL1, CCL3, S100A8 
#>     TYROBP, HOPX, CD160, HAVCR2, S100A9, FCER1G, PTGDR, LGALS2, RBP7, S100A12 
#> Negative:  LTB, VIM, AQP3, PPA1, MAL, KIAA0101, CD2, CYTIP, CORO1B, FYB 
#>     IL32, TRADD, ANXA5, TUBA1B, HN1, TYMS, PTGES3, ITM2A, COTL1, GPR183 
#>     TNFAIP8, ACTG1, TRAF3IP3, ATP5C1, GIMAP4, ZWINT, PRDX1, LDLRAP1, ABRACL, NGFRAP1
pbmc_default <- RunPCA(object = pbmc, seed.use = 42)
#> PC_ 1 
#> Positive:  MALAT1, LTB, IL32, CD2, ACAP1, STK17A, CTSW, CD247, CCL5, GIMAP5 
#>     AQP3, GZMA, CST7, TRAF3IP3, MAL, HOPX, ITM2A, GZMK, MYC, BEX2 
#>     GIMAP7, ETS1, LDLRAP1, ZAP70, LYAR, RIC3, TNFAIP8, KLRG1, SAMD3, NKG7 
#> Negative:  CST3, TYROBP, LST1, AIF1, FTL, FCN1, LYZ, FTH1, S100A9, FCER1G 
#>     TYMP, CFD, LGALS1, CTSS, S100A8, SERPINA1, LGALS2, SPI1, IFITM3, PSAP 
#>     CFP, SAT1, IFI30, COTL1, S100A11, NPC2, LGALS3, GSTP1, PYCARD, NCF2 
#> PC_ 2 
#> Positive:  CD79A, MS4A1, TCL1A, HLA-DQA1, HLA-DRA, HLA-DQB1, LINC00926, CD79B, HLA-DRB1, CD74 
#>     HLA-DPB1, HLA-DMA, HLA-DQA2, HLA-DRB5, HLA-DPA1, HLA-DMB, FCRLA, HVCN1, LTB, BLNK 
#>     KIAA0125, P2RX5, IRF8, IGLL5, SWAP70, ARHGAP24, SMIM14, PPP1R14A, FCRL2, C16orf74 
#> Negative:  NKG7, PRF1, CST7, GZMA, GZMB, FGFBP2, CTSW, GNLY, GZMH, SPON2 
#>     CCL4, FCGR3A, CCL5, CD247, XCL2, CLIC3, AKR1C3, SRGN, HOPX, CTSC 
#>     TTC38, S100A4, ANXA1, IL32, IGFBP7, ID2, ACTB, XCL1, APOBEC3G, SAMD3 
#> PC_ 3 
#> Positive:  HLA-DQA1, CD79A, CD79B, HLA-DQB1, HLA-DPB1, CD74, HLA-DPA1, MS4A1, HLA-DRB1, HLA-DRB5 
#>     HLA-DRA, HLA-DQA2, TCL1A, LINC00926, HLA-DMB, HLA-DMA, HVCN1, FCRLA, IRF8, BLNK 
#>     KIAA0125, SMIM14, PLD4, IGLL5, P2RX5, TMSB10, SWAP70, LAT2, MALAT1, IGJ 
#> Negative:  PPBP, PF4, SDPR, SPARC, GNG11, NRGN, GP9, RGS18, TUBB1, CLU 
#>     HIST1H2AC, AP001189.4, ITGA2B, CD9, TMEM40, CA2, PTCRA, ACRBP, MMD, TREML1 
#>     NGFRAP1, F13A1, RUFY1, SEPT5, MPP1, CMTM5, TSC22D1, MYL9, RP11-367G6.3, GP1BA 
#> PC_ 4 
#> Positive:  HLA-DQA1, CD79A, CD79B, HIST1H2AC, HLA-DQB1, PF4, MS4A1, SDPR, CD74, PPBP 
#>     HLA-DPB1, GNG11, HLA-DQA2, SPARC, HLA-DRB1, HLA-DPA1, GP9, TCL1A, HLA-DRA, LINC00926 
#>     NRGN, RGS18, HLA-DRB5, PTCRA, CD9, AP001189.4, CA2, CLU, TUBB1, ITGA2B 
#> Negative:  VIM, S100A8, S100A6, S100A4, S100A9, TMSB10, IL32, GIMAP7, LGALS2, S100A10 
#>     RBP7, FCN1, MAL, LYZ, S100A12, MS4A6A, CD2, FYB, S100A11, FOLR3 
#>     GIMAP4, AQP3, ANXA1, AIF1, MALAT1, GIMAP5, IL8, IFI6, TRABD2A, TMSB4X 
#> PC_ 5 
#> Positive:  GZMB, FGFBP2, NKG7, GNLY, PRF1, CCL4, CST7, SPON2, GZMA, CLIC3 
#>     GZMH, XCL2, CTSW, TTC38, AKR1C3, CCL5, IGFBP7, XCL1, CCL3, S100A8 
#>     TYROBP, HOPX, CD160, HAVCR2, S100A9, FCER1G, PTGDR, LGALS2, RBP7, S100A12 
#> Negative:  LTB, VIM, AQP3, PPA1, MAL, KIAA0101, CD2, CYTIP, CORO1B, FYB 
#>     IL32, TRADD, ANXA5, TUBA1B, HN1, TYMS, PTGES3, ITM2A, COTL1, GPR183 
#>     TNFAIP8, ACTG1, TRAF3IP3, ATP5C1, GIMAP4, ZWINT, PRDX1, LDLRAP1, ABRACL, NGFRAP1

# Check Results
dim(Loadings(pbmc_var, reduction = "pca"))
#> [1] 2000   50
dim(Loadings(pbmc_default, reduction = "pca"))
#> [1] 2000   50

identical(Loadings(pbmc_var, reduction = "pca"), Loadings(pbmc_default, reduction = "pca"))
#> [1] TRUE

Created on 2024-05-28 with reprex v2.1.0

Session info ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.4.0 (2024-04-24) #> os macOS Monterey 12.7.5 #> system x86_64, darwin20 #> ui X11 #> language (EN) #> collate en_US.UTF-8 #> ctype en_US.UTF-8 #> tz America/New_York #> date 2024-05-28 #> pandoc 3.1.11 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/x86_64/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> abind 1.4-5 2016-07-21 [1] CRAN (R 4.4.0) #> cli 3.6.2 2023-12-11 [1] CRAN (R 4.4.0) #> cluster 2.1.6 2023-12-01 [1] CRAN (R 4.4.0) #> codetools 0.2-20 2024-03-31 [1] CRAN (R 4.4.0) #> colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.4.0) #> cowplot 1.1.3 2024-01-22 [1] CRAN (R 4.4.0) #> data.table 1.15.4 2024-03-30 [1] CRAN (R 4.4.0) #> deldir 2.0-4 2024-02-28 [1] CRAN (R 4.4.0) #> digest 0.6.35 2024-03-11 [1] CRAN (R 4.4.0) #> dotCall64 1.1-1 2023-11-28 [1] CRAN (R 4.4.0) #> dplyr * 1.1.4 2023-11-17 [1] CRAN (R 4.4.0) #> evaluate 0.23 2023-11-01 [1] CRAN (R 4.4.0) #> fansi 1.0.6 2023-12-08 [1] CRAN (R 4.4.0) #> fastDummies 1.7.3 2023-07-06 [1] CRAN (R 4.4.0) #> fastmap 1.2.0 2024-05-15 [1] CRAN (R 4.4.0) #> fitdistrplus 1.1-11 2023-04-25 [1] CRAN (R 4.4.0) #> forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.4.0) #> fs 1.6.4 2024-04-25 [1] CRAN (R 4.4.0) #> future 1.33.2 2024-03-26 [1] CRAN (R 4.4.0) #> future.apply 1.11.2 2024-03-28 [1] CRAN (R 4.4.0) #> generics 0.1.3 2022-07-05 [1] CRAN (R 4.4.0) #> ggplot2 * 3.5.1 2024-04-23 [1] CRAN (R 4.4.0) #> ggrepel 0.9.5 2024-01-10 [1] CRAN (R 4.4.0) #> ggridges 0.5.6 2024-01-23 [1] CRAN (R 4.4.0) #> globals 0.16.3 2024-03-08 [1] CRAN (R 4.4.0) #> glue 1.7.0 2024-01-09 [1] CRAN (R 4.4.0) #> goftest 1.2-3 2021-10-07 [1] CRAN (R 4.4.0) #> gridExtra 2.3 2017-09-09 [1] CRAN (R 4.4.0) #> gtable 0.3.5 2024-04-22 [1] CRAN (R 4.4.0) #> hms 1.1.3 2023-03-21 [1] CRAN (R 4.4.0) #> htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0) #> htmlwidgets 1.6.4 2023-12-06 [1] CRAN (R 4.4.0) #> httpuv 1.6.15 2024-03-26 [1] CRAN (R 4.4.0) #> httr 1.4.7 2023-08-15 [1] CRAN (R 4.4.0) #> ica 1.0-3 2022-07-08 [1] CRAN (R 4.4.0) #> igraph 2.0.3 2024-03-13 [1] CRAN (R 4.4.0) #> irlba 2.3.5.1 2022-10-03 [1] CRAN (R 4.4.0) #> jsonlite 1.8.8 2023-12-04 [1] CRAN (R 4.4.0) #> KernSmooth 2.23-24 2024-05-17 [1] CRAN (R 4.4.0) #> knitr 1.46 2024-04-06 [1] CRAN (R 4.4.0) #> later 1.3.2 2023-12-06 [1] CRAN (R 4.4.0) #> lattice 0.22-6 2024-03-20 [1] CRAN (R 4.4.0) #> lazyeval 0.2.2 2019-03-15 [1] CRAN (R 4.4.0) #> leiden 0.4.3.1 2023-11-17 [1] CRAN (R 4.4.0) #> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.4.0) #> listenv 0.9.1 2024-01-29 [1] CRAN (R 4.4.0) #> lmtest 0.9-40 2022-03-21 [1] CRAN (R 4.4.0) #> lubridate * 1.9.3 2023-09-27 [1] CRAN (R 4.4.0) #> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.4.0) #> MASS 7.3-60.2 2024-04-24 [1] local #> Matrix 1.7-0 2024-03-22 [1] CRAN (R 4.4.0) #> matrixStats 1.3.0 2024-04-11 [1] CRAN (R 4.4.0) #> mime 0.12 2021-09-28 [1] CRAN (R 4.4.0) #> miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.4.0) #> munsell 0.5.1 2024-04-01 [1] CRAN (R 4.4.0) #> nlme 3.1-164 2023-11-27 [1] CRAN (R 4.4.0) #> parallelly 1.37.1 2024-02-29 [1] CRAN (R 4.4.0) #> patchwork 1.2.0 2024-01-08 [1] CRAN (R 4.4.0) #> pbapply 1.7-2 2023-06-27 [1] CRAN (R 4.4.0) #> pbmc3k.SeuratData 3.1.4 2024-05-01 [1] local #> pillar 1.9.0 2023-03-22 [1] CRAN (R 4.4.0) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.4.0) #> plotly 4.10.4 2024-01-13 [1] CRAN (R 4.4.0) #> plyr 1.8.9 2023-10-02 [1] CRAN (R 4.4.0) #> png 0.1-8 2022-11-29 [1] CRAN (R 4.4.0) #> polyclip 1.10-6 2023-09-27 [1] CRAN (R 4.4.0) #> progressr 0.14.0 2023-08-10 [1] CRAN (R 4.4.0) #> promises 1.3.0 2024-04-05 [1] CRAN (R 4.4.0) #> purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.4.0) #> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.4.0) #> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.4.0) #> R.oo 1.26.0 2024-01-24 [1] CRAN (R 4.4.0) #> R.utils 2.12.3 2023-11-18 [1] CRAN (R 4.4.0) #> R6 2.5.1 2021-08-19 [1] CRAN (R 4.4.0) #> RANN 2.6.1 2019-01-08 [1] CRAN (R 4.4.0) #> RColorBrewer 1.1-3 2022-04-03 [1] CRAN (R 4.4.0) #> Rcpp 1.0.12 2024-01-09 [1] CRAN (R 4.4.0) #> RcppAnnoy 0.0.22 2024-01-23 [1] CRAN (R 4.4.0) #> RcppHNSW 0.6.0 2024-02-04 [1] CRAN (R 4.4.0) #> readr * 2.1.5 2024-01-10 [1] CRAN (R 4.4.0) #> reprex 2.1.0 2024-01-11 [1] CRAN (R 4.4.0) #> reshape2 1.4.4 2020-04-09 [1] CRAN (R 4.4.0) #> reticulate 1.37.0 2024-05-21 [1] CRAN (R 4.4.0) #> rlang 1.1.3 2024-01-10 [1] CRAN (R 4.4.0) #> rmarkdown 2.27 2024-05-17 [1] CRAN (R 4.4.0) #> ROCR 1.0-11 2020-05-02 [1] CRAN (R 4.4.0) #> RSpectra 0.16-1 2022-04-24 [1] CRAN (R 4.4.0) #> rstudioapi 0.16.0 2024-03-24 [1] CRAN (R 4.4.0) #> Rtsne 0.17 2023-12-07 [1] CRAN (R 4.4.0) #> scales 1.3.0 2023-11-28 [1] CRAN (R 4.4.0) #> scattermore 1.2 2023-06-12 [1] CRAN (R 4.4.0) #> sctransform 0.4.1 2023-10-19 [1] CRAN (R 4.4.0) #> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.4.0) #> Seurat * 5.1.0 2024-05-10 [1] CRAN (R 4.4.0) #> SeuratObject * 5.0.2 2024-05-08 [1] CRAN (R 4.4.0) #> shiny 1.8.1.1 2024-04-02 [1] CRAN (R 4.4.0) #> sp * 2.1-4 2024-04-30 [1] CRAN (R 4.4.0) #> spam 2.10-0 2023-10-23 [1] CRAN (R 4.4.0) #> spatstat.data 3.0-4 2024-01-15 [1] CRAN (R 4.4.0) #> spatstat.explore 3.2-7 2024-03-21 [1] CRAN (R 4.4.0) #> spatstat.geom 3.2-9 2024-02-28 [1] CRAN (R 4.4.0) #> spatstat.random 3.2-3 2024-02-29 [1] CRAN (R 4.4.0) #> spatstat.sparse 3.0-3 2023-10-24 [1] CRAN (R 4.4.0) #> spatstat.utils 3.0-4 2023-10-24 [1] CRAN (R 4.4.0) #> stringi 1.8.4 2024-05-06 [1] CRAN (R 4.4.0) #> stringr * 1.5.1 2023-11-14 [1] CRAN (R 4.4.0) #> styler 1.10.3 2024-04-07 [1] CRAN (R 4.4.0) #> survival 3.6-4 2024-04-24 [1] CRAN (R 4.4.0) #> tensor 1.5 2012-05-05 [1] CRAN (R 4.4.0) #> tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.4.0) #> tidyr * 1.3.1 2024-01-24 [1] CRAN (R 4.4.0) #> tidyselect 1.2.1 2024-03-11 [1] CRAN (R 4.4.0) #> tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.4.0) #> timechange 0.3.0 2024-01-18 [1] CRAN (R 4.4.0) #> tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.4.0) #> utf8 1.2.4 2023-10-22 [1] CRAN (R 4.4.0) #> uwot 0.2.2 2024-04-21 [1] CRAN (R 4.4.0) #> vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.4.0) #> viridisLite 0.4.2 2023-05-02 [1] CRAN (R 4.4.0) #> withr 3.0.0 2024-01-16 [1] CRAN (R 4.4.0) #> xfun 0.44 2024-05-15 [1] CRAN (R 4.4.0) #> xtable 1.8-4 2019-04-21 [1] CRAN (R 4.4.0) #> yaml 2.3.8 2023-12-11 [1] CRAN (R 4.4.0) #> zoo 1.8-12 2023-04-13 [1] CRAN (R 4.4.0) #> #> [1] /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/library #> #> ────────────────────────────────────────────────────────────────────────────── ```
bellenger-l commented 3 months ago

Thanks a lot @samuel-marsh for your test ! In fact, the problem I'm encountering appears only when you specify that you want to scale all genes and not just the most variable ones. If you specify in your test :

pbmc <- ScaleData(pbmc, features = rownames(features))
pbmc_var <- RunPCA(object = pbmc, features = VariableFeatures(pbmc), seed.use = 42)

You'll see the same issue as me, otherwise it's a difference between our packages versions

Best Lea

samuel-marsh commented 3 months ago

Hi Lea,

I would suggest updating your versions of SeuratObject and Seurat and trying to run again. When I run it is still identical regardless of the scaling as well.

Best, Sam

New reprex:

library(tidyverse)
library(Seurat)
#> Loading required package: SeuratObject
#> Loading required package: sp
#> 
#> Attaching package: 'SeuratObject'
#> The following objects are masked from 'package:base':
#> 
#>     intersect, t

# Load Data
pbmc <- pbmc3k.SeuratData::pbmc3k
pbmc <- UpdateSeuratObject(pbmc)
#> Validating object structure
#> Updating object slots
#> Ensuring keys are in the proper structure
#> Warning: Assay RNA changing from Assay to Assay
#> Ensuring keys are in the proper structure
#> Ensuring feature names don't have underscores or pipes
#> Updating slots in RNA
#> Validating object structure for Assay 'RNA'
#> Object representation is consistent with the most current Seurat version

# Process until PCA
pbmc <- NormalizeData(pbmc)
pbmc <- FindVariableFeatures(pbmc)
pbmc_all <- ScaleData(pbmc, features = Features(pbmc))
#> Centering and scaling data matrix
pbmc <- ScaleData(pbmc)
#> Centering and scaling data matrix

# Run PCA two ways
pbmc_var <- RunPCA(object = pbmc, features = VariableFeatures(pbmc), seed.use = 42)
#> PC_ 1 
#> Positive:  MALAT1, LTB, IL32, CD2, ACAP1, STK17A, CTSW, CD247, CCL5, GIMAP5 
#>     AQP3, GZMA, CST7, TRAF3IP3, MAL, HOPX, ITM2A, GZMK, MYC, BEX2 
#>     GIMAP7, ETS1, LDLRAP1, ZAP70, LYAR, RIC3, TNFAIP8, KLRG1, SAMD3, NKG7 
#> Negative:  CST3, TYROBP, LST1, AIF1, FTL, FCN1, LYZ, FTH1, S100A9, FCER1G 
#>     TYMP, CFD, LGALS1, CTSS, S100A8, SERPINA1, LGALS2, SPI1, IFITM3, PSAP 
#>     CFP, SAT1, IFI30, COTL1, S100A11, NPC2, LGALS3, GSTP1, PYCARD, NCF2 
#> PC_ 2 
#> Positive:  CD79A, MS4A1, TCL1A, HLA-DQA1, HLA-DRA, HLA-DQB1, LINC00926, CD79B, HLA-DRB1, CD74 
#>     HLA-DPB1, HLA-DMA, HLA-DQA2, HLA-DRB5, HLA-DPA1, HLA-DMB, FCRLA, HVCN1, LTB, BLNK 
#>     KIAA0125, P2RX5, IRF8, IGLL5, SWAP70, ARHGAP24, SMIM14, PPP1R14A, FCRL2, C16orf74 
#> Negative:  NKG7, PRF1, CST7, GZMA, GZMB, FGFBP2, CTSW, GNLY, GZMH, SPON2 
#>     CCL4, FCGR3A, CCL5, CD247, XCL2, CLIC3, AKR1C3, SRGN, HOPX, CTSC 
#>     TTC38, S100A4, ANXA1, IL32, IGFBP7, ID2, ACTB, XCL1, APOBEC3G, SAMD3 
#> PC_ 3 
#> Positive:  HLA-DQA1, CD79A, CD79B, HLA-DQB1, HLA-DPB1, CD74, HLA-DPA1, MS4A1, HLA-DRB1, HLA-DRB5 
#>     HLA-DRA, HLA-DQA2, TCL1A, LINC00926, HLA-DMB, HLA-DMA, HVCN1, FCRLA, IRF8, BLNK 
#>     KIAA0125, SMIM14, PLD4, IGLL5, P2RX5, TMSB10, SWAP70, LAT2, MALAT1, IGJ 
#> Negative:  PPBP, PF4, SDPR, SPARC, GNG11, NRGN, GP9, RGS18, TUBB1, CLU 
#>     HIST1H2AC, AP001189.4, ITGA2B, CD9, TMEM40, CA2, PTCRA, ACRBP, MMD, TREML1 
#>     NGFRAP1, F13A1, RUFY1, SEPT5, MPP1, CMTM5, TSC22D1, MYL9, RP11-367G6.3, GP1BA 
#> PC_ 4 
#> Positive:  HLA-DQA1, CD79A, CD79B, HIST1H2AC, HLA-DQB1, PF4, MS4A1, SDPR, CD74, PPBP 
#>     HLA-DPB1, GNG11, HLA-DQA2, SPARC, HLA-DRB1, HLA-DPA1, GP9, TCL1A, HLA-DRA, LINC00926 
#>     NRGN, RGS18, HLA-DRB5, PTCRA, CD9, AP001189.4, CA2, CLU, TUBB1, ITGA2B 
#> Negative:  VIM, S100A8, S100A6, S100A4, S100A9, TMSB10, IL32, GIMAP7, LGALS2, S100A10 
#>     RBP7, FCN1, MAL, LYZ, S100A12, MS4A6A, CD2, FYB, S100A11, FOLR3 
#>     GIMAP4, AQP3, ANXA1, AIF1, MALAT1, GIMAP5, IL8, IFI6, TRABD2A, TMSB4X 
#> PC_ 5 
#> Positive:  GZMB, FGFBP2, NKG7, GNLY, PRF1, CCL4, CST7, SPON2, GZMA, CLIC3 
#>     GZMH, XCL2, CTSW, TTC38, AKR1C3, CCL5, IGFBP7, XCL1, CCL3, S100A8 
#>     TYROBP, HOPX, CD160, HAVCR2, S100A9, FCER1G, PTGDR, LGALS2, RBP7, S100A12 
#> Negative:  LTB, VIM, AQP3, PPA1, MAL, KIAA0101, CD2, CYTIP, CORO1B, FYB 
#>     IL32, TRADD, ANXA5, TUBA1B, HN1, TYMS, PTGES3, ITM2A, COTL1, GPR183 
#>     TNFAIP8, ACTG1, TRAF3IP3, ATP5C1, GIMAP4, ZWINT, PRDX1, LDLRAP1, ABRACL, NGFRAP1
pbmc_default <- RunPCA(object = pbmc, seed.use = 42)
#> PC_ 1 
#> Positive:  MALAT1, LTB, IL32, CD2, ACAP1, STK17A, CTSW, CD247, CCL5, GIMAP5 
#>     AQP3, GZMA, CST7, TRAF3IP3, MAL, HOPX, ITM2A, GZMK, MYC, BEX2 
#>     GIMAP7, ETS1, LDLRAP1, ZAP70, LYAR, RIC3, TNFAIP8, KLRG1, SAMD3, NKG7 
#> Negative:  CST3, TYROBP, LST1, AIF1, FTL, FCN1, LYZ, FTH1, S100A9, FCER1G 
#>     TYMP, CFD, LGALS1, CTSS, S100A8, SERPINA1, LGALS2, SPI1, IFITM3, PSAP 
#>     CFP, SAT1, IFI30, COTL1, S100A11, NPC2, LGALS3, GSTP1, PYCARD, NCF2 
#> PC_ 2 
#> Positive:  CD79A, MS4A1, TCL1A, HLA-DQA1, HLA-DRA, HLA-DQB1, LINC00926, CD79B, HLA-DRB1, CD74 
#>     HLA-DPB1, HLA-DMA, HLA-DQA2, HLA-DRB5, HLA-DPA1, HLA-DMB, FCRLA, HVCN1, LTB, BLNK 
#>     KIAA0125, P2RX5, IRF8, IGLL5, SWAP70, ARHGAP24, SMIM14, PPP1R14A, FCRL2, C16orf74 
#> Negative:  NKG7, PRF1, CST7, GZMA, GZMB, FGFBP2, CTSW, GNLY, GZMH, SPON2 
#>     CCL4, FCGR3A, CCL5, CD247, XCL2, CLIC3, AKR1C3, SRGN, HOPX, CTSC 
#>     TTC38, S100A4, ANXA1, IL32, IGFBP7, ID2, ACTB, XCL1, APOBEC3G, SAMD3 
#> PC_ 3 
#> Positive:  HLA-DQA1, CD79A, CD79B, HLA-DQB1, HLA-DPB1, CD74, HLA-DPA1, MS4A1, HLA-DRB1, HLA-DRB5 
#>     HLA-DRA, HLA-DQA2, TCL1A, LINC00926, HLA-DMB, HLA-DMA, HVCN1, FCRLA, IRF8, BLNK 
#>     KIAA0125, SMIM14, PLD4, IGLL5, P2RX5, TMSB10, SWAP70, LAT2, MALAT1, IGJ 
#> Negative:  PPBP, PF4, SDPR, SPARC, GNG11, NRGN, GP9, RGS18, TUBB1, CLU 
#>     HIST1H2AC, AP001189.4, ITGA2B, CD9, TMEM40, CA2, PTCRA, ACRBP, MMD, TREML1 
#>     NGFRAP1, F13A1, RUFY1, SEPT5, MPP1, CMTM5, TSC22D1, MYL9, RP11-367G6.3, GP1BA 
#> PC_ 4 
#> Positive:  HLA-DQA1, CD79A, CD79B, HIST1H2AC, HLA-DQB1, PF4, MS4A1, SDPR, CD74, PPBP 
#>     HLA-DPB1, GNG11, HLA-DQA2, SPARC, HLA-DRB1, HLA-DPA1, GP9, TCL1A, HLA-DRA, LINC00926 
#>     NRGN, RGS18, HLA-DRB5, PTCRA, CD9, AP001189.4, CA2, CLU, TUBB1, ITGA2B 
#> Negative:  VIM, S100A8, S100A6, S100A4, S100A9, TMSB10, IL32, GIMAP7, LGALS2, S100A10 
#>     RBP7, FCN1, MAL, LYZ, S100A12, MS4A6A, CD2, FYB, S100A11, FOLR3 
#>     GIMAP4, AQP3, ANXA1, AIF1, MALAT1, GIMAP5, IL8, IFI6, TRABD2A, TMSB4X 
#> PC_ 5 
#> Positive:  GZMB, FGFBP2, NKG7, GNLY, PRF1, CCL4, CST7, SPON2, GZMA, CLIC3 
#>     GZMH, XCL2, CTSW, TTC38, AKR1C3, CCL5, IGFBP7, XCL1, CCL3, S100A8 
#>     TYROBP, HOPX, CD160, HAVCR2, S100A9, FCER1G, PTGDR, LGALS2, RBP7, S100A12 
#> Negative:  LTB, VIM, AQP3, PPA1, MAL, KIAA0101, CD2, CYTIP, CORO1B, FYB 
#>     IL32, TRADD, ANXA5, TUBA1B, HN1, TYMS, PTGES3, ITM2A, COTL1, GPR183 
#>     TNFAIP8, ACTG1, TRAF3IP3, ATP5C1, GIMAP4, ZWINT, PRDX1, LDLRAP1, ABRACL, NGFRAP1

pbmc_var_all <- RunPCA(object = pbmc_all, features = VariableFeatures(pbmc_all), seed.use = 42)
#> PC_ 1 
#> Positive:  MALAT1, LTB, IL32, CD2, ACAP1, STK17A, CTSW, CD247, CCL5, GIMAP5 
#>     AQP3, GZMA, CST7, TRAF3IP3, MAL, HOPX, ITM2A, GZMK, MYC, BEX2 
#>     GIMAP7, ETS1, LDLRAP1, ZAP70, LYAR, RIC3, TNFAIP8, KLRG1, SAMD3, NKG7 
#> Negative:  CST3, TYROBP, LST1, AIF1, FTL, FCN1, LYZ, FTH1, S100A9, FCER1G 
#>     TYMP, CFD, LGALS1, CTSS, S100A8, SERPINA1, LGALS2, SPI1, IFITM3, PSAP 
#>     CFP, SAT1, IFI30, COTL1, S100A11, NPC2, LGALS3, GSTP1, PYCARD, NCF2 
#> PC_ 2 
#> Positive:  CD79A, MS4A1, TCL1A, HLA-DQA1, HLA-DRA, HLA-DQB1, LINC00926, CD79B, HLA-DRB1, CD74 
#>     HLA-DPB1, HLA-DMA, HLA-DQA2, HLA-DRB5, HLA-DPA1, HLA-DMB, FCRLA, HVCN1, LTB, BLNK 
#>     KIAA0125, P2RX5, IRF8, IGLL5, SWAP70, ARHGAP24, SMIM14, PPP1R14A, FCRL2, C16orf74 
#> Negative:  NKG7, PRF1, CST7, GZMA, GZMB, FGFBP2, CTSW, GNLY, GZMH, SPON2 
#>     CCL4, FCGR3A, CCL5, CD247, XCL2, CLIC3, AKR1C3, SRGN, HOPX, CTSC 
#>     TTC38, S100A4, ANXA1, IL32, IGFBP7, ID2, ACTB, XCL1, APOBEC3G, SAMD3 
#> PC_ 3 
#> Positive:  HLA-DQA1, CD79A, CD79B, HLA-DQB1, HLA-DPB1, CD74, HLA-DPA1, MS4A1, HLA-DRB1, HLA-DRB5 
#>     HLA-DRA, HLA-DQA2, TCL1A, LINC00926, HLA-DMB, HLA-DMA, HVCN1, FCRLA, IRF8, BLNK 
#>     KIAA0125, SMIM14, PLD4, IGLL5, P2RX5, TMSB10, SWAP70, LAT2, MALAT1, IGJ 
#> Negative:  PPBP, PF4, SDPR, SPARC, GNG11, NRGN, GP9, RGS18, TUBB1, CLU 
#>     HIST1H2AC, AP001189.4, ITGA2B, CD9, TMEM40, CA2, PTCRA, ACRBP, MMD, TREML1 
#>     NGFRAP1, F13A1, RUFY1, SEPT5, MPP1, CMTM5, TSC22D1, MYL9, RP11-367G6.3, GP1BA 
#> PC_ 4 
#> Positive:  HLA-DQA1, CD79A, CD79B, HIST1H2AC, HLA-DQB1, PF4, MS4A1, SDPR, CD74, PPBP 
#>     HLA-DPB1, GNG11, HLA-DQA2, SPARC, HLA-DRB1, HLA-DPA1, GP9, TCL1A, HLA-DRA, LINC00926 
#>     NRGN, RGS18, HLA-DRB5, PTCRA, CD9, AP001189.4, CA2, CLU, TUBB1, ITGA2B 
#> Negative:  VIM, S100A8, S100A6, S100A4, S100A9, TMSB10, IL32, GIMAP7, LGALS2, S100A10 
#>     RBP7, FCN1, MAL, LYZ, S100A12, MS4A6A, CD2, FYB, S100A11, FOLR3 
#>     GIMAP4, AQP3, ANXA1, AIF1, MALAT1, GIMAP5, IL8, IFI6, TRABD2A, TMSB4X 
#> PC_ 5 
#> Positive:  GZMB, FGFBP2, NKG7, GNLY, PRF1, CCL4, CST7, SPON2, GZMA, CLIC3 
#>     GZMH, XCL2, CTSW, TTC38, AKR1C3, CCL5, IGFBP7, XCL1, CCL3, S100A8 
#>     TYROBP, HOPX, CD160, HAVCR2, S100A9, FCER1G, PTGDR, LGALS2, RBP7, S100A12 
#> Negative:  LTB, VIM, AQP3, PPA1, MAL, KIAA0101, CD2, CYTIP, CORO1B, FYB 
#>     IL32, TRADD, ANXA5, TUBA1B, HN1, TYMS, PTGES3, ITM2A, COTL1, GPR183 
#>     TNFAIP8, ACTG1, TRAF3IP3, ATP5C1, GIMAP4, ZWINT, PRDX1, LDLRAP1, ABRACL, NGFRAP1
pbmc_default_all <- RunPCA(object = pbmc_all, seed.use = 42)
#> PC_ 1 
#> Positive:  MALAT1, LTB, IL32, CD2, ACAP1, STK17A, CTSW, CD247, CCL5, GIMAP5 
#>     AQP3, GZMA, CST7, TRAF3IP3, MAL, HOPX, ITM2A, GZMK, MYC, BEX2 
#>     GIMAP7, ETS1, LDLRAP1, ZAP70, LYAR, RIC3, TNFAIP8, KLRG1, SAMD3, NKG7 
#> Negative:  CST3, TYROBP, LST1, AIF1, FTL, FCN1, LYZ, FTH1, S100A9, FCER1G 
#>     TYMP, CFD, LGALS1, CTSS, S100A8, SERPINA1, LGALS2, SPI1, IFITM3, PSAP 
#>     CFP, SAT1, IFI30, COTL1, S100A11, NPC2, LGALS3, GSTP1, PYCARD, NCF2 
#> PC_ 2 
#> Positive:  CD79A, MS4A1, TCL1A, HLA-DQA1, HLA-DRA, HLA-DQB1, LINC00926, CD79B, HLA-DRB1, CD74 
#>     HLA-DPB1, HLA-DMA, HLA-DQA2, HLA-DRB5, HLA-DPA1, HLA-DMB, FCRLA, HVCN1, LTB, BLNK 
#>     KIAA0125, P2RX5, IRF8, IGLL5, SWAP70, ARHGAP24, SMIM14, PPP1R14A, FCRL2, C16orf74 
#> Negative:  NKG7, PRF1, CST7, GZMA, GZMB, FGFBP2, CTSW, GNLY, GZMH, SPON2 
#>     CCL4, FCGR3A, CCL5, CD247, XCL2, CLIC3, AKR1C3, SRGN, HOPX, CTSC 
#>     TTC38, S100A4, ANXA1, IL32, IGFBP7, ID2, ACTB, XCL1, APOBEC3G, SAMD3 
#> PC_ 3 
#> Positive:  HLA-DQA1, CD79A, CD79B, HLA-DQB1, HLA-DPB1, CD74, HLA-DPA1, MS4A1, HLA-DRB1, HLA-DRB5 
#>     HLA-DRA, HLA-DQA2, TCL1A, LINC00926, HLA-DMB, HLA-DMA, HVCN1, FCRLA, IRF8, BLNK 
#>     KIAA0125, SMIM14, PLD4, IGLL5, P2RX5, TMSB10, SWAP70, LAT2, MALAT1, IGJ 
#> Negative:  PPBP, PF4, SDPR, SPARC, GNG11, NRGN, GP9, RGS18, TUBB1, CLU 
#>     HIST1H2AC, AP001189.4, ITGA2B, CD9, TMEM40, CA2, PTCRA, ACRBP, MMD, TREML1 
#>     NGFRAP1, F13A1, RUFY1, SEPT5, MPP1, CMTM5, TSC22D1, MYL9, RP11-367G6.3, GP1BA 
#> PC_ 4 
#> Positive:  HLA-DQA1, CD79A, CD79B, HIST1H2AC, HLA-DQB1, PF4, MS4A1, SDPR, CD74, PPBP 
#>     HLA-DPB1, GNG11, HLA-DQA2, SPARC, HLA-DRB1, HLA-DPA1, GP9, TCL1A, HLA-DRA, LINC00926 
#>     NRGN, RGS18, HLA-DRB5, PTCRA, CD9, AP001189.4, CA2, CLU, TUBB1, ITGA2B 
#> Negative:  VIM, S100A8, S100A6, S100A4, S100A9, TMSB10, IL32, GIMAP7, LGALS2, S100A10 
#>     RBP7, FCN1, MAL, LYZ, S100A12, MS4A6A, CD2, FYB, S100A11, FOLR3 
#>     GIMAP4, AQP3, ANXA1, AIF1, MALAT1, GIMAP5, IL8, IFI6, TRABD2A, TMSB4X 
#> PC_ 5 
#> Positive:  GZMB, FGFBP2, NKG7, GNLY, PRF1, CCL4, CST7, SPON2, GZMA, CLIC3 
#>     GZMH, XCL2, CTSW, TTC38, AKR1C3, CCL5, IGFBP7, XCL1, CCL3, S100A8 
#>     TYROBP, HOPX, CD160, HAVCR2, S100A9, FCER1G, PTGDR, LGALS2, RBP7, S100A12 
#> Negative:  LTB, VIM, AQP3, PPA1, MAL, KIAA0101, CD2, CYTIP, CORO1B, FYB 
#>     IL32, TRADD, ANXA5, TUBA1B, HN1, TYMS, PTGES3, ITM2A, COTL1, GPR183 
#>     TNFAIP8, ACTG1, TRAF3IP3, ATP5C1, GIMAP4, ZWINT, PRDX1, LDLRAP1, ABRACL, NGFRAP1

# Check Results
dim(Loadings(pbmc_var, reduction = "pca"))
#> [1] 2000   50
dim(Loadings(pbmc_default, reduction = "pca"))
#> [1] 2000   50
dim(Loadings(pbmc_var_all, reduction = "pca"))
#> [1] 2000   50
dim(Loadings(pbmc_default_all, reduction = "pca"))
#> [1] 2000   50

identical(Loadings(pbmc_var, reduction = "pca"), Loadings(pbmc_default, reduction = "pca"))
#> [1] TRUE
identical(Loadings(pbmc_var, reduction = "pca"), Loadings(pbmc_default_all, reduction = "pca"))
#> [1] TRUE
identical(Loadings(pbmc_var, reduction = "pca"), Loadings(pbmc_var_all, reduction = "pca"))
#> [1] TRUE

Created on 2024-05-28 with reprex v2.1.0

Session info ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.4.0 (2024-04-24) #> os macOS Monterey 12.7.5 #> system x86_64, darwin20 #> ui X11 #> language (EN) #> collate en_US.UTF-8 #> ctype en_US.UTF-8 #> tz America/New_York #> date 2024-05-28 #> pandoc 3.1.11 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/x86_64/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> abind 1.4-5 2016-07-21 [1] CRAN (R 4.4.0) #> cli 3.6.2 2023-12-11 [1] CRAN (R 4.4.0) #> cluster 2.1.6 2023-12-01 [1] CRAN (R 4.4.0) #> codetools 0.2-20 2024-03-31 [1] CRAN (R 4.4.0) #> colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.4.0) #> cowplot 1.1.3 2024-01-22 [1] CRAN (R 4.4.0) #> data.table 1.15.4 2024-03-30 [1] CRAN (R 4.4.0) #> deldir 2.0-4 2024-02-28 [1] CRAN (R 4.4.0) #> digest 0.6.35 2024-03-11 [1] CRAN (R 4.4.0) #> dotCall64 1.1-1 2023-11-28 [1] CRAN (R 4.4.0) #> dplyr * 1.1.4 2023-11-17 [1] CRAN (R 4.4.0) #> evaluate 0.23 2023-11-01 [1] CRAN (R 4.4.0) #> fansi 1.0.6 2023-12-08 [1] CRAN (R 4.4.0) #> fastDummies 1.7.3 2023-07-06 [1] CRAN (R 4.4.0) #> fastmap 1.2.0 2024-05-15 [1] CRAN (R 4.4.0) #> fitdistrplus 1.1-11 2023-04-25 [1] CRAN (R 4.4.0) #> forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.4.0) #> fs 1.6.4 2024-04-25 [1] CRAN (R 4.4.0) #> future 1.33.2 2024-03-26 [1] CRAN (R 4.4.0) #> future.apply 1.11.2 2024-03-28 [1] CRAN (R 4.4.0) #> generics 0.1.3 2022-07-05 [1] CRAN (R 4.4.0) #> ggplot2 * 3.5.1 2024-04-23 [1] CRAN (R 4.4.0) #> ggrepel 0.9.5 2024-01-10 [1] CRAN (R 4.4.0) #> ggridges 0.5.6 2024-01-23 [1] CRAN (R 4.4.0) #> globals 0.16.3 2024-03-08 [1] CRAN (R 4.4.0) #> glue 1.7.0 2024-01-09 [1] CRAN (R 4.4.0) #> goftest 1.2-3 2021-10-07 [1] CRAN (R 4.4.0) #> gridExtra 2.3 2017-09-09 [1] CRAN (R 4.4.0) #> gtable 0.3.5 2024-04-22 [1] CRAN (R 4.4.0) #> hms 1.1.3 2023-03-21 [1] CRAN (R 4.4.0) #> htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0) #> htmlwidgets 1.6.4 2023-12-06 [1] CRAN (R 4.4.0) #> httpuv 1.6.15 2024-03-26 [1] CRAN (R 4.4.0) #> httr 1.4.7 2023-08-15 [1] CRAN (R 4.4.0) #> ica 1.0-3 2022-07-08 [1] CRAN (R 4.4.0) #> igraph 2.0.3 2024-03-13 [1] CRAN (R 4.4.0) #> irlba 2.3.5.1 2022-10-03 [1] CRAN (R 4.4.0) #> jsonlite 1.8.8 2023-12-04 [1] CRAN (R 4.4.0) #> KernSmooth 2.23-24 2024-05-17 [1] CRAN (R 4.4.0) #> knitr 1.46 2024-04-06 [1] CRAN (R 4.4.0) #> later 1.3.2 2023-12-06 [1] CRAN (R 4.4.0) #> lattice 0.22-6 2024-03-20 [1] CRAN (R 4.4.0) #> lazyeval 0.2.2 2019-03-15 [1] CRAN (R 4.4.0) #> leiden 0.4.3.1 2023-11-17 [1] CRAN (R 4.4.0) #> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.4.0) #> listenv 0.9.1 2024-01-29 [1] CRAN (R 4.4.0) #> lmtest 0.9-40 2022-03-21 [1] CRAN (R 4.4.0) #> lubridate * 1.9.3 2023-09-27 [1] CRAN (R 4.4.0) #> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.4.0) #> MASS 7.3-60.2 2024-04-24 [1] local #> Matrix 1.7-0 2024-03-22 [1] CRAN (R 4.4.0) #> matrixStats 1.3.0 2024-04-11 [1] CRAN (R 4.4.0) #> mime 0.12 2021-09-28 [1] CRAN (R 4.4.0) #> miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.4.0) #> munsell 0.5.1 2024-04-01 [1] CRAN (R 4.4.0) #> nlme 3.1-164 2023-11-27 [1] CRAN (R 4.4.0) #> parallelly 1.37.1 2024-02-29 [1] CRAN (R 4.4.0) #> patchwork 1.2.0 2024-01-08 [1] CRAN (R 4.4.0) #> pbapply 1.7-2 2023-06-27 [1] CRAN (R 4.4.0) #> pbmc3k.SeuratData 3.1.4 2024-05-01 [1] local #> pillar 1.9.0 2023-03-22 [1] CRAN (R 4.4.0) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.4.0) #> plotly 4.10.4 2024-01-13 [1] CRAN (R 4.4.0) #> plyr 1.8.9 2023-10-02 [1] CRAN (R 4.4.0) #> png 0.1-8 2022-11-29 [1] CRAN (R 4.4.0) #> polyclip 1.10-6 2023-09-27 [1] CRAN (R 4.4.0) #> progressr 0.14.0 2023-08-10 [1] CRAN (R 4.4.0) #> promises 1.3.0 2024-04-05 [1] CRAN (R 4.4.0) #> purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.4.0) #> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.4.0) #> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.4.0) #> R.oo 1.26.0 2024-01-24 [1] CRAN (R 4.4.0) #> R.utils 2.12.3 2023-11-18 [1] CRAN (R 4.4.0) #> R6 2.5.1 2021-08-19 [1] CRAN (R 4.4.0) #> RANN 2.6.1 2019-01-08 [1] CRAN (R 4.4.0) #> RColorBrewer 1.1-3 2022-04-03 [1] CRAN (R 4.4.0) #> Rcpp 1.0.12 2024-01-09 [1] CRAN (R 4.4.0) #> RcppAnnoy 0.0.22 2024-01-23 [1] CRAN (R 4.4.0) #> RcppHNSW 0.6.0 2024-02-04 [1] CRAN (R 4.4.0) #> readr * 2.1.5 2024-01-10 [1] CRAN (R 4.4.0) #> reprex 2.1.0 2024-01-11 [1] CRAN (R 4.4.0) #> reshape2 1.4.4 2020-04-09 [1] CRAN (R 4.4.0) #> reticulate 1.37.0 2024-05-21 [1] CRAN (R 4.4.0) #> rlang 1.1.3 2024-01-10 [1] CRAN (R 4.4.0) #> rmarkdown 2.27 2024-05-17 [1] CRAN (R 4.4.0) #> ROCR 1.0-11 2020-05-02 [1] CRAN (R 4.4.0) #> RSpectra 0.16-1 2022-04-24 [1] CRAN (R 4.4.0) #> rstudioapi 0.16.0 2024-03-24 [1] CRAN (R 4.4.0) #> Rtsne 0.17 2023-12-07 [1] CRAN (R 4.4.0) #> scales 1.3.0 2023-11-28 [1] CRAN (R 4.4.0) #> scattermore 1.2 2023-06-12 [1] CRAN (R 4.4.0) #> sctransform 0.4.1 2023-10-19 [1] CRAN (R 4.4.0) #> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.4.0) #> Seurat * 5.1.0 2024-05-10 [1] CRAN (R 4.4.0) #> SeuratObject * 5.0.2 2024-05-08 [1] CRAN (R 4.4.0) #> shiny 1.8.1.1 2024-04-02 [1] CRAN (R 4.4.0) #> sp * 2.1-4 2024-04-30 [1] CRAN (R 4.4.0) #> spam 2.10-0 2023-10-23 [1] CRAN (R 4.4.0) #> spatstat.data 3.0-4 2024-01-15 [1] CRAN (R 4.4.0) #> spatstat.explore 3.2-7 2024-03-21 [1] CRAN (R 4.4.0) #> spatstat.geom 3.2-9 2024-02-28 [1] CRAN (R 4.4.0) #> spatstat.random 3.2-3 2024-02-29 [1] CRAN (R 4.4.0) #> spatstat.sparse 3.0-3 2023-10-24 [1] CRAN (R 4.4.0) #> spatstat.utils 3.0-4 2023-10-24 [1] CRAN (R 4.4.0) #> stringi 1.8.4 2024-05-06 [1] CRAN (R 4.4.0) #> stringr * 1.5.1 2023-11-14 [1] CRAN (R 4.4.0) #> styler 1.10.3 2024-04-07 [1] CRAN (R 4.4.0) #> survival 3.6-4 2024-04-24 [1] CRAN (R 4.4.0) #> tensor 1.5 2012-05-05 [1] CRAN (R 4.4.0) #> tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.4.0) #> tidyr * 1.3.1 2024-01-24 [1] CRAN (R 4.4.0) #> tidyselect 1.2.1 2024-03-11 [1] CRAN (R 4.4.0) #> tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.4.0) #> timechange 0.3.0 2024-01-18 [1] CRAN (R 4.4.0) #> tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.4.0) #> utf8 1.2.4 2023-10-22 [1] CRAN (R 4.4.0) #> uwot 0.2.2 2024-04-21 [1] CRAN (R 4.4.0) #> vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.4.0) #> viridisLite 0.4.2 2023-05-02 [1] CRAN (R 4.4.0) #> withr 3.0.0 2024-01-16 [1] CRAN (R 4.4.0) #> xfun 0.44 2024-05-15 [1] CRAN (R 4.4.0) #> xtable 1.8-4 2019-04-21 [1] CRAN (R 4.4.0) #> yaml 2.3.8 2023-12-11 [1] CRAN (R 4.4.0) #> zoo 1.8-12 2023-04-13 [1] CRAN (R 4.4.0) #> #> [1] /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/library #> #> ────────────────────────────────────────────────────────────────────────────── ```
samuel-marsh commented 3 months ago

Also this line in source code indicates why this is the case. Because if default is left unchanged features = NULL then Seurat will take the variable features to use as features using same command: https://github.com/satijalab/seurat/blob/1549dcb3075eaeac01c925c4b4bb73c73450fc50/R/dimensional_reduction.R#L2394

bellenger-l commented 3 months ago

Thanks @samuel-marsh ! I will update my version and see if I overcome this issue !

Seurat will take the variable features to use as features using same command

I agree ! This is why, I didn't understand why I had different results for the same asked task :)

Thanks a lot for your help, If upgrading Seurat to 5.1.0 solved this, I will close this issue !

bellenger-l commented 3 months ago

Seurat 5.1.0 and SeuratObject 5.0.2 didn't fix the issue, maybe it's a R version problem but I work on a server, so I let the issue opened if somebody have another solution :)

samuel-marsh commented 3 months ago

@bellenger-l have you been testing use SeuratData pbmc3k data (as in my code above) or using your own object?

Best, Sam

longmanz commented 2 months ago

Hi @bellenger-l, We are also not able to replicate this issue from our end (many thanks to @samuel-marsh 's help). Please let us know if the issue still persists if you use pbmc3k to just your own object. Also, we will be closing this issue for now since there is no update in three weeks. Please feel free to re-open the issue and share us any update you have. Thank you!