florekem opened this issue 1 month ago (status: Open)
If I run the same query with the default reference:
ref <- load.reference.map()
ref
it works?
> seurat_pre_t <- ProjecTILs.classifier(
+ seurat_pre_t, ref,
+ split.by = "patient_id" # For optimal batch-effect correction, we recommend projecting each patient/batch separately (split.by) #nolint
+ )
| | 0%[1] "Using assay RNA for query"
Normalizing layer: counts
Performing log-normalization
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Pre-filtering cells with scGate...
### Detected a total of 986 pure 'Target' cells (98.60% of total)
[1] "14 out of 1000 ( 1% ) non-pure cells removed. Use filter.cells=FALSE to avoid pre-filtering"
[1] "Transforming expression matrix into space of orthologs"
[1] "Aligning query to reference map for batch-correction..."
Warning: Layer counts isn't present in the assay object[[assay]]; returning NULL
Warning: Layer counts isn't present in the assay object[[assay]]; returning NULL
Preparing PCA embeddings for objects...
Warning: Number of dimensions changing from 50 to 15
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=08s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=06s
Projecting corrected query onto Reference PCA space
Projecting corrected query onto Reference UMAP space
|======================================================================| 100%
Creating slots functional.cluster and functional.cluster.conf in query object
| | 0%[1] "Using assay RNA for query"
Normalizing layer: counts
Performing log-normalization
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Pre-filtering cells with scGate...
### Detected a total of 968 pure 'Target' cells (96.80% of total)
[1] "32 out of 1000 ( 3% ) non-pure cells removed. Use filter.cells=FALSE to avoid pre-filtering"
[1] "Transforming expression matrix into space of orthologs"
[1] "Aligning query to reference map for batch-correction..."
Warning: Layer counts isn't present in the assay object[[assay]]; returning NULL
Warning: Layer counts isn't present in the assay object[[assay]]; returning NULL
Preparing PCA embeddings for objects...
Warning: Number of dimensions changing from 50 to 15
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=08s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=06s
Projecting corrected query onto Reference PCA space
Projecting corrected query onto Reference UMAP space
|======================================================================| 100%
Creating slots functional.cluster and functional.cluster.conf in query object
| | 0%[1] "Using assay RNA for query"
Normalizing layer: counts
Performing log-normalization
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Pre-filtering cells with scGate...
^C
Hello Maciej,
Your code is a bit different from that of the tutorial. How did you change the object prior to running ProjecTILs? Does it work with the original code?
Hi Massimo, thank you for your fast response! I believe the only things I've changed are variable names, but I will look into it and run the original code as well.
Meanwhile, here is my code: https://github.com/florekem/singleCell_LAB/blob/main/scREP_projectTIL_tut/scripts/script2.R
Hello!
I've rerun the script with the original code, but nothing has changed. What am I doing wrong?
Actually, I get a similar error with the https://carmonalab.github.io/ProjecTILs_CaseStudies/Xiong19_TCR.html case study, but there I modified the code because I found the guide outdated. So maybe it is a package problem. On the other hand, the "Test the package" code works correctly from what I can tell (without errors). Thank you for your time!
code
library(patchwork)
library(ggplot2)
library(reshape2)
library(Seurat)
library(ProjecTILs)
options(timeout = 5000)
ddir <- "input/Bassez_breast_data"
dir.create(ddir)
dataUrl <- "https://figshare.com/ndownloader/files/47900725"
download.file(dataUrl, paste0(ddir, "/tmp.zip"))
unzip(paste0(ddir, "/tmp.zip"), exdir = ddir)
file.remove(paste0(ddir, "/tmp.zip"))
# Count matrices
f1 <- sprintf("%s/1864-counts_tcell_cohort1.rds", ddir)
cohort1 <- readRDS(f1)
dim(cohort1)
meta1 <- read.csv(sprintf("%s/1870-BIOKEY_metaData_tcells_cohort1_web.csv", ddir))
rownames(meta1) <- meta1$Cell
data.seurat <- CreateSeuratObject(cohort1, project = "Cohort1_IT", meta.data = meta1)
data.seurat <- NormalizeData(data.seurat)
data.seurat <- subset(data.seurat, subset = timepoint == "Pre")
data.seurat <- subset(data.seurat, subset = cellSubType %in% c(
  "NK_REST", "Vg9Vd2_gdT",
  "gdT", "NK_CYTO"
), invert = TRUE)
ds <- 1000
min.cells <- 200
tab <- table(data.seurat$patient_id)
keep <- names(tab)[tab > min.cells]
data.seurat <- subset(data.seurat, patient_id %in% keep)
Idents(data.seurat) <- "patient_id"
data.seurat <- subset(data.seurat, cells = WhichCells(data.seurat, downsample = ds))
table(data.seurat$patient_id)
download.file("https://figshare.com/ndownloader/files/41414556", destfile = "CD8T_human_ref_v1.rds")
ref.cd8 <- load.reference.map("CD8T_human_ref_v1.rds")
ncores <- 8
DefaultAssay(ref.cd8) <- "integrated"
data.seurat <- ProjecTILs.classifier(data.seurat, ref.cd8, ncores = ncores, split.by = "patient_id")
R console
> dir.create(ddir)
> dataUrl <- "https://figshare.com/ndownloader/files/47900725"
> download.file(dataUrl, paste0(ddir, "/tmp.zip"))
trying URL 'https://figshare.com/ndownloader/files/47900725'
Content type 'application/zip' length 161147319 bytes (153.7 MB)
==================================================
downloaded 153.7 MB
> unzip(paste0(ddir, "/tmp.zip"), exdir = ddir)
> file.remove(paste0(ddir, "/tmp.zip"))
[1] TRUE
> # Count matrices
> f1 <- sprintf("%s/1864-counts_tcell_cohort1.rds", ddir)
> cohort1 <- readRDS(f1)
> dim(cohort1)
[1] 25288 53382
> meta1 <- read.csv(sprintf("%s/1870-BIOKEY_metaData_tcells_cohort1_web.csv", ddir))
> rownames(meta1) <- meta1$Cell
> data.seurat <- CreateSeuratObject(cohort1, project = "Cohort1_IT", meta.data = meta1)
> data.seurat <- NormalizeData(data.seurat)
Normalizing layer: counts
Performing log-normalization
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
> data.seurat <- subset(data.seurat, subset = timepoint == "Pre")
> data.seurat <- subset(data.seurat, subset = cellSubType %in% c(
+ "NK_REST", "Vg9Vd2_gdT",
+ "gdT", "NK_CYTO"
+ ), invert = T)
> ds <- 1000
> min.cells <- 200
> tab <- table(data.seurat$patient_id)
> keep <- names(tab)[tab > min.cells]
> data.seurat <- subset(data.seurat, patient_id %in% keep)
> Idents(data.seurat) <- "patient_id"
> data.seurat <- subset(data.seurat, cells = WhichCells(data.seurat, downsample = ds))
> table(data.seurat$patient_id)
BIOKEY_1 BIOKEY_10 BIOKEY_11 BIOKEY_12 BIOKEY_13 BIOKEY_14 BIOKEY_15 BIOKEY_16
1000 1000 555 1000 1000 939 1000 1000
BIOKEY_19 BIOKEY_2 BIOKEY_21 BIOKEY_24 BIOKEY_27 BIOKEY_28 BIOKEY_3 BIOKEY_31
1000 960 203 347 601 594 266 349
BIOKEY_4 BIOKEY_5 BIOKEY_6
1000 1000 616
> download.file("https://figshare.com/ndownloader/files/41414556", destfile = "CD8T_human_ref_v1.rds")
trying URL 'https://figshare.com/ndownloader/files/41414556'
Content type 'application/octet-stream' length 257797783 bytes (245.9 MB)
==================================================
downloaded 245.9 MB
> ref.cd8 <- load.reference.map("CD8T_human_ref_v1.rds")
[1] "Loading Custom Reference Atlas..."
[1] "Loaded Custom Reference map Human CD8 TILs"
> ncores <- 8
> DefaultAssay(ref.cd8) <- "integrated"
> data.seurat <- ProjecTILs.classifier(data.seurat, ref.cd8, ncores = ncores, split.by = "patient_id")
| | 0%[1] "Using assay RNA for query"
Normalizing layer: counts
Performing log-normalization
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Pre-filtering cells with scGate...
### Detected a total of 143 pure 'Target' cells (41.21% of total)
[1] "204 out of 347 ( 59% ) non-pure cells removed. Use filter.cells=FALSE to avoid pre-filtering"
[1] "Aligning query to reference map for batch-correction..."
Warning: Layer counts isn't present in the assay object[[assay]]; returning NULL
Preparing PCA embeddings for objects...
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=15s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=04s
Projecting corrected query onto Reference PCA space
Projecting corrected query onto Reference UMAP space
|======================================================================| 100%
Stop worker failed with the error: wrong args for environment subassignment
Error: BiocParallel errors
0 remote errors, element index:
19 unevaluated and other errors
first remote error:
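The log above stops at "first remote error:" without showing the underlying message, so the real failure is hidden behind the BiocParallel wrapper. One way to surface it is to run the per-patient step serially and capture each error. The helper below is a minimal sketch in base R; `run_each` is a hypothetical name and not part of ProjecTILs.

```r
# Run `fn` on each element one at a time and record either "ok"
# or the error message raised for that element.
run_each <- function(items, fn) {
  vapply(items, function(x) {
    tryCatch({
      fn(x)
      "ok"
    }, error = function(e) conditionMessage(e))
  }, character(1))
}

# Toy demonstration: log() fails on the character input only.
status <- run_each(list(a = 1, b = "x", c = 3), function(x) log(x))
print(status)
```

With the objects from the script, an untested call along these lines might show which patient fails and why: `run_each(SplitObject(data.seurat, split.by = "patient_id"), function(x) make.projection(x, ref.cd8))`.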
That's very odd, I have your same package versions and your code runs fine on my machine.
Since you mentioned that you don't get the same error with the default mouse reference, it's probably not a package version issue; I would narrow it down to the reference object itself. Can you try to download it again, and perhaps run ref.cd8 <- UpdateSeuratObject(ref.cd8) to see if that fixes any slot issues?
I don't know what is going on, but yes, it works with 'ref_TILAtlas_mouse_v1.rds'.
On the other hand, when I try the default mouse atlas with the Xiong19_TCR case study, it doesn't work there and fails with the same error. I've tried it on another machine with the same results (data downloaded independently multiple times).
The tutorial also works.
Error in `[<-.data.frame`(`*tmp*`, , i, value = c(NA, NA, NA, NA, NA, : replacement has 25288 rows, data has 33156
I am getting this error message as well - the first number (25288) is the number of genes in your query dataset, and the second (33156) is the total number of unique genes between your query and reference sets combined. The only workaround I can find is to subset the genes in both your reference and query datasets to be identical, but a lot of information gets lost. Does this point to any particular mistake I might be making?
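The wording of that message comes from base R's data-frame assignment, which refuses a replacement column whose length does not fit the number of rows. The toy sketch below reproduces the same kind of mismatch with made-up gene names and shows the intersection workaround described above. It is only an illustration of the base R behavior; how ProjecTILs reaches this point internally is not shown in this thread.

```r
# Made-up gene sets standing in for the query and reference features.
query_genes <- c("CD8A", "CD3E", "GZMB")
ref_genes   <- c("CD3E", "GZMB", "PDCD1", "TOX")

all_genes <- union(query_genes, ref_genes)      # 5 genes in total
shared    <- intersect(query_genes, ref_genes)  # 2 genes in common

# A table with one row per gene in the union.
gene_table <- data.frame(in_ref = all_genes %in% ref_genes,
                         row.names = all_genes)

# Assigning a query-length column (3 values) into a 5-row table fails.
msg <- tryCatch({
  gene_table[, "query_val"] <- rep(NA_real_, length(query_genes))
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Restricting both sides to the shared genes makes the lengths agree,
# at the cost of dropping every gene that is not in both sets.
shared_table <- gene_table[shared, , drop = FALSE]
shared_table[, "query_val"] <- rep(NA_real_, length(shared))
```

In this toy case the first number in the message is the query length and the second is the size of the union, which matches the interpretation of 25288 and 33156 given above.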
Commenting here as I'm also facing a similar error that I didn't have before. I'm using Seurat5 and R version 4.3.1.
I'd like to help solve this, but I haven't been able to reproduce the behavior, even with the same package versions as the original poster.
Do you get the same error if you run the projection function instead of the wrapper?
data.proj <- make.projection(data.seurat, ref.cd8)
Could someone else also post their sessionInfo() to confirm?
I've also just tried converting to a Seurat4 assay to see if it would work, but the same error message persists.
make.projection() returned the same error message. I'm using ref_LCMV_Atlas_mouse_v1 as my reference here.
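A full sessionInfo() dump is long, so when comparing environments it may be easier to report only the packages that seem relevant to this error. The sketch below uses base R only; `report_versions` is a hypothetical helper name, and the package list is a guess at what matters here.

```r
# Return the installed version of each package, or NA if it is not installed.
report_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
}

# Packages assumed to be relevant to this issue.
pkgs <- c("Seurat", "SeuratObject", "ProjecTILs", "STACAS",
          "scGate", "BiocParallel")
print(report_versions(pkgs))
```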
code tried:
obj <- as(obj[["RNA"]], Class = "Assay")
test <- make.projection(obj, ref.cd8, query.assay = "RNA")
error message:
Error: BiocParallel errors
1 remote errors, element index: 1
0 unevaluated and other errors
first remote error:
Error in slot(object = object, name = s): no slot of name "images" for this object of class "Seurat"
sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS: /config/binaries/R/4.3.1.Core/lib64/R/lib/libRblas.so
LAPACK: /config/binaries/R/4.3.1.Core/lib64/R/lib/libRlapack.so; LAPACK version 3.11.0
locale:
[1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C LC_TIME=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8
[5] LC_MONETARY=en_AU.UTF-8 LC_MESSAGES=en_AU.UTF-8 LC_PAPER=en_AU.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
time zone: Australia/Melbourne
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] RColorBrewer_1.1-3 UCell_2.8.0 umap_0.2.10.0 ProjecTILs_3.4.2
[5] ggsci_3.2.0 scGate_1.6.2 cowplot_1.1.3 clustree_0.5.1
[9] ggraph_2.2.1 ggpubr_0.6.0 patchwork_1.2.0 scDblFinder_1.16.0
[13] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 Biobase_2.62.0 GenomicRanges_1.54.1
[17] GenomeInfoDb_1.38.1 IRanges_2.36.0 S4Vectors_0.40.2 BiocGenerics_0.48.1
[21] MatrixGenerics_1.14.0 matrixStats_1.3.0 Seurat_5.1.0 SeuratObject_5.0.2
[25] sp_2.1-4 lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1
[29] dplyr_1.1.4 purrr_1.0.2 readr_2.1.5 tidyr_1.3.1
[33] tibble_3.2.1 ggplot2_3.5.1 tidyverse_2.0.0
loaded via a namespace (and not attached):
[1] spatstat.sparse_3.1-0 bitops_1.0-7 httr_1.4.7 tools_4.3.1
[5] sctransform_0.4.1 backports_1.5.0 utf8_1.2.4 R6_2.5.1
[9] lazyeval_0.2.2 uwot_0.2.2 withr_3.0.0 gridExtra_2.3
[13] progressr_0.14.0 cli_3.6.3 spatstat.explore_3.2-7 fastDummies_1.7.3
[17] dittoSeq_1.14.3 spatstat.data_3.1-2 ggridges_0.5.6 pbapply_1.7-2
[21] askpass_1.2.0 Rsamtools_2.18.0 R.utils_2.12.3 scater_1.30.1
[25] parallelly_1.37.1 limma_3.58.1 rstudioapi_0.16.0 generics_0.1.3
[29] BiocIO_1.12.0 ica_1.0-3 spatstat.random_3.2-3 car_3.1-2
[33] Matrix_1.6-4 ggbeeswarm_0.7.2 fansi_1.0.6 abind_1.4-5
[37] R.methodsS3_1.8.2 lifecycle_1.0.4 yaml_2.3.8 edgeR_4.0.16
[41] carData_3.0-5 SparseArray_1.2.2 Rtsne_0.17 grid_4.3.1
[45] promises_1.3.0 dqrng_0.4.1 crayon_1.5.3 miniUI_0.1.1.1
[49] lattice_0.21-8 beachmat_2.18.1 pillar_1.9.0 knitr_1.47
[53] metapod_1.10.1 rjson_0.2.21 xgboost_1.7.7.1 future.apply_1.11.2
[57] codetools_0.2-19 leiden_0.4.3.1 glue_1.7.0 data.table_1.15.4
[61] vctrs_0.6.5 png_0.1-8 spam_2.10-0 gtable_0.3.5
[65] cachem_1.1.0 xfun_0.45 S4Arrays_1.2.0 mime_0.12
[69] tidygraph_1.3.1 pracma_2.4.4 survival_3.5-5 pheatmap_1.0.12
[73] statmod_1.5.0 bluster_1.12.0 fitdistrplus_1.1-11 ROCR_1.0-11
[77] nlme_3.1-162 RcppAnnoy_0.0.22 irlba_2.3.5.1 vipor_0.4.7
[81] KernSmooth_2.23-21 STACAS_2.2.2 colorspace_2.1-0 tidyselect_1.2.1
[85] compiler_4.3.1 BiocNeighbors_1.20.2 DelayedArray_0.28.0 plotly_4.10.4
[89] rtracklayer_1.62.0 scales_1.3.0 lmtest_0.9-40 digest_0.6.36
[93] goftest_1.2-3 spatstat.utils_3.0-5 rmarkdown_2.27 XVector_0.42.0
[97] htmltools_0.5.8.1 pkgconfig_2.0.3 sparseMatrixStats_1.14.0 fastmap_1.2.0
[101] rlang_1.1.4 htmlwidgets_1.6.4 shiny_1.8.1.1 DelayedMatrixStats_1.24.0
[105] farver_2.1.2 zoo_1.8-12 jsonlite_1.8.8 BiocParallel_1.36.0
[109] R.oo_1.26.0 BiocSingular_1.18.0 RCurl_1.98-1.14 magrittr_2.0.3
[113] scuttle_1.12.0 GenomeInfoDbData_1.2.11 dotCall64_1.1-1 munsell_0.5.1
[117] Rcpp_1.0.12 viridis_0.6.5 reticulate_1.38.0 stringi_1.8.4
[121] zlibbioc_1.48.0 MASS_7.3-60 plyr_1.8.9 parallel_4.3.1
[125] listenv_0.9.1 ggrepel_0.9.5 deldir_2.0-4 Biostrings_2.70.1
[129] graphlayouts_1.1.0 splines_4.3.1 tensor_1.5 hms_1.1.3
[133] locfit_1.5-9.9 igraph_2.0.3 spatstat.geom_3.2-9 ggsignif_0.6.4
[137] RcppHNSW_0.6.0 reshape2_1.4.4 ScaledMatrix_1.8.1 XML_3.99-0.15
[141] evaluate_0.24.0 scran_1.28.2 tzdb_0.4.0 tweenr_2.0.2
[145] httpuv_1.6.15 RANN_2.6.1 openssl_2.2.0 polyclip_1.10-6
[149] future_1.33.2 scattermore_1.2 ggforce_0.4.2 rsvd_1.0.5
[153] broom_1.0.6 xtable_1.8-4 restfulr_0.0.15 RSpectra_0.16-1
[157] rstatix_0.7.2 later_1.3.2 viridisLite_0.4.2 memoise_2.0.1
[161] beeswarm_0.4.0 GenomicAlignments_1.38.0 cluster_2.1.4 timechange_0.3.0
[165] globals_0.16.3
Wanted to add this - following Workaround#1 in this issue: https://github.com/satijalab/seurat/issues/7691
Code tried with ref_LCMV_Atlas_mouse_v1 (this now works):
obj@images <- list()
ref.cd8@images <- list()
obj <- as(obj[["RNA"]], Class = "Assay")
test <- make.projection(obj, ref.cd8, query.assay = "RNA")
This doesn't work when I use my own reference generated with make.reference():
obj@images <- list()
own.ref@images <- list()
test_other <- make.projection(obj, own.ref, query.assay = "RNA")
error message:
| | 0%[1] "Using assay RNA for query"
Normalizing layer: counts
Performing log-normalization
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Pre-filtering cells with scGate...
No scGate model specified: all cells will be projected
[1] "Warning! more than 20% of variable genes not found in the query"
[1] "Aligning query to reference map for batch-correction..."
Preparing PCA embeddings for objects...
Warning: Number of dimensions changing from 50 to 20
Warning: Number of dimensions changing from 50 to 20
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=03s
Projecting corrected query onto Reference PCA space
Projecting corrected query onto Reference UMAP space
|===============================================================================================================================| 100%
Error: BiocParallel errors
1 remote errors, element index: 1
0 unevaluated and other errors
first remote error:
Error in `[<-.data.frame`(`*tmp*`, , i, value = c(NA, NA, NA, NA, NA, : replacement has 8640 rows, data has 13913
These are the dim() outputs for each Seurat object:
> dim(ref.cd8)
[1] 27998 7000
> dim(own.ref)
[1] 13638 2596
> dim(obj)
[1] 8640 1912
Thanks again for reporting.
I have finally been able to make this error appear. It happens when, for some reason, the assays of the reference maps are set to the Assay5 type. Can you try to set the assay type of the reference to Assay:
assays <- Assays(ref)
for (a in assays) {
ref[[a]] <- as(ref[[a]], Class="Assay")
}
Does this fix the problem? -m
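Before and after running the loop above, it may help to confirm which class each assay actually has. The sketch below assumes the SeuratObject package is installed; `assay_classes` is a hypothetical helper name, not part of Seurat or ProjecTILs.

```r
library(SeuratObject)

# Return the primary class of every assay in a Seurat object,
# as a character vector named by assay.
assay_classes <- function(obj) {
  vapply(Assays(obj), function(a) class(obj[[a]])[1], character(1))
}
```

Calling `assay_classes(ref)` on the reference should list "Assay5" for any assay that still needs converting, and "Assay" once the conversion has been applied.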
Hmm, make.projection() works, but Run.ProjecTILs() does not, @mass-a.
code tried:
obj[["RNA"]] <- as(obj[["RNA"]], Class = "Assay")
ref[["RNA"]]<- as(ref[["RNA"]], Class = "Assay")
test_other <- make.projection(obj, ref, query.assay = "RNA") #works
test_main_fxn <- Run.ProjecTILs(query = obj, ref = ref, skip.normalize = TRUE, reduction = "pca") # doesn't work
error message from test_main_fxn <- Run.ProjecTILs(query = obj, ref = ref, skip.normalize = TRUE, reduction = "pca"):
Error: BiocParallel errors
1 remote errors, element index: 1
0 unevaluated and other errors
first remote error:
Error in UseMethod(generic = "JoinLayers", object = object): no applicable method for 'JoinLayers' applied to an object of class "c('Assay', 'KeyMixin')"
I'm following the Bassez tutorial. However, during the classification step I encounter the following error:
I've removed the ncores parameter, but the BiocParallel error still persists. I've seen a resolved issue with "x rows, data has x" that was probably caused by v5 compatibility, but since this is pre-prepared data, I don't suspect v5 problems here.
This also seems like a duplicate of #91, but I am using Seurat 5.1.