Closed yueli8 closed 6 months ago
Hey Yue,
Would you mind making a reproducible example for this issue?
It looks like the function works for at least one element of combinedTCR - but not all - so the issue may be in the list elements. Do all the elements of the combinedTCR have contig information?
Thanks, Nick
Hello Nike,
Thank you so much for your quick response!
Unfortunately, it still does not work.
I worked with scRepertoire
on version 2.0.3
. It works well onpositionalEntropy
, but does
not on positionalProperty
.
Thank you in advance for your great help!
Please see attached files.
Best,
Yue
rm(list=ls())
library(immunarch)
library(scRepertoire)
library(tidyverse)
library(SingleCellExperiment)
library(Seurat)
library(powerTCR)
read_mixcr_n_trans<- function(file,...){
df <- read.delim(file)
# transform the cell id {be consist with RNA data}
{
well<- gsub(df$tagValueCELL,pattern = "[AGCT]*-",replacement = "")
hp<- gsub(df$tagValueCELL,pattern = "-[AGCT]*-.*",replacement = "")
rt<- str_extract(df$tagValueCELL,pattern = "-[AGCT]{10}")%>%
str_sub(.,start = 2,end = nchar(.))
}
# add a column named "tagValueCELL" {as normal mixcr output}
df$tagValueCELL<- paste(well,hp,rt,sep ="_" )
return(df)
}
#four samples# Step1: Load MIXCR output -----------------------------------------
TIL_4095_1 <- "TIL_4095_101.csv"
TIL_4095_2 <- "TIL_4095_201.csv"
TIL_4095_3 <- "TIL_4095_301.csv"
TIL_4095_4 <- "TIL_4095_401.csv"
filelist <- c(TIL_4095_1,TIL_4095_2,TIL_4095_3,TIL_4095_4)
samples <- c("S1","S2","S3","S4")
contig_list<- lapply(filelist, function(x) read_mixcr_n_trans(x))
colnames(contig_list[[1]])
# convert to "scRepertoire" style
contig.list <- loadContigs(contig_list, format = "MiXCR")
#combineTCR
combined.TCR <- combineTCR(contig.list,
samples = samples,
removeNA = FALSE,
removeMulti = FALSE,
filterMulti = FALSE)
positionalEntropy(combined.TCR,
chain = "TRB",
aa.length = 20)
#positionalProperty
positionalProperty(combined.TCR[c(1,2)],
chain = "TRB",
aa.length = 20,
method = "Atchley") +
scale_color_manual(values = hcl.colors(5, "inferno"))
Error in propertyFunc[char, ] : subscript out of bounds
In addition: Warning message:
In `[<-.factor`(`*tmp*`, thisvar, value = 0) :
invalid factor level, NA generated
sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=zh_CN.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=zh_CN.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=zh_CN.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C
time zone: Asia/Shanghai
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] powerTCR_1.20.0 SeuratObject_4.1.4 Seurat_4.4.0 SingleCellExperiment_1.22.0
[5] SummarizedExperiment_1.30.2 GenomicRanges_1.52.1 GenomeInfoDb_1.36.4 IRanges_2.36.0
[9] S4Vectors_0.40.2 MatrixGenerics_1.12.3 matrixStats_1.3.0 lubridate_1.9.3
[13] forcats_1.0.0 stringr_1.5.1 purrr_1.0.2 readr_2.1.5
[17] tidyr_1.3.1 tibble_3.2.1 tidyverse_2.0.0 scRepertoire_2.0.3
[21] immunarch_0.9.1 patchwork_1.2.0 data.table_1.15.4 dtplyr_1.3.1
[25] dplyr_1.1.4 ggplot2_3.5.1 bigmemory_4.6.4 Biobase_2.60.0
[29] BiocGenerics_0.48.1
loaded via a namespace (and not attached):
[1] spatstat.sparse_3.0-3 bitops_1.0-7 httr_1.4.7 RColorBrewer_1.1-3 doParallel_1.0.17
[6] prabclus_2.3-3 tools_4.3.0 sctransform_0.4.1 backports_1.4.1 vegan_2.6-4
[11] utf8_1.2.4 R6_2.5.1 mgcv_1.9-1 lazyeval_0.2.2 uwot_0.2.2
[16] permute_0.9-7 GetoptLong_1.0.5 withr_3.0.0 sp_2.1-4 gridExtra_2.3
[21] progressr_0.14.0 quantreg_5.97 cli_3.6.2 factoextra_1.0.7 spatstat.explore_3.2-7
[26] iNEXT_3.0.1 network_1.18.2 labeling_0.4.3 diptest_0.77-1 robustbase_0.99-2
[31] spatstat.data_3.0-4 ggridges_0.5.6 pbapply_1.7-2 systemfonts_1.0.6 svglite_2.1.3
[36] stringdist_0.9.12 parallelly_1.37.1 readxl_1.4.3 VGAM_1.1-10 rstudioapi_0.16.0
[41] FNN_1.1.4 generics_0.1.3 shape_1.4.6.1 ica_1.0-3 spatstat.random_3.2-3
[46] car_3.1-2 Matrix_1.6-1.1 fansi_1.0.6 abind_1.4-5 lifecycle_1.0.4
[51] carData_3.0-5 Rtsne_0.17 grid_4.3.0 promises_1.3.0 crayon_1.5.2
[56] miniUI_0.1.1.1 lattice_0.22-6 cowplot_1.1.3 sna_2.7-2 pillar_1.9.0
[61] ComplexHeatmap_2.16.0 rjson_0.2.21 fpc_2.2-12 CellChat_1.6.1 future.apply_1.11.2
[66] codetools_0.2-20 fastmatch_1.1-4 leiden_0.4.3.1 glue_1.7.0 vctrs_0.6.5
[71] png_0.1-8 cellranger_1.1.0 gtable_0.3.5 kernlab_0.9-32 cachem_1.0.8
[76] S4Arrays_1.2.1 mime_0.12 tidygraph_1.3.1 coda_0.19-4.1 survival_3.6-4
[81] pheatmap_1.0.12 iterators_1.0.14 shinythemes_1.2.0 fitdistrplus_1.1-11 ROCR_1.0-11
[86] nlme_3.1-164 RcppAnnoy_0.0.22 evd_2.3-7 UpSetR_1.4.0 irlba_2.3.5.1
[91] KernSmooth_2.23-22 colorspace_2.1-0 nnet_7.3-19 phangorn_2.11.1 tidyselect_1.2.1
[96] compiler_4.3.0 BiocNeighbors_1.18.0 SparseM_1.81 ggdendro_0.2.0 DelayedArray_0.26.7
[101] plotly_4.10.4 scales_1.3.0 DEoptimR_1.1-3 lmtest_0.9-40 quadprog_1.5-8
[106] NMF_0.27 digest_0.6.35 goftest_1.2-3 spatstat.utils_3.0-4 XVector_0.40.0
[111] htmltools_0.5.8.1 pkgconfig_2.0.3 fastmap_1.1.1 rlang_1.1.3 GlobalOptions_0.1.2
[116] htmlwidgets_1.6.4 shiny_1.8.1.1 farver_2.1.1 zoo_1.8-12 jsonlite_1.8.8
[121] BiocParallel_1.34.2 mclust_6.1.1 statnet.common_4.9.0 RCurl_1.98-1.14 rlist_0.4.6.2
[126] magrittr_2.0.3 modeltools_0.2-23 GenomeInfoDbData_1.2.10 ggnetwork_0.5.13 evmix_2.12
[131] munsell_0.5.1 Rcpp_1.0.12 ape_5.8 viridis_0.6.5 reticulate_1.36.1
[136] truncdist_1.0-2 stringi_1.8.4 ggalluvial_0.12.5 ggraph_2.2.1 zlibbioc_1.46.0
[141] MASS_7.3-60 plyr_1.8.9 flexmix_2.3-19 ggseqlogo_0.2 parallel_4.3.0
[146] listenv_0.9.1 ggrepel_0.9.5 bigmemory.sri_0.1.8 deldir_2.0-4 graphlayouts_1.1.1
[151] splines_4.3.0 hash_2.2.6.3 tensor_1.5 hms_1.1.3 circlize_0.4.16
[156] igraph_2.0.3 ggpubr_0.6.0 uuid_1.2-0 cubature_2.1.0 spatstat.geom_3.2-9
[161] ggsignif_0.6.4 rngtools_1.5.2 reshape2_1.4.4 BiocManager_1.30.23 tzdb_0.4.0
[166] foreach_1.5.2 tweenr_2.0.3 httpuv_1.6.15 MatrixModels_0.5-3 RANN_2.6.1
[171] polyclip_1.10-6 future_1.33.2 clue_0.3-65 scattermore_1.2 gridBase_0.4-7
[176] ggforce_0.4.2 broom_1.0.5 xtable_1.8-4 RSpectra_0.16-1 rstatix_0.7.2
[181] later_1.3.2 viridisLite_0.4.2 class_7.3-22 gsl_2.1-8 memoise_2.0.1
[186] registry_0.5-1 cluster_2.1.6 timechange_0.3.0 globals_0.16.3
TIL_4095_101.csv.csv TIL_4095_201.csv.csv TIL_4095_301.csv.csv TIL_4095_401.csv.csv
@yueli8
Thank you so much for the excellent run down. I was able to reproduce the issue and fix it - I have slightly modified the code below - you can just your lapply(filelist, read.delim)
and then use loadContigs()
.
#four samples# Step1: Load MIXCR output -----------------------------------------
TIL_4095_1 <- "TIL_4095_101.csv"
TIL_4095_2 <- "TIL_4095_201.csv"
TIL_4095_3 <- "TIL_4095_301.csv"
TIL_4095_4 <- "TIL_4095_401.csv"
filelist <- c(TIL_4095_1,TIL_4095_2,TIL_4095_3,TIL_4095_4)
samples <- c("S1","S2","S3","S4")
contig_list <- lapply(filelist, read.delim)
contig.list <- loadContigs(contig_list, format = "MiXCR")
#combineTCR
combined.TCR <- combineTCR(contig.list,
samples = samples,
removeNA = FALSE,
removeMulti = FALSE,
filterMulti = FALSE)
positionalEntropy(combined.TCR,
chain = "TRB",
aa.length = 20)
#positionalProperty
positionalProperty(combined.TCR,
chain = "TRB",
aa.length = 20,
method = "Atchley") +
scale_color_manual(values = hcl.colors(5, "inferno"))
The issue was that positionalProperty()
expected all amino acids to be present. Across your 4 samples, there are several where not all 20 amino acids are present, so the error occured.
I have commited the fix to the dev branch of the repo and I will test it out and pull it into the master branch in the coming weeks.
Thanks again for following up, Nick
Hello Nike,
Thank you so much for your detail explanation!
Unfortunately, it does not work on my system.
Thank you again and really appreciated!
Best,
Yue
rm(list=ls())
library(immunarch)
library(scRepertoire)
library(tidyverse)
library(SingleCellExperiment)
library(Seurat)
library(powerTCR)
TIL_4095_1 <- "TIL_4095_101.csv"
TIL_4095_2 <- "TIL_4095_201.csv"
TIL_4095_3 <- "TIL_4095_301.csv"
TIL_4095_4 <- "TIL_4095_401.csv"
filelist <- c(TIL_4095_1,TIL_4095_2,TIL_4095_3,TIL_4095_4)
samples <- c("S1","S2","S3","S4")
contig_list <- lapply(filelist, read.delim)
contig.list <- loadContigs(contig_list, format = "MiXCR")
#combineTCR
combined.TCR <- combineTCR(contig.list,
samples = samples,
removeNA = FALSE,
removeMulti = FALSE,
filterMulti = FALSE)
positionalEntropy(combined.TCR,
chain = "TRB",
aa.length = 20)
Warning messages:
1: In `[<-.factor`(`*tmp*`, thisvar, value = 0) :
invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, thisvar, value = 0) :
invalid factor level, NA generated
3: In `[<-.factor`(`*tmp*`, thisvar, value = 0) :
invalid factor level, NA generated
4: In `[<-.factor`(`*tmp*`, thisvar, value = 0) :
invalid factor level, NA generated
#positionalProperty
positionalProperty(combined.TCR,
chain = "TRB",
aa.length = 20,
method = "Atchley") +
scale_color_manual(values = hcl.colors(5, "inferno"))
Error in propertyFunc[char, ] : subscript out of bounds
In addition: Warning messages:
1: In `[<-.factor`(`*tmp*`, thisvar, value = 0) :
invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, thisvar, value = 0) :
invalid factor level, NA generated
3: In `[<-.factor`(`*tmp*`, thisvar, value = 0) :
invalid factor level, NA generated
4: In `[<-.factor`(`*tmp*`, thisvar, value = 0) :
invalid factor level, NA generated
> sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=zh_CN.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=zh_CN.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=zh_CN.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C
time zone: Asia/Shanghai
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] powerTCR_1.20.0 SeuratObject_4.1.4 Seurat_4.4.0 SingleCellExperiment_1.22.0
[5] SummarizedExperiment_1.30.2 Biobase_2.60.0 GenomicRanges_1.52.1 GenomeInfoDb_1.36.4
[9] IRanges_2.36.0 S4Vectors_0.40.2 BiocGenerics_0.48.1 MatrixGenerics_1.12.3
[13] matrixStats_1.3.0 lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1
[17] purrr_1.0.2 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
[21] tidyverse_2.0.0 scRepertoire_2.0.3 immunarch_0.9.1 patchwork_1.2.0
[25] data.table_1.15.4 dtplyr_1.3.1 dplyr_1.1.4 ggplot2_3.5.1
loaded via a namespace (and not attached):
[1] spatstat.sparse_3.0-3 bitops_1.0-7 httr_1.4.7 RColorBrewer_1.1-3 doParallel_1.0.17
[6] prabclus_2.3-3 tools_4.3.0 sctransform_0.4.1 backports_1.4.1 vegan_2.6-4
[11] utf8_1.2.4 R6_2.5.1 mgcv_1.9-1 lazyeval_0.2.2 uwot_0.2.2
[16] permute_0.9-7 withr_3.0.0 sp_2.1-4 gridExtra_2.3 progressr_0.14.0
[21] quantreg_5.97 cli_3.6.2 factoextra_1.0.7 spatstat.explore_3.2-7 iNEXT_3.0.1
[26] labeling_0.4.3 diptest_0.77-1 robustbase_0.99-2 spatstat.data_3.0-4 ggridges_0.5.6
[31] pbapply_1.7-2 stringdist_0.9.12 parallelly_1.37.1 readxl_1.4.3 VGAM_1.1-10
[36] rstudioapi_0.16.0 generics_0.1.3 shape_1.4.6.1 ica_1.0-3 spatstat.random_3.2-3
[41] car_3.1-2 Matrix_1.6-1.1 fansi_1.0.6 abind_1.4-5 lifecycle_1.0.4
[46] carData_3.0-5 Rtsne_0.17 grid_4.3.0 promises_1.3.0 crayon_1.5.2
[51] miniUI_0.1.1.1 lattice_0.22-6 cowplot_1.1.3 pillar_1.9.0 rjson_0.2.21
[56] fpc_2.2-12 future.apply_1.11.2 codetools_0.2-20 fastmatch_1.1-4 leiden_0.4.3.1
[61] glue_1.7.0 vctrs_0.6.5 png_0.1-8 cellranger_1.1.0 gtable_0.3.5
[66] kernlab_0.9-32 cachem_1.0.8 S4Arrays_1.2.1 mime_0.12 tidygraph_1.3.1
[71] survival_3.6-4 pheatmap_1.0.12 iterators_1.0.14 shinythemes_1.2.0 fitdistrplus_1.1-11
[76] ROCR_1.0-11 nlme_3.1-164 RcppAnnoy_0.0.22 evd_2.3-7 UpSetR_1.4.0
[81] irlba_2.3.5.1 KernSmooth_2.23-22 colorspace_2.1-0 nnet_7.3-19 phangorn_2.11.1
[86] tidyselect_1.2.1 compiler_4.3.0 SparseM_1.81 ggdendro_0.2.0 DelayedArray_0.26.7
[91] plotly_4.10.4 scales_1.3.0 DEoptimR_1.1-3 lmtest_0.9-40 quadprog_1.5-8
[96] digest_0.6.35 goftest_1.2-3 spatstat.utils_3.0-4 XVector_0.40.0 htmltools_0.5.8.1
[101] pkgconfig_2.0.3 fastmap_1.1.1 rlang_1.1.3 GlobalOptions_0.1.2 htmlwidgets_1.6.4
[106] shiny_1.8.1.1 farver_2.1.1 zoo_1.8-12 jsonlite_1.8.8 mclust_6.1.1
[111] RCurl_1.98-1.14 rlist_0.4.6.2 magrittr_2.0.3 modeltools_0.2-23 GenomeInfoDbData_1.2.10
[116] munsell_0.5.1 Rcpp_1.0.12 evmix_2.12 ape_5.8 viridis_0.6.5
[121] reticulate_1.36.1 truncdist_1.0-2 stringi_1.8.4 ggalluvial_0.12.5 ggraph_2.2.1
[126] zlibbioc_1.46.0 MASS_7.3-60 plyr_1.8.9 flexmix_2.3-19 ggseqlogo_0.2
[131] parallel_4.3.0 listenv_0.9.1 ggrepel_0.9.5 deldir_2.0-4 graphlayouts_1.1.1
[136] splines_4.3.0 hash_2.2.6.3 tensor_1.5 hms_1.1.3 circlize_0.4.16
[141] igraph_2.0.3 ggpubr_0.6.0 uuid_1.2-0 spatstat.geom_3.2-9 cubature_2.1.0
[146] ggsignif_0.6.4 reshape2_1.4.4 tzdb_0.4.0 foreach_1.5.2 tweenr_2.0.3
[151] httpuv_1.6.15 MatrixModels_0.5-3 RANN_2.6.1 polyclip_1.10-6 future_1.33.2
[156] scattermore_1.2 ggforce_0.4.2 broom_1.0.5 xtable_1.8-4 rstatix_0.7.2
[161] later_1.3.2 viridisLite_0.4.2 class_7.3-22 gsl_2.1-8 memoise_2.0.1
[166] cluster_2.1.6 timechange_0.3.0 globals_0.16.3
Hello ncborcherding,
I worked with
scRepertoire
on version2.0.3
. It works well onpositionalEntropy
, but not onpositionalProperty
.Thank you in advance for great help!
Best,
Yue