Closed beginner984 closed 2 years ago
Hey Beginner984,
Thanks for reaching out - would you mind giving me the version of scRepertoire you are using?
Also what is the structure of your contig_list? How many samples, etc?
I ask because your quantContig(combined, cloneCall="gene+nt", scale = T)
should not have NA samples unless there might be 3 contigs?
What is the output of:
lapply(contig_list, dim)
lapply(combined, dim)
In terms of the issue you are posting:
> combined <- combineTCR(contig_list,
+ samples = c("p1", "p2"),removeNA=TRUE)
> quantContig(combined, cloneCall="gene+nt", scale = T)
Error in df[[i]] : subscript out of bounds
That is an error in either reading the combinedTCR() in the quantContig()
function or from having an issue with removing all the data during filtering of NAs (this shouldn't happen as there should be a warning message)
Thank you so much for replying me
I have 5' GEX + V(J)D for two healthy PBMCs
I have attached my TCR and BCR for one sample here
> lapply(contig_list, dim)
[[1]]
[1] 5318 31
[[2]]
[1] 5773 31
> lapply(combined, dim)
$p1
[1] 2231 13
$p2
[1] 2306 13
>
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.2.3
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] sp_1.4-7 SeuratObject_4.1.0 Seurat_4.1.0 scRepertoire_1.5.3
[5] ggplot2_3.3.6
Thanks for any help
Hey Beginner984,
Everything looks correct there - I tested your code with the t.csv sample and functioned correctly with v 1.5.3.
What was your code to load and form the contig_list?
Thanks for bearing with me.
Nick
I should be thankful
S1 <- read.csv("/Users/angel/Downloads/p1/t.csv")
S2 <- read.csv("/Users/angel/Downloads/p2/t.csv")
contig_list <- list(S1, S2)
combined <- combineTCR(contig_list,samples = c("p1", "p2"),removeNA=TRUE)
Sorry I have made a Seurat object from GEX but I get error in combining that with TCR
> unique(pbmc@meta.data$sample)
[1] "PN0340_0001_control" "PN0340_0002_control"
> seurat <- combineExpression(combined, pbmc,
+ cloneCall="gene", group.by = "sample", proportion = FALSE,
+ cloneTypes=c(Single=1, Small=5, Medium=20, Large=100, Hyperexpanded=500))
Error in data.frame(sc[[]], slot(sc, "active.ident")) :
arguments imply differing number of rows: 11977, 10825
>
> pbmc
An object of class Seurat
19381 features across 10825 samples within 1 assay
Active assay: RNA (19381 features, 2000 variable features)
2 dimensional reductions calculated: pca, umap
>
Thank you any idea
Hey Beginner984,
S1 <- read.csv("/Users/angel/Downloads/p1/t.csv") S2 <- read.csv("/Users/angel/Downloads/p2/t.csv") contig_list <- list(S1, S2) combined <- combineTCR(contig_list,samples = c("p1", "p2"),removeNA=TRUE)
Would you mind emailing me you p2 t.csv sample - I would like to match exactly what you are doing in order to debug.
Error in data.frame(sc[[]], slot(sc, "active.ident")) : arguments imply differing number of rows: 11977, 10825
Sorry this seems like a very strange error message stemming from pulling the meta data from the single-cell object. What it is saying is the number of rows (or number of cells) in seurat@meta.data
does not match seurat@active.ident
for some reason - are there NAs in the active Identity that you are currently using?
Nick
Sorry I removed NAs in Seurat metadata but this time I get this error
> combineExpression(combined, pbmc,
+ cloneCall="gene", group.by = "sample", proportion = FALSE,
+ cloneTypes=c(Single=1, Small=5, Medium=20, Large=100, Hyperexpanded=500))
Error in `$<-.data.frame`(`*tmp*`, "cloneType", value = NA) :
replacement has 1 row, data has 0
>
This is the link to my full R image including Seurat object and S1, S2, combined
https://www.dropbox.com/s/qv9b1djonv9fzs2/pbmc.RData?dl=0
Thanks for any help
Hey Beginner984,
Thanks for providing everything!! I really appreciate that.
I am not sure what is going on when it comes to the in quantContig issue (specifically how you got a NA column), here is what I get with your code:
S1 <- read.csv("p1.csv")
S2 <- read.csv("p2.csv")
contig.list <- list(S1, S2)
combined <- combineTCR(contig.list,samples = c("p1", "p2"),removeNA=TRUE)
quantContig(combined, cloneCall="gene+nt", scale = TRUE)
One thing I have noticed with users is sometimes when using contig_list, there might be issues with overwriting your data with the built-in scRepertoire data. I would suggest probably avoiding storing your data as contig_list - here is what happens if I take your code and accidentally use the built-in data:
combined <- combineTCR(contig_list,samples = c("p1", "p2"),removeNA=TRUE)
quantContig(combined, cloneCall="gene+nt", scale = TRUE)
Error in df[[i]] : subscript out of bounds
In terms of the error during the combining of the Seurat and TCR data:
combineExpression(combined, pbmc,
cloneCall="gene", group.by = "sample", proportion = FALSE,
cloneTypes=c(Single=1, Small=5, Medium=20, Large=100, Hyperexpanded=500))
Error in
$<-.data.frame
(*tmp*
, "cloneType", value = NA) : replacement has 1 row, data has 0
This is due to issues with matching the cell barcodes between the Seurat Object and the combined file.
contig.list <- list(S1, S2)
combined <- combineTCR(contig.list,samples = c("p1", "p2"),removeNA=TRUE)
combined$p1$barcode[1:10] #This is looking at the barcodes in the first combinedTCR sample
[1] "p1_AAACCTGCAGGACGTA-1" "p1_AAACCTGGTACGACCC-1" "p1_AAACCTGGTAGAGTGC-1" "p1_AAACCTGGTATGCTTG-1" "p1_AAACCTGGTCTAGCGC-1" [6] "p1_AAACCTGGTGGTACAG-1" "p1_AAACCTGTCCTAAGTG-1" "p1_AAACCTGTCCTACAGA-1" "p1_AAACCTGTCTGTTTGT-1" "p1_AAACGGGAGTTGTAGA-1"
rownames(pbmc[[]])[1:10] #Barcodes in the Seurat Object
"AAACCTGCAACAACCT-1" "AAACCTGCAAGCCGTC-1" "AAACCTGCAGGACGTA-1" "AAACCTGGTACGACCC-1" "AAACCTGGTAGAGTGC-1" [6] "AAACCTGGTATGCTTG-1" "AAACCTGGTCTAGCGC-1" "AAACCTGGTGGTACAG-1" "AAACCTGTCAAAGTAG-1" "AAACCTGTCCTAAGTG-1"
Thank you so much
When I matched the barcodes in Seurat and combined I get this error
> DimPlot(pbmc, reduction = "umap")
Error: Must request at least one colour from a hue palette.
>
Hey Beginner984,
Glad you got the barcode match figured out.
In terms of your error - that is an issue from Seurat itself. There are several similar issues reported on their github repo (see here, which might help you diagnose and solve the issue.
Hope that helps and thanks again for being so forthcoming with the data sharing, it really quickens the troubleshooting and I appreciate it!
Nick
Thank you so much, I had two healthy PBMC using 10x (5' + TCR +BCR)
I have analyzed my data like this
Looking at these plots are you seeing any abnormality in my data please?
My main concern is if I have a good TCR results but by these plots I am nor sure how I could know if my data is good or bad
Thanks for any point
Hey the data itself looks good - there are always going to be TCRs recovered in non-Tcell clusters (I've noticed it, particularly in mono/macrophage cells). Your data looks well integrated. My one piece of advice would be to check the small cell clusters that are somewhat independent for doublets and mitochondria reads -
I've noticed in my work these small clusters have a tendency to contain artifact. That being said, they may also be rare cell types.
Nick
Hello @ncborcherding
I have analysed my TCR + 5' GEX with your software
I got a few questions please
In the below plot, I have two samples s1 and s2 but what does the unique clone in Y axis means? For instance TRBV18 is common between two samples?
Hey beginner984,
I am unfamiliar with what you are plotting? How did you get the graph from scRepertoire?
Nick
Sorry I had attached a wrong image
In Y axis I could not figure out what unique clones means?
Please can you help me with that?
Thank you so much in advance
Hey beginner984,
The y-axis in this instance is:
(number of unique clonotypes / total number of cells with TCR data)*100
So in essence it is the percent of unique clonotypes across the whole sample repertoire.
Nick
Thank you much
In this figure
Thank you so much for answering me
I see for a given TRB some times we have more than one TRA (I do not know if biologically this makes sense ) like
with
and
How I know which of these two TRA is an actual pair? and which one is an artefact?
The presence of 2 alpha chains in the field of TCR is controversial, but in a given pathological state/model there seems to range from 5-20% dual-TCRA. Some of these likely code for functional dual alphas (this is a pretty good review on the topic). Having looked at a lot of 10x data, there are also up to ~5% dual TCRB - this does not make sense to me due to allelic exclusion and I am not sure if anyone has convincingly shown dual TCRB.
If you don't like the dual TCRA, you can filter for the highest expressing chain using:
combineTCR(..., filterMulti = TRUE)
Thank you very much
How one can know if the CDR3 sequences in the above plot are novel or known?
That is going to vary - the best resource is probably is the AIRR-Seq group which has a giant repository of sequences. Another great resource is VDJdb - however there is no centralized database so determining novelty is difficult.
Thanks a lot to be this helpful
Might be out of questioning regarding your software, but I need your thoughts please
With a Library Preparation Kit : 10X Genomics Next GEM Single Cell 5' v2 (Dual Index) targeting 8000 cells per PBMC sample and when in total for GEX:TCR:BCR, we would need 300M read pairs per sample.
Please can you give me a clue how likely we can capture a T cell clone? Rare or expanded
Thanks in advance for any guidance
Hey beginner984,
This is going to vary by disease process or model, with clonal stimulus being the key. I am in the tumor field and that estimate sounds correct for my interests - but I am not focused on rare clones. I've had discussions with people in the neuro autoimmune field and they feel that they need to target 1e5-1e6 T cells to detect possibly pathogenic clones (via the chromium X system). You might check the literature in the field and see what the level of targeting has been.
Thank you very much
That was extremely helpful
I am in cancer sciences too, comparing people who remain cancer free for years after treatment and people who can live with cancer for long time
Hi
I have PBMC 5' GEX
from 10x on which I run TRUST4
I have attached my TRUST4 outputs here
I got error by running your software
Any help please?
> S1 <- read.csv("t_0001_GEX_S3_L001_R2_001_barcode_report.csv")
> S2 <- read.csv("t_0002_GEX_S3_L001_R2_001_barcode_report.csv")
> contig_list <- list(S1, S2)
> combined <- combineTCR(contig_list,samples = c("p1", "p2"),removeNA=TRUE)
Error in `mutate()`:
! Problem while computing `TCR1 = ifelse(...)`.
Caused by error in `ifelse()`:
! object 'chain' not found
Run `rlang::last_error()` to see where the error occurred.
Thanks for any help
You'll need to either use combineTRUST4()
to generate a list of clonotype by barcode or check out loadContigs()
in the dev version to make the TRUST4 output compatible with combineTCR()
Thanks a lot I got this error
> combined <- combineTRUST4(contig_list,samples = c("p1", "p2"),removeNA=TRUE)
Error in checkList(df) : could not find function "checkList"
>
What version of scRepertoire are you currently using? What is the output of sessionInfo()
?
Thanks a lot
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.2.3
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] scRepertoire_1.7.2 ggplot2_3.3.6
loaded via a namespace (and not attached):
[1] cubature_2.0.4.4 VGAM_1.1-7
[3] colorspace_2.0-3 ellipsis_0.3.2
[5] rprojroot_2.0.3 evd_2.3-6.1
[7] XVector_0.34.0 GenomicRanges_1.46.1
[9] fs_1.5.2 rstudioapi_0.13
[11] listenv_0.8.0 farver_2.1.1
[13] remotes_2.4.2 graphlayouts_0.8.0
[15] gsl_2.1-7.1 ggrepel_0.9.1
[17] fansi_1.0.3 codetools_0.2-18
[19] splines_4.1.2 doParallel_1.0.17
[21] cachem_1.0.6 polyclip_1.10-0
[23] pkgload_1.3.0 evmix_2.12
[25] cluster_2.1.3 rgeos_0.5-9
[27] ggforce_0.3.3 compiler_4.1.2
[29] assertthat_0.2.1 SeuratObject_4.1.0
[31] Matrix_1.4-1 fastmap_1.1.0
[33] cli_3.3.0 tweenr_1.0.2
[35] prettyunits_1.1.1 tools_4.1.2
[37] igraph_1.3.2 gtable_0.3.0
[39] glue_1.6.2 GenomeInfoDbData_1.2.7
[41] reshape2_1.4.4 dplyr_1.0.9
[43] Rcpp_1.0.8.3 Biobase_2.54.0
[45] vctrs_0.4.1 nlme_3.1-158
[47] progressr_0.10.1 iterators_1.0.14
[49] ggraph_2.0.6 ggalluvial_0.12.3
[51] stringr_1.4.0 globals_0.15.1
[53] ps_1.7.1 lifecycle_1.0.1
[55] devtools_2.4.3 stringdist_0.9.8
[57] future_1.26.1 zlibbioc_1.40.0
[59] MASS_7.3-57 scales_1.2.0
[61] tidygraph_1.2.1 MatrixGenerics_1.6.0
[63] parallel_4.1.2 SummarizedExperiment_1.24.0
[65] SparseM_1.81 curl_4.3.2
[67] SingleCellExperiment_1.16.0 memoise_2.0.1
[69] gridExtra_2.3 truncdist_1.0-2
[71] stringi_1.7.6 S4Vectors_0.32.4
[73] foreach_1.5.2 permute_0.9-7
[75] BiocGenerics_0.40.0 pkgbuild_1.3.1
[77] GenomeInfoDb_1.30.1 rlang_1.0.3
[79] pkgconfig_2.0.3 matrixStats_0.62.0
[81] bitops_1.0-7 lattice_0.20-45
[83] purrr_0.3.4 tidyselect_1.1.2
[85] processx_3.6.1 parallelly_1.32.0
[87] plyr_1.8.7 magrittr_2.0.3
[89] R6_2.5.1 IRanges_2.28.0
[91] generics_0.1.3 DelayedArray_0.20.0
[93] DBI_1.1.3 powerTCR_1.14.0
[95] pillar_1.7.0 withr_2.5.0
[97] mgcv_1.8-40 RCurl_1.98-1.7
[99] sp_1.5-0 tibble_3.1.7
[101] future.apply_1.9.0 crayon_1.5.1
[103] utf8_1.2.2 viridis_0.6.2
[105] usethis_2.1.6 grid_4.1.2
[107] callr_3.7.0 vegan_2.6-2
[109] digest_0.6.29 tidyr_1.2.0
[111] stats4_4.1.2 munsell_0.5.0
[113] viridisLite_0.4.0 sessioninfo_1.2.2
Hey beginner984,
Both the dev and master branch have the checkList function, so I am not sure how you are getting the error? Can you show me the code you are using to load the contigs (I am assuming with loadContigs()
because you are using combineTCR()
)
Thanks Nick
Hey! I wonder why is my clonotype slot empty after integration?
When I simply try to merge using barcode in my seurat object which are barcodes from the combined TCR, I get this error:
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 971, 7901
Hey @apal6,
Could you give me your:
1) sessionInfo() 2) What specific code you are getting the error call from. 3) Any additional information on the workflow to create the combineTCR() object?
Thanks, Nick
Hey Nick,
I was able to figure out my error. Thank you. However, when I am trying to plot the two conditions that I have by defining the group.by = "group" which have my two timepoints, I get this plot with NA. Not sure why my other time point is missing from the plot.
Code: slot(seurat, "meta.data")$cloneType <- factor(slot(seurat, "meta.data")$cloneType, levels = c("Hyperexpanded (100 < X <= 500)", "Large (20 < X <= 100)", "Medium (5 < X <= 20)", "Small (1 < X <= 5)", "Single (0 < X <= 1)", NA)) DimPlot(seurat, group.by = "group") + scale_color_manual(values = colorblind_vector(5), na.value="grey") + theme(plot.title = element_blank())
unique(seurat@meta.data$group) [1] "Day_16" "Infusion"
I do have several NA's in my object.
Thanks, Aastha
Hey Asstha,
Good to hear that you figured out the problem. Interesting problem with the NA values on DimPlot()
. Have you tried to re-organize the levels of the variable group? (like the cloneType code above?) Or added the group variable using another source? The meta data will amend matching columns when using AddMetaData()
or combineExpression()
What is the output of:
table(seurat@meta.data$group, useNA = "ifany")
I do not think this is a scRepertoire issue per say - but happy to help debug if you are interested. If you wouldn't mind emailing me your code or we can troubleshoot via email to keep the issue area cleared up.
Thanks, Nick
Sure, I will email.
Code: table(seurat@meta.data$group, useNA = "ifany")
output:
`Day_16` Infusion
1287 8574
Thanks, Aastha
Hey Nick,
I have emailed my script. I also just realized that my cloneType is entirely empty. Not sure why though
Sorry @ncborcherding how I can tackle this problem please
> seurat <- combineExpression(list.receptors,
+ seurat,
+ cloneCall="gene",
+ proportion = TRUE)
Warning message:
In combineExpression(list.receptors, seurat, cloneCall = "gene", :
< 1% of barcodes match: Ensure the barcodes in
the Seurat object match the
barcodes in the combined immune receptor list from
scRepertoire - most common issue is the addition of the
prefixes corresponding to `samples` and 'ID' in the combineTCR/BCR()
functions
>
TCR$HC1_C$barcode
[1] "HC1_C_AAACCTGCAGGACGTA-1" "HC1_C_AAACCTGGTACGACCC-1" "HC1_C_AAACCTGGTAGAGTGC-1" "HC1_C_AAACCTGGTATGCTTG-1"
> rownames(seurat@meta.data)
[1] "AAACCTGAGACAGGCT-1_1" "AAACCTGAGAGACTAT-1_1" "AAACCTGAGCCCAGCT-1_1" "AAACCTGAGTACGTTC-1_1" "AAACCTGCAAGTACCT-1_1"
Thanks for any help
Good rundown of the solution is #69
Hi Likely I have a lot of NAs in my TCR
quantContig(combined, cloneCall="gene+nt", scale = T)
When I try to remove my NAs I get this error
Thank you for any idea