Closed MaryGoAround closed 2 years ago
Hey MaryGoAround,
Thanks for reaching out.
Did highlightClonotypes()
make a column in your metadata with the selected clone? Also what version of scRepertoire are you using?
In terms of your error - that is an issue from Seurat itself. There are several similar issues reported on their github repo (see here, which might help you diagnose and solve the issue.
Nick
Thank you I get this
> highlightClonotypes(seurat)
An object of class Seurat
19381 features across 10861 samples within 1 assay
Active assay: RNA (19381 features, 2000 variable features)
2 dimensional reductions calculated: pca, umap
Warning messages:
1: In if (x %in% c("CTnt", "CTgene", "CTaa", "CTstrict")) { :
the condition has length > 1 and only the first element will be used
2: In if (x == "gene" | x == "genes") { :
the condition has length > 1 and only the first element will be used
What does the following look like:
head(seurat@meta.data)
scRepertoire_1.5.3
> head(seurat@meta.data)
sample orig.ident nCount_RNA nFeature_RNA percent.mt RNA_snn_res.0.5 seurat_clusters RNA_snn_res.0.4 barcode
AAACCTGCAACAACCT-1 p1 pbmc3k 1113 575 3.593890 4 4 4 <NA>
AAACCTGCAAGCCGTC-1 p1 pbmc3k 7510 2433 6.631158 1 1 1 <NA>
AAACCTGCAGGACGTA-1 p1 pbmc3k 2461 876 5.891914 0 0 0 AAACCTGCAGGACGTA-1
AAACCTGGTACGACCC-1 p1 pbmc3k 2284 968 4.290718 0 0 0 AAACCTGGTACGACCC-1
AAACCTGGTAGAGTGC-1 p1 pbmc3k 2962 1091 4.017556 2 2 2 AAACCTGGTAGAGTGC-1
AAACCTGGTATGCTTG-1 p1 pbmc3k 2853 1094 2.804066 0 0 0 AAACCTGGTATGCTTG-1
CTgene
AAACCTGCAACAACCT-1 <NA>
AAACCTGCAAGCCGTC-1 <NA>
AAACCTGCAGGACGTA-1 TRAV13-1.TRAJ48.TRAC_TRBV5-1.NA.TRBJ1-5.TRBC1
AAACCTGGTACGACCC-1 TRAV35.TRAJ58.TRAC_TRBV5-5.NA.TRBJ2-7.TRBC2
AAACCTGGTAGAGTGC-1 TRAV30.TRAJ45.TRAC_TRBV5-1.NA.TRBJ2-1.TRBC2
AAACCTGGTATGCTTG-1 TRAV12-1.TRAJ43.TRAC;TRAV8-4.TRAJ10.TRAC_TRBV4-2.NA.TRBJ2-3.TRBC2
CTnt
AAACCTGCAACAACCT-1 <NA>
AAACCTGCAAGCCGTC-1 <NA>
AAACCTGCAGGACGTA-1 TGTGCAGCTCCTTACGTGGGGTTTGGAAATGAGAAATTAACCTTT_TGCGCCAGCAGCTTAGCCGGTATCAATCAGCCCCAGCATTTT
AAACCTGGTACGACCC-1 TGTGCTGGGCAGCCTTCAGGAGCCAGTGGCTCTAGGTTGACCTTT_TGTGCCAGCAGCTTAACAGGGAAGTCCTACGAGCAGTACTTC
AAACCTGGTAGAGTGC-1 TGCGGCACAGCTCCAGGAGGAGGTGCTGACGGACTCACCTTT_TGCGCCAGCAGCGGTGGGGGTTCCTACAATGAGCAGTTCTTC
AAACCTGGTATGCTTG-1 TGTGTGGTGATGCCCTACAATAACAATGACATGCGCTTT;TGTGCTGTGAGTACATTCACGGGAGGAGGAAACAAACTCACCTTT_TGTGCCAGCAGCGTAGGGCGGGGGGACACAGATACGCAGTATTTT
CTaa
AAACCTGCAACAACCT-1 <NA>
AAACCTGCAAGCCGTC-1 <NA>
AAACCTGCAGGACGTA-1 CAAPYVGFGNEKLTF_CASSLAGINQPQHF
AAACCTGGTACGACCC-1 CAGQPSGASGSRLTF_CASSLTGKSYEQYF
AAACCTGGTAGAGTGC-1 CGTAPGGGADGLTF_CASSGGGSYNEQFF
AAACCTGGTATGCTTG-1 CVVMPYNNNDMRF;CAVSTFTGGGNKLTF_CASSVGRGDTDTQYF
CTstrict
AAACCTGCAACAACCT-1 <NA>
AAACCTGCAAGCCGTC-1 <NA>
AAACCTGCAGGACGTA-1 TRAV13-1.TRAJ48.TRAC_TGTGCAGCTCCTTACGTGGGGTTTGGAAATGAGAAATTAACCTTT_TRBV5-1.NA.TRBJ1-5.TRBC1_TGCGCCAGCAGCTTAGCCGGTATCAATCAGCCCCAGCATTTT
AAACCTGGTACGACCC-1 TRAV35.TRAJ58.TRAC_TGTGCTGGGCAGCCTTCAGGAGCCAGTGGCTCTAGGTTGACCTTT_TRBV5-5.NA.TRBJ2-7.TRBC2_TGTGCCAGCAGCTTAACAGGGAAGTCCTACGAGCAGTACTTC
AAACCTGGTAGAGTGC-1 TRAV30.TRAJ45.TRAC_TGCGGCACAGCTCCAGGAGGAGGTGCTGACGGACTCACCTTT_TRBV5-1.NA.TRBJ2-1.TRBC2_TGCGCCAGCAGCGGTGGGGGTTCCTACAATGAGCAGTTCTTC
AAACCTGGTATGCTTG-1 TRAV12-1.TRAJ43.TRAC;TRAV8-4.TRAJ10.TRAC_TGTGTGGTGATGCCCTACAATAACAATGACATGCGCTTT;TGTGCTGTGAGTACATTCACGGGAGGAGGAAACAAACTCACCTTT_TRBV4-2.NA.TRBJ2-3.TRBC2_TGTGCCAGCAGCGTAGGGCGGGGGGACACAGATACGCAGTATTTT
Frequency cloneType highlight
AAACCTGCAACAACCT-1 NA <NA> NA
AAACCTGCAAGCCGTC-1 NA <NA> NA
AAACCTGCAGGACGTA-1 0.0004608295 <NA> NA
AAACCTGGTACGACCC-1 0.0004608295 <NA> NA
AAACCTGGTAGAGTGC-1 0.0004608295 <NA> NA
AAACCTGGTATGCTTG-1 0.0004608295 <NA> NA
>
For instance I tried this clone and I got this plot
seurat <- highlightClonotypes(seurat,
cloneCall= "aa",
sequence = c("CAAPYVGFGNEKLTF_CASSLAGINQPQHF"))
DimPlot(seurat, group.by = "highlight") +
theme(plot.title = element_blank())
Hey MaryGoAround,
It looks like highlightClonotypes()
is adding the column to your seurat object. And in your test case with the code:
seurat <- highlightClonotypes(seurat,
cloneCall= "aa",
sequence = c("CAAPYVGFGNEKLTF_CASSLAGINQPQHF"))
DimPlot(seurat, group.by = "highlight") +
theme(plot.title = element_blank())
You are generating a DimPlot effectively - how many copies of the "CAAPYVGFGNEKLTF_CASSLAGINQPQHF" do you have in your data?
Nick
how many copies of the "CAAPYVGFGNEKLTF_CASSLAGINQPQHF" do you have in your data?
I do not know this
How I could discover this please?
My main goal is identifying main clones in my data
Apologies should have included some code to help out - I think this should work
length(which(seurat@meta.data$CTaa == "CAAPYVGFGNEKLTF_CASSLAGINQPQHF"))
Thank you so much
> length(which(seurat@meta.data$CTaa == "CAAPYVGFGNEKLTF_CASSLAGINQPQHF"))
[1] 1
>
Sorry, is there any piece of code to print a list of clones with the highest copy number?
Hey MaryGoAround,
Ok that makes sense - the DimPlot() is working but we are essentially looking for 1 red dot (which may be obscured by the grey dots)
The easiest way to look at relative frequency is to use the meta data of your Seurat object and look at the column "Frequency".
meta <- seurat@meta.data
Then you can just view the meta object in Rstudio
Thank you so much
So I should look at columns 11 and 13 , right? CTaa
and Frequency
?
> meta <- seurat@meta.data
> View(meta)
> colnames(meta)
[1] "sample" "orig.ident" "nCount_RNA" "nFeature_RNA" "percent.mt" "RNA_snn_res.0.5" "seurat_clusters"
[8] "RNA_snn_res.0.4" "barcode" "CTgene" "CTnt" "CTaa" "CTstrict" "Frequency"
[15] "cloneType" "highlight"
>
I have tried this for the frequency of 0.5 and I got two dots
But both belong to B cells highlighted in CD4 T and Monocyte clusters :(
seurat <- highlightClonotypes(seurat,
cloneCall= "aa",
sequence = c("CARSVDKTVGTGPSYYGMAVW_CQQLDTYPWTF","CARVSHFSAFDPW_CQSYDSSLSGPYVF"))
DimPlot(seurat, group.by = "highlight",pt.size = 1) +
theme(plot.title = element_blank())
Hey MaryGoAround,
Yes columns 11 and 13.
Um based on your results and the frequency you are quoting - I am thinking there is an issue in the number of TCRs in your seurat object. Default coimbineExpression()
will add frequency based on the proportion of the sample, so if you are getting 0.5 and plotting 2 TCRs, do you only have 2 TCRs in your seurat object? I am able to see a single, red dot in the most recent picture down in the lower right cluster.
Thank you
I just sorted frequency column and copied pasted CTaa with the frequency of 0.5 :(
I got worry about my TCR number
I have attached 10x report here
In the report I see frequencies of 0.71 :(
I do not know what is going on here
From 10x report when I tried two top clones by
seurat <- highlightClonotypes(seurat,
cloneCall= "aa",
sequence = c("CALAQDNYGQNFVF_CASRDYNEQFF","CAMREGGGGYSTLTF_CASSTSGQGDYEQYF"))
DimPlot(seurat, group.by = "highlight",pt.size = 1) +
theme(plot.title = element_blank())
I got this plot
The frequency of these two is almost 0.005069124
So I am not sure how 10x selected the tops (which criteria) and how your software does
Hey MaryGoAround,
So it does look like you are getting TCR attached to your seurat object. 0.005069 sounds like a reasonable value unless you are expecting a clonal or malignant process.
Proportion (stored under Frequency in the meta data of the seurat object) for scRepertoire is calculated using the number of each clone divided by the total size of the repertoire. The repertoire itself is defined by the group.by parameter in combineExpression()
, as a default, each sequencing run is used.
I am going to close this issue as we have gotten a little far from the original technical issue. Please feel free to keep posting or email me if you have further questions.
Thank you so much for bearing with me
I have attached cellranger report of ten top clones
For instance, the report says that there are 23 clones of CALAQDNYGQNFVF_CASRDYNEQFF
By your software I get this figure
compareClonotypes(combined, numbers = 10, samples = c("sample1", "sample2"),
cloneCall="aa", graph = "alluvial")
But by this plot CALAQDNYGQNFVF_CASRDYNEQFF
does not seem to be the most top clone anymore
Please can you give me a hand how I can have a list of truely top clones?
Thank you once more
The clone numbers after combining to a seurat object are going to be heavily dependent on the cells that passed the quality control steps for the gene expression. It is common to have a 5-15% difference between the TCR barcodes and Seurat object depending on your filtering criteria and recovery from the sequencing runs itself.
Sorry, I am right that scRepertoire can not visualize TRUST4 data comes from 10x BCR kit?
> S1 <- read.delim("BCR1.tsv")
> S2 <- read.delim("BCR2.tsv")
> contig_list <- list(S1, S2)
> combined <- combineTRUST4(contig_list,samples = c("bcr1","bcr2"),removeNA=TRUE)
> quantContig(combined, cloneCall="gene+nt", scale = TRUE)
bcr=read.delim("BCR1.tsv")
> head(bcr[1:2,1:4])
X.barcode cell_type chain1
1 ATGAGCCACACGATGT B *
2 CTACTTTGTTCAACCA B *
chain2
1 IGLV1-40*01,*,IGLJ3*02,*,TGCCAGTCCTTTGACAACAGCCTGAGTGGTCCGGTGTTT,CQSFDNSLSGPVF,1.00,ATGAGCCACACGATGT_28117,94.29,0
2 IGKV3-11*01,*,IGKJ3*01,*,TGTCAGCAGCGTAGCAACTGGTTCACTTTC,CQQRSNWFTF,1.97,CTACTTTGTTCAACCA_53325,100.00,0
>
Thanks for any idea
Hey MaryGoRound,
I think the first step troubleshooting is making sure things got loaded properly (thanks for the head()) call. It looks like your TRUST4 data is comma separated - try:
bcr=read.csv("BCR1.tsv")
Nick
Thanks a lot for answering me
Please have a look at what I have done so far
I am using tsv file
> bcr=read.delim("BCR1.tsv")
> S1 <- read.delim("BCR1.tsv")
> S2 <- read.delim("BCR2.tsv")
> contig_list <- list(S1, S2)
> combined <- combineTRUST4(contig_list,samples = c("bcr1","bcr2"),removeNA=TRUE)
Warning messages:
1: In if (cells == "T-AB") { :
the condition has length > 1 and only the first element will be used
2: In if (cells == "T-AB") { :
the condition has length > 1 and only the first element will be used
3: In if (cells == "B") { :
the condition has length > 1 and only the first element will be used
4: In if (cells == "T-AB") { :
the condition has length > 1 and only the first element will be used
5: In if (cells != "B") { :
the condition has length > 1 and only the first element will be used
6: In if (cells != "B") { :
the condition has length > 1 and only the first element will be used
7: In df[[i]]$cell_type == cells :
longer object length is not a multiple of shorter object length
8: In df[[i]]$cell_type == cells :
longer object length is not a multiple of shorter object length
> quantContig(combined, cloneCall="gene+nt", scale = TRUE)
>
Which gave me an empty plot again
Hey MaryGoAround,
I think you are loading the data incorrectly - instead of S1 <- read.delim("BCR1.tsv")
try S1 <- read.csv("BCR1.tsv")
Nick
Thank you very much
"BCR1.tsv" is the exact output of TRUST4
I tried with read.csv
but gives error
> S1 <- read.csv("BCR1.tsv")
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
more columns than column names
> S2 <- read.csv("BCR2.tsv")
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
more columns than column names
Hi
I want to highlight important clonotypes, but I do not know how to find the most important clonotype to highlight
I tried one randomely
I get this error
Error: Must request at least one colour from a hue palette.
How I can have a list of the most important clones to plot?