MarioniLab / GammaDeltaTcells2018

Code for the gamma delta T cell project with Maike De La Roche, Hung-Chang Chen and Celia Martinez

Identification of emerging clone(s) and their origin #5

Closed chenhc720 closed 6 years ago

chenhc720 commented 6 years ago

We would like to identify any clone(s) emerging in each gamma delta T cell subset (Vg1, Vg4_Th1, Vg4_Th17 and Vg6) of old mice. We also need to trace where these emerging clones come from: do they arise (i) from the original Th1/Th17 pool by clonal expansion, or (ii) from the opposite Th1/Th17 pool by re-polarisation in response to changed environmental cues during aging?

Moreover, it is also important to figure out whether these emerging clones identified in old mice are (i) "private" to each individual or (ii) "shared/conserved" between different individuals. The latter would point to the possible existence of a universal antigen for aging and the activation of specific gamma delta T cell clones against it!

Importantly, the analysis should be performed for both the "TCRdelta" and "TCRgamma" repertoires of each cell subset.

The PlotFancySpectratype function of VDJtools allows plotting the proportion of the top 20 most expanded clones against the length of the CDR3 sequence (in nucleotides).

nilseling commented 6 years ago

I have now added a heatmap to the Figures folder that was generated as follows:

  1. I selected the top 10 enriched clones for each old library (using the TRG sequence first and separating V1 and V3 for the Vg1 samples)
  2. I merged these to a unique list
  3. I collected the clone fraction for each of these clones in each library (young and old)
  4. I plotted this in a heatmap
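
The four steps above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code used in the repository: the function names, the dictionary layout (library name mapped to clone fractions) and the toy clone labels are all assumptions made for the example. The plotting step is left out, since any heatmap library could take the resulting table.

```python
def top_clones(fractions, n_top=10):
    """Return the n_top clones with the largest fraction in one library."""
    ranked = sorted(fractions.items(), key=lambda item: item[1], reverse=True)
    return [clone for clone, _ in ranked[:n_top]]


def clone_fraction_table(libraries, old_names, n_top=10):
    """Build a clone x library table of clone fractions.

    libraries: dict mapping library name -> {clone sequence: clone fraction}
    old_names: names of the libraries from old animals (hypothetical labels)
    """
    # Steps 1 and 2: top clones of each old library, merged into a unique list
    selected = []
    for name in old_names:
        for clone in top_clones(libraries[name], n_top):
            if clone not in selected:
                selected.append(clone)

    # Step 3: fraction of each selected clone in every library (0.0 if absent)
    table = {
        clone: {name: fractions.get(clone, 0.0) for name, fractions in libraries.items()}
        for clone in selected
    }
    return selected, table


# Toy data; the clone labels stand in for CDR3 sequences
libraries = {
    "young_1": {"cloneA": 0.50, "cloneB": 0.30, "cloneC": 0.20},
    "old_1": {"cloneA": 0.10, "cloneD": 0.70, "cloneE": 0.20},
    "old_2": {"cloneD": 0.60, "cloneB": 0.40},
}
selected, table = clone_fraction_table(libraries, old_names=["old_1", "old_2"], n_top=2)
for clone in selected:
    print(clone, table[clone])
```

Step 4 would then draw `table` as a heatmap with clones as rows and libraries as columns.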

chenhc720 commented 6 years ago

Looks great! But would it be possible to make some modifications? The heatmap is currently too big to interpret, and some of the information we need is missing.

  1. Please change the plot to include only Vg4_Th1 and Vg4_Th17. It doesn't make much sense to me to look into whether Vg4 cells are derived from Vg1 cells. Can we do the same for Vg6 cells in a separate plot with all the sequences identified? The Vg6 repertoire is small, generally fewer than 10 clones. This would tell us whether any clone is emerging in the old mice.

  2. Please mark where the sequences are derived from, if possible (by colour codes maybe?). This information is crucial as it tells us whether a sequence is private or public among different young or old individuals.

  3. Please plot according to TCRD as well. The diversity of TCRD is much higher.

  4. Would it be possible to plot according to both nucleotide and amino acid sequences?

chenhc720 commented 6 years ago

Just a quick reminder from the discussion this morning. Please look into the possibility of:

  1. marking 0 differently (maybe with a /) from small values close to 0 on the heatmap.
  2. putting the a.a. sequence after the n.t. sequence.
  3. plotting Vg1, Vg4 and Vg6 separately.
  4. plotting without unsupervised clustering.
  5. finding a way to rank by the proportion of each clone.

And please also plot according to TCRD as the diversity is much much higher!

Cheers:)

nilseling commented 6 years ago

The new heatmaps are now in the Figures folder. What I changed is that I now consider the top 10 clones from old AND young animals (previously it was just from old animals). Also, for the TRG samples, I focused the analysis only on the specific chains. That's not possible for the TRD analysis, so I performed that analysis without considering which variable chain the clone comes from.

chenhc720 commented 6 years ago

Really brilliant work!! Just want to quickly confirm: is 0 painted in grey?

nilseling commented 6 years ago

0 is anything < 0.1% - hope that's ok
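
For illustration, the thresholding described here could look like the sketch below. It assumes clone fractions are stored as proportions between 0 and 1 (so 0.1% is 0.001) and that masked cells are drawn in a separate colour such as grey; both points, and the function name, are assumptions rather than details confirmed in the thread.

```python
ZERO_THRESHOLD = 0.001  # 0.1% expressed as a fraction (assumed storage format)


def mask_low_fractions(fractions, threshold=ZERO_THRESHOLD):
    """Replace every fraction below the threshold with None.

    A plotting routine can then draw None cells in a dedicated colour.
    """
    return [None if value < threshold else value for value in fractions]


row = [0.0, 0.0005, 0.001, 0.25]
print(mask_low_fractions(row))
```

With this rule, a clone that is truly absent (0.0) and a clone present at 0.05% end up in the same masked category, so the two cannot be told apart on the heatmap.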

chenhc720 commented 6 years ago

As discussed earlier, could you please rank/categorise the clones by how widely they are shared between different individuals?

Also, could you please look into the possibility of showing the proportion of shared clones between each pair of individuals, regardless of how big the clones are in each individual? Ideally we would overlay the whole repertoire between individuals rather than picking only the top 10 clones. For reference, see fig. 2d and e of this paper: https://www.nature.com/articles/ni.3686
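
One way to express the proportion of shared clones between two individuals, ignoring clone size, is the Jaccard index on the sets of clone sequences; a later comment in this thread refers to that index. The sketch below is only an illustration under that assumption: the function names, the mouse labels and the toy clone labels are hypothetical.

```python
from itertools import combinations


def jaccard(clones_a, clones_b):
    """Shared clones divided by all distinct clones seen in either repertoire."""
    a, b = set(clones_a), set(clones_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def pairwise_sharing(repertoires):
    """Compute the Jaccard index for every pair of individuals.

    repertoires: dict mapping individual -> iterable of clone sequences
    """
    return {
        (m1, m2): jaccard(repertoires[m1], repertoires[m2])
        for m1, m2 in combinations(sorted(repertoires), 2)
    }


# Toy repertoires; whole repertoires are compared, not only the top 10 clones
repertoires = {
    "mouse_1": {"cloneA", "cloneB", "cloneC"},
    "mouse_2": {"cloneB", "cloneC", "cloneD"},
    "mouse_3": {"cloneE"},
}
sharing = pairwise_sharing(repertoires)
for pair, value in sharing.items():
    print(pair, value)
```

The resulting pairwise values can be arranged as a symmetric matrix and drawn as a heatmap with individuals on both axes.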

nilseling commented 6 years ago

I added the new heatmaps to the Figures folder. I also generated heatmaps on the proportions of shared clones but didn't adjust the colour scale across all heatmaps yet.

chenhc720 commented 6 years ago

Sorry to keep asking for more and more analyses! The results look amazing and are starting to reveal some really interesting information about the repertoires.

We would now like to take the size of the shared clones into consideration as well. Instead of only plotting the number of clones, please also plot according to the number of reads.

size of shared clones = (total reads of shared clones in mouse 1 + total reads of shared clones in mouse 2) / (total reads of mouse 1 + total reads of mouse 2)
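
As a worked example, the formula above can be written as a short function. This is a sketch of the formula exactly as stated in this comment, with hypothetical names and toy read counts; it assumes each repertoire is available as a mapping from clone sequence to read count.

```python
def shared_clone_size(reads_1, reads_2):
    """Fraction of all reads (both mice) that belong to clones found in both mice.

    reads_1, reads_2: dicts mapping clone sequence -> number of reads
    """
    shared = set(reads_1) & set(reads_2)
    # Numerator: reads of the shared clones, summed over both mice
    shared_reads = sum(reads_1[clone] + reads_2[clone] for clone in shared)
    # Denominator: all reads of both mice
    total_reads = sum(reads_1.values()) + sum(reads_2.values())
    if total_reads == 0:
        return 0.0
    return shared_reads / total_reads


mouse_1 = {"cloneA": 60, "cloneB": 40}
mouse_2 = {"cloneB": 10, "cloneC": 90}
# Shared clone: cloneB with 40 + 10 reads; total reads: 100 + 100
print(shared_clone_size(mouse_1, mouse_2))  # 0.25
```

Because the denominator adds up raw read totals, a deeply sequenced mouse weighs more heavily in the result than a shallowly sequenced one.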

If possible, could you please export both the figure and an Excel table?

Thank you so so much!!

nilseling commented 6 years ago

The new heatmaps are in the Figures folder and the xlsx files in Results/Clones/Shared_clones. I marked the files for which I considered the size of the clone with "ConsSize"

chenhc720 commented 6 years ago

Wonderful! I only have a small question about the denominator in the clone size formula.

I think the summed read number of shared clones should not be subtracted in the denominator? It's different from the case of looking at the number of clones only, where the same clones are counted twice and thus need to be subtracted. We actually need to take the reads from both repertoires when calculating clone size. Please please correct me if I'm wrong! Thank you so much!!

nilseling commented 6 years ago

It is correct to subtract the shared clones in the denominator to calculate the Jaccard index. But to avoid problems with differing clone size, I now subsample the reads specifically for each comparison.
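
The two points in this reply can be illustrated with a short sketch. The Jaccard function below writes the denominator with the explicit subtraction, |A| + |B| - |A ∩ B|, so that shared clones are counted once. The subsampling part is only one plausible reading of "subsampling per comparison": drawing reads without replacement down to the depth of the smaller repertoire, the fixed seed and all names are assumptions, not the repository's implementation.

```python
import random
from collections import Counter


def jaccard(clones_a, clones_b):
    """Jaccard index with the shared clones subtracted in the denominator."""
    a, b = set(clones_a), set(clones_b)
    shared = len(a & b)
    denominator = len(a) + len(b) - shared  # shared clones counted only once
    return shared / denominator if denominator else 0.0


def subsample_reads(reads, depth, rng):
    """Draw `depth` reads without replacement and return clone -> read count."""
    pool = [clone for clone, count in reads.items() for _ in range(count)]
    return Counter(rng.sample(pool, depth))


def subsample_pair(reads_1, reads_2, seed=0):
    """Bring two repertoires to the same depth (assumed: the smaller total)."""
    rng = random.Random(seed)
    depth = min(sum(reads_1.values()), sum(reads_2.values()))
    return subsample_reads(reads_1, depth, rng), subsample_reads(reads_2, depth, rng)


mouse_1 = {"cloneA": 600, "cloneB": 400}  # 1000 reads
mouse_2 = {"cloneB": 10, "cloneC": 90}    # 100 reads
sub_1, sub_2 = subsample_pair(mouse_1, mouse_2)
print(sum(sub_1.values()), sum(sub_2.values()))  # both equal the smaller depth
print(jaccard(sub_1, sub_2))
```

Subsampling is repeated for each pair because the common depth depends on which two repertoires are being compared.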