immunomind / immunarch

🧬 Immunarch: an R Package for Fast and Painless Exploration of Single-cell and Bulk T-cell/Antibody Immune Repertoires
https://immunarch.com
Apache License 2.0

geneUsage over two Genes and on proportions instead of counts #80

Open Blowfish82 opened 4 years ago

Blowfish82 commented 4 years ago

❓ Questions and Help

Good Afternoon,

I am switching from tcR to immunarch, and I am missing two things in immunarch's gene usage analysis:

  1. I regularly used geneUsage(immdata$data, HUMAN_IGHV, .quant = "read.prop"). Now only NA or "count" is allowed in .quant. Where did the .quant = "read.prop" option go?

  2. I also used geneUsage over V-J combinations, for example geneUsage(immdata$data, list(HUMAN_IGKV, HUMAN_IGKJ)). Now I only see geneUsage over a single gene, for example geneUsage(immdata$data, "hs.trbv").

This is frustrating. These two things really need to be included.

All the best!

vadimnazarov commented 4 years ago

Hi @Blowfish82

Thank you a lot for using the package!

  1. The fix is very straightforward: use the .norm = TRUE argument together with .quant = "count" to get proportions instead of counts.

  2. Would you be willing to tell me more about how you plan to visualise the resulting table, or how you use it? We removed this option because the visualisation part was hard and impractical.
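As an illustration of point 1, here is a minimal base-R sketch of what "proportions instead of counts" means for gene usage. The toy data frame and its column names (V.name, Clones) are assumptions chosen to resemble an immunarch repertoire; this is not the package's internal implementation.

```r
# Toy repertoire in an assumed, immunarch-like layout:
# one row per clonotype, with its V gene and clone count.
repertoire <- data.frame(
  V.name = c("TRBV10-1", "TRBV10-1", "TRBV5-1", "TRBV7-2"),
  Clones = c(6, 2, 8, 4)
)

# Sum the clone counts per V gene (count-weighted usage).
usage <- aggregate(Clones ~ V.name, data = repertoire, FUN = sum)

# Normalise so that the values sum to 1 (proportions instead of counts).
usage$Proportion <- usage$Clones / sum(usage$Clones)
print(usage)

# With immunarch loaded, the reply above suggests a call along these lines
# (not run here):
#   geneUsage(immdata$data, "hs.trbv", .quant = "count", .norm = TRUE)
```

The manual version is only meant to make the normalisation step explicit; with real data the package call is the more convenient route.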

Blowfish82 commented 4 years ago

Hello,

Thank you very much for the fast answer. This function is crucial for us: we use it to explore which gene combinations are used more often in a disease group.

To visualize this, I melt (and acast) the resulting list into a matrix with all possible gene combinations as rows and one column per sample:

                    Sample1         Sample2         ...
TRBV10-1_TRBJ1-1    5.415615e-04    0.0006508386
TRBV10-1_TRBJ1-2    1.320882e-04    0.0000000000
TRBV10-1_TRBJ1-3    1.056705e-04    0.0000000000
TRBV10-1_TRBJ1-4    2.641764e-05    0.0000000000
TRBV10-1_TRBJ1-5    6.604409e-05    0.0000000000
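A matrix of this shape can be sketched in base R as follows. This is a possible workaround under stated assumptions, not the tcR or immunarch implementation: the input is assumed to be a named list of data frames with V.name, J.name and Clones columns, and vj_proportions is a hypothetical helper introduced only for this example.

```r
# Toy input: a named list of repertoires, one data frame per sample.
# The column names V.name, J.name and Clones are assumptions.
samples <- list(
  Sample1 = data.frame(
    V.name = c("TRBV10-1", "TRBV10-1", "TRBV10-1"),
    J.name = c("TRBJ1-1", "TRBJ1-2", "TRBJ1-1"),
    Clones = c(3, 1, 2)
  ),
  Sample2 = data.frame(
    V.name = c("TRBV10-1", "TRBV5-1"),
    J.name = c("TRBJ1-1", "TRBJ2-7"),
    Clones = c(5, 5)
  )
)

# Hypothetical helper: clone-weighted proportion of each V-J pair in one sample.
vj_proportions <- function(rep_df) {
  pair <- paste(rep_df$V.name, rep_df$J.name, sep = "_")
  counts <- tapply(rep_df$Clones, pair, sum)
  counts / sum(counts)
}

per_sample <- lapply(samples, vj_proportions)

# Union of all pairs seen in any sample; pairs absent from a sample get 0.
all_pairs <- sort(unique(unlist(lapply(per_sample, names))))
vj_matrix <- sapply(per_sample, function(p) {
  out <- setNames(numeric(length(all_pairs)), all_pairs)
  out[names(p)] <- p
  out
})
print(vj_matrix)
```

Each column of vj_matrix sums to 1. Real repertoires may contain ambiguous gene assignments (several genes in one field), which this sketch does not handle.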

We inspect this table and, of course, run a PCA on it and plot the first two components, with each sample shown as a point colored according to its group.
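The PCA step could look roughly like this in base R. The matrix below is filled with random toy values purely for illustration, and the group labels are made up; only prcomp and base plotting functions are used.

```r
set.seed(1)

# Toy matrix: rows = V-J pairs, columns = samples (random values, illustration only).
vj_matrix <- matrix(
  runif(5 * 6), nrow = 5,
  dimnames = list(paste0("pair", 1:5), paste0("Sample", 1:6))
)
# Rescale each column to sum to 1, mimicking usage proportions.
vj_matrix <- sweep(vj_matrix, 2, colSums(vj_matrix), "/")

# Hypothetical group label for each sample.
group <- factor(rep(c("control", "disease"), each = 3))

# prcomp expects observations (samples) in rows, so transpose the matrix.
pca <- prcomp(t(vj_matrix), center = TRUE, scale. = FALSE)
scores <- pca$x[, 1:2]  # coordinates on the first two principal components

# Draw the scatter plot only in an interactive session.
if (interactive()) {
  plot(scores, col = as.integer(group), pch = 19, xlab = "PC1", ylab = "PC2")
  legend("topright", legend = levels(group),
         col = seq_along(levels(group)), pch = 19)
}
```

Whether to scale the features before the PCA is a design choice; the sketch leaves them unscaled so that rarely used combinations do not dominate.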

image

So, to recap: with this function missing, I'm only able to analyze single-gene usage. It would be great if you could include it again. Or is there a workaround? Even without any visualisation from your side, just the possibility to run geneUsage over two genes would be great!

All the Best! And thanks in advance.

vadimnazarov commented 4 years ago

Hi @Blowfish82

Very helpful, thank you so much! Yes, we can definitely do that, and even streamline the geneUsage > PCA > visualisation process to make it much easier and reduce the amount of hand-written code. Let me plan a bit, and I will update you here on the progress.

@Blowfish82, I believe you have a very interesting and unorthodox approach to immune repertoire analysis. Would you be willing to have a short 15-minute interview with our intern so we can see how we can help you even more with our package? Pinging @EugeneRumynskiy for communication purposes.