Closed lucygarner closed 2 years ago
Hi, Lucy! My name is Aleksandr Popov, I am a developer of the Immunarch package. Thank you for using our software!
You can load pre-computed clonotypes into Immunarch and calculate the Gini coefficient with the following commands:
library("immunarch")
immdata <- repLoad("/my/data/clonotypes.tsv")
gini <- repDiversity(.data=immdata$data, .method="gini")
Replace the path in the repLoad call with the path to your clonotypes file.
The file formats supported by the repLoad function are listed here: https://immunarch.com/articles/v2_data.html#input-output-1
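If the pre-computed clonotypes already sit in an R data frame rather than in one of those file formats, one possible route is to reshape them to the column layout shown in the built-in example. The sketch below is only an illustration: the input table `precomputed`, its column names, and its sequences are made-up placeholders, and whether repDiversity accepts a hand-built table with only these columns is an assumption that should be checked against the Immunarch documentation.

```r
# Hypothetical pre-computed clonotype table; the column names and the
# sequences are illustrative placeholders, not real data.
precomputed <- data.frame(
  cdr3_aa = c("CASSQETQYF", "CASSLDRGYEQYF", "CATSTNEKLFF"),
  v_gene  = c("TRBV4-1", "TRBV5-1", "TRBV15"),
  j_gene  = c("TRBJ2-5", "TRBJ2-7", "TRBJ1-4"),
  n_cells = c(12, 30, 3),
  stringsAsFactors = FALSE
)

# Rename to column names that appear in the immdata example output
# and derive Proportion from the clone counts.
sample_tbl <- data.frame(
  Clones     = precomputed$n_cells,
  Proportion = precomputed$n_cells / sum(precomputed$n_cells),
  CDR3.aa    = precomputed$cdr3_aa,
  V.name     = precomputed$v_gene,
  J.name     = precomputed$j_gene,
  stringsAsFactors = FALSE
)

# Sort by clone size, largest first, as in the example output.
sample_tbl <- sample_tbl[order(sample_tbl$Clones, decreasing = TRUE), ]
rownames(sample_tbl) <- NULL

# One named element per sample, mirroring the structure of immdata$data.
my_data <- list(sample1 = sample_tbl)

# Assumption: repDiversity() works on a table with only these columns.
# library("immunarch")
# repDiversity(.data = my_data, .method = "gini")
```

The repDiversity call is left commented out so that the reshaping step can be run and inspected on its own first.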
Here is an example of what the data looks like after loading into Immunarch. Immunarch has a built-in example dataset, which can be loaded with the command data(immdata):
> library("immunarch")
> data(immdata)
> immdata$data
$`A2-i129`
# A tibble: 6,532 x 15
Clones Proportion CDR3.nt CDR3.aa V.name D.name J.name V.end D.start D.end
<dbl> <dbl> <chr> <chr> <chr> <chr> <chr> <int> <int> <int>
1 173 0.0204 TGCGCCAGC… CASSQE… TRBV4… TRBD1 TRBJ2… 16 18 26
2 163 0.0192 TGCGCCAGC… CASSYR… TRBV4… TRBD1 TRBJ2… 11 13 18
3 66 0.00776 TGTGCCACC… CATSTN… TRBV15 TRBD1 TRBJ2… 11 16 22
4 54 0.00635 TGTGCCACC… CATSIG… TRBV15 TRBD2 TRBJ2… 11 19 25
5 48 0.00565 TGTGCCAGC… CASSPW… TRBV27 TRBD1 TRBJ1… 11 16 23
6 48 0.00565 TGCGCCAGC… CASQGD… TRBV4… TRBD1 TRBJ1… 8 13 19
7 40 0.00471 TGCGCCAGC… CASSQD… TRBV4… TRBD1 TRBJ2… 16 21 26
8 31 0.00365 TGTGCCAGC… CASSEE… TRBV2 TRBD1 TRBJ1… 15 17 20
9 30 0.00353 TGCGCCAGC… CASSQP… TRBV4… TRBD1 TRBJ2… 14 23 28
10 28 0.00329 TGTGCCAGC… CASSWV… TRBV6… TRBD1 TRBJ2… 12 20 25
# … with 6,522 more rows, and 5 more variables: J.start <int>, VJ.ins <dbl>,
# VD.ins <dbl>, DJ.ins <dbl>, Sequence <lgl>
$`A2-i131`
# A tibble: 6,553 x 15
Clones Proportion CDR3.nt CDR3.aa V.name D.name J.name V.end D.start D.end
<dbl> <dbl> <chr> <chr> <chr> <chr> <chr> <int> <int> <int>
1 111 0.0131 TGCAGTGCT… CSASRG… TRBV2… TRBD1 TRBJ2… 11 12 17
2 93 0.0109 TGTGCCAGC… CASSVA… TRBV9 TRBD1 TRBJ2… 15 21 23
3 66 0.00776 TGTGCCAGC… CASSRM… TRBV13 TRBD1 TRBJ2… 11 18 24
4 59 0.00694 TGTGCCAGC… CASSPT… TRBV6… TRBD2 TRBJ2… 10 14 19
5 57 0.00671 TGCGCCAGC… CASSLD… TRBV5… TRBD2 TRBJ1… 15 17 20
6 47 0.00553 TGTGCCAGC… CASRGL… TRBV6… TRBD2 TRBJ2… 10 11 16
7 46 0.00541 TGCAGCGTT… CSVTGV… TRBV2… TRBD1 TRBJ2… 8 9 13
8 30 0.00353 TGTGCCAGC… CASSYL… TRBV6… TRBD2 TRBJ1… 15 17 19
9 29 0.00341 TGTGCCAGC… CASSLA… TRBV5… TRBD1 TRBJ1… 15 21 26
10 29 0.00341 TGTGCCAGC… CASSYI… TRBV6… TRBD1 TRBJ1… 14 17 20
# … with 6,543 more rows, and 5 more variables: J.start <int>, VJ.ins <dbl>,
# VD.ins <dbl>, DJ.ins <dbl>, Sequence <lgl>
...
And this is the result of the Gini coefficient calculation for the example data:
> gini <- repDiversity(.data=immdata$data, .method="gini")
> gini
[,1]
A2-i129 0.2297097
A2-i131 0.2252784
A2-i133 0.2513861
A2-i132 0.2017009
A4-i191 0.3863010
A4-i192 0.3064599
MS1 0.3610387
MS2 0.1561629
MS3 0.2396675
MS4 0.1224806
MS5 0.3320779
MS6 0.1278508
attr(,"class")
[1] "immunr_gini" "matrix" "array"
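For intuition about what these values measure, here is a minimal base-R sketch of the textbook Gini coefficient computed from a plain vector of clone counts (for example, the Clones column of one sample). The function name `gini_from_counts` is hypothetical, and this is not necessarily the exact computation repDiversity performs, so small differences from the values above are possible.

```r
# Textbook Gini coefficient for a vector of non-negative clone counts.
# 0 means all clonotypes are equally abundant; values closer to 1 mean
# a few clonotypes dominate the repertoire.
gini_from_counts <- function(counts) {
  x <- sort(as.numeric(counts))  # ascending order
  n <- length(x)
  # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
  2 * sum(seq_len(n) * x) / (n * sum(x)) - (n + 1) / n
}

even_repertoire   <- gini_from_counts(c(10, 10, 10, 10))  # perfectly even
skewed_repertoire <- gini_from_counts(c(37, 1, 1, 1))     # one dominant clonotype

print(even_repertoire)
print(skewed_repertoire)
```

The only input needed is one number per clonotype (its count or frequency); sequences and gene names play no role in this metric.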
Best regards, Aleksandr
Hi, Lucy! We are closing this issue due to inactivity. You are welcome to comment and reopen the issue if there are still unresolved questions.
Hi,
I would like to calculate diversity metrics (e.g. the Gini coefficient) using pre-computed clonotypes, which I have calculated using paired TCR alpha and TCR beta single-cell TCR-seq data. Is there a way to do this within immunarch? As an alternative, I have seen that you can calculate the Gini coefficient using the DescTools package; however, I am not clear on what input it requires.
Best wishes, Lucy