Closed bcorrie closed 1 year ago
Currently we have the following Repertoire level stats for Rearrangements:
/irplus/v1/stats/rearrangement/count
/irplus/v1/stats/rearrangement/junction_length
/irplus/v1/stats/rearrangement/gene_usage
Do we have these for Clones? Assuming yes...
What else do we want/need?
Assuming we want a Diversity statistic - see ireceptor-plus/specifications#77
What else?
yes
also V gene usage
U
On Jan 5, 2021, at 10:18 PM, Brian Corrie notifications@github.com wrote:
Currently we have the following Repertoire level stats for Rearrangements:
/irplus/v1/stats/rearrangement/count
/irplus/v1/stats/rearrangement/junction_length
/irplus/v1/stats/rearrangement/gene_usage
Do we have these for Clones? Assuming yes...
Assuming for clones we have:
/irplus/v1/stats/clone/count
/irplus/v1/stats/clone/junction_length
/irplus/v1/stats/clone/gene_usage
gene_usage gives gene usage for V/D/J/C genes at the subgroup, gene, and allele levels.
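To make the three gene_usage levels concrete, here is a minimal sketch of how a single IMGT-style allele call maps onto subgroup, gene, and allele. The function name gene_levels is hypothetical, and the parsing is a simplification that only handles the common SUBGROUP-GENE*ALLELE form (for example IGHV1-69*01); it is not how the Stats API itself computes these levels.

```python
def gene_levels(call: str) -> dict:
    """Split an allele call such as 'IGHV1-69*01' into its three levels.

    Assumption: the call follows the common 'SUBGROUP-GENE*ALLELE' pattern.
    Unusual names (orphons, distal duplicates, etc.) are not handled here.
    """
    allele = call.strip()
    # Everything before '*' is the gene name.
    gene = allele.split("*", 1)[0]
    # Everything before the first '-' is the subgroup.
    subgroup = gene.split("-", 1)[0]
    return {"subgroup": subgroup, "gene": gene, "allele": allele}


levels = gene_levels("IGHV1-69*01")
print(levels)
```

A usage count at each level is then just a tally of the corresponding value across all clones in a repertoire.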
If we agree we need diversity (see ireceptor-plus/specifications#77) then we also have:
/irplus/v1/stats/clone/diversity
Sorry, no idea what I was thinking there. How about a functional/total clone ratio? (I am assuming count is total functional.)
Note for clarity and completeness, each API takes a JSON payload as parameters to specify both the set of Repertoires you want the stats for AND the specific Statistics you want...
Using the API:
/irplus/v1/stats/clone/gene_usage
With the JSON payload:
{
"repertoires":[{"repertoire_id":"REP1"}],
"statistics":["v_subgroup", "v_gene"]
}
Would get you the V subgroup and gene usage Stats for Repertoire REP1
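As a rough illustration of the call above, the following sketch builds the same request in Python using only the standard library. The host name in BASE_URL and the helper build_stats_request are hypothetical, and the use of POST is an assumption (the thread only says the API takes a JSON payload). The request is constructed but not sent.

```python
import json
import urllib.request

# Hypothetical host: the thread only gives the path, not a server.
BASE_URL = "https://stats.example.org"


def build_stats_request(entry_point, repertoire_ids, statistics, base_url=BASE_URL):
    """Build (but do not send) a Stats API request for a set of repertoires."""
    payload = {
        # Same shape as the payload in the thread: one object per repertoire.
        "repertoires": [{"repertoire_id": rid} for rid in repertoire_ids],
        "statistics": list(statistics),
    }
    return urllib.request.Request(
        base_url + entry_point,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: JSON payloads are sent via POST
    )


request = build_stats_request(
    "/irplus/v1/stats/clone/gene_usage",
    ["REP1"],
    ["v_subgroup", "v_gene"],
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Sending it would be a matter of passing the request to urllib.request.urlopen against a real Stats API server.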
Sorry, no idea what I was thinking there. How about a functional/total clone ratio? (I am assuming count is total functional.)
For /count we can define different count statistics. For rearrangements we defined four counts:
So we can do the same for clones (since each clone has a clone_count)
So one could ask for clone_count and clone_count_productive and then compute the ratio.
Given that, we probably don't need a separate clone_ratio???
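A sketch of that client-side computation, assuming the two counts have already been retrieved for a repertoire. The function name productive_clone_ratio is hypothetical, and how the counts are extracted from the API response is left out because the response format is not described in this thread.

```python
def productive_clone_ratio(clone_count, clone_count_productive):
    """Return productive/total clones, or None when there are no clones."""
    if clone_count <= 0:
        # Avoid dividing by zero for empty repertoires.
        return None
    return clone_count_productive / clone_count


# Made-up counts, for illustration only.
print(productive_clone_ratio(200, 150))
```

Keeping the ratio on the client side means the API only has to expose the two counts.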
I have added a /stats/clone/mutations entry point on the clone-stats branch in discussion with @ajrocha and @systemimmunologylab
This is a bit different than our other stats entry points as it takes a subject_id rather than a <repertoire_id, sample_processing_id, data_processing_id> triple as input. Thoughts?
Using the API:
/irplus/v1/stats/clone/mutations
With the JSON payload:
{
"subjects": [
{ "subject": { "subject_id": "SUBJECT_1" }},
{ "subject": { "subject_id": "SUBJECT_2" }}
],
"statistics": [ "total", "unique" ]
}
Will get you the mutation stats for SUBJECT_1 and SUBJECT_2. The exact format of the returned stats is yet to be defined (to be provided by @systemimmunologylab).
And yes, the subjects object in the request is needlessly complex, but it currently mirrors the repertoires object in other stats and I didn't want to remove that structure until we agreed that we want these stats at the subject level (and not the repertoire level).
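For reference, the nested subjects structure described above can be generated from a plain list of subject IDs. This is only a sketch; build_mutation_payload is a hypothetical helper, and the payload shape is copied from the example in this comment.

```python
import json


def build_mutation_payload(subject_ids, statistics):
    """Wrap each subject ID in the nested form used by the request above."""
    return {
        "subjects": [{"subject": {"subject_id": sid}} for sid in subject_ids],
        "statistics": list(statistics),
    }


payload = build_mutation_payload(["SUBJECT_1", "SUBJECT_2"], ["total", "unique"])
print(json.dumps(payload, indent=2))
```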
This is a bit different than our other stats entry points as it takes a subject_id rather than a <repertoire_id, sample_processing_id, data_processing_id> triple as input. Thoughts?
subject_id is not unique
The /stats/clone entry seems to imply it's mutations for clones
For contrast, here's the rough workflow when I use the Immcantation suite for B cell somatic hypermutation analysis:
It's also possible that you might want mutation counts for unproductive rearrangements, as these might be considered mutations under the null model, unaffected by the selection process of affinity maturation.
@schristley thanks, that is helpful. The subject_id is being driven by @systemimmunologylab's use case, so I am not sure what the expected output would be. I "think" it might be similar to your last bullet point, total mutations for all clones in a subject. Just not sure how that is represented...
I suppose we would need to determine how to resolve that in the ADC context. Maybe it boils down to getting all of the repertoires from a subject in a study, and then have the Stats API return repertoire level stats (as they normally do) and then if you want subject level mutation counts then you sum up across all the repertoires for that subject... Not sure if that makes sense or not.
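The "sum up across all the repertoires for that subject" idea could look roughly like the sketch below. Everything about the data shape is an assumption: the function sum_by_subject, the per-repertoire count dictionary, and the repertoire-to-subject mapping are all hypothetical, since the format of the mutation stats has not been defined.

```python
from collections import defaultdict


def sum_by_subject(repertoire_counts, repertoire_to_subject):
    """Roll repertoire-level counts up to one total per subject."""
    totals = defaultdict(int)
    for repertoire_id, count in repertoire_counts.items():
        subject_id = repertoire_to_subject[repertoire_id]
        totals[subject_id] += count
    return dict(totals)


# Made-up counts and mapping, for illustration only.
counts = {"REP1": 10, "REP2": 5, "REP3": 7}
subjects = {"REP1": "SUBJECT_1", "REP2": "SUBJECT_1", "REP3": "SUBJECT_2"}
print(sum_by_subject(counts, subjects))
```

Note that simple summation only makes sense for additive statistics such as a total count; a "unique" count cannot be summed this way if the same mutation can occur in more than one repertoire of a subject.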
@systemimmunologylab we need some feedback here...
In the recent discussion, there was a mention of conservative vs non-conservative amino acid changes, though it wasn't clear how those were defined. Shazam allows mutations to be defined based upon amino acid properties. Does one or more of these properties match the conservative/non-conservative definition?
And if so, is there a particular reason why we would define/provide just one? Why not all? Also, why not mutations that change the amino acid, regardless of the amino acid properties?
Yellow is hydrophobic/buried, red is hydrophilic/surface, and blank is neutral/intermediate.
A substitution within a category is conservative; a substitution between categories is non-conservative.
This is the first paper using these definitions: Hershberg, U. and Shlomchik, M. J. (2006) Differences in potential for amino acid change following mutation reveals distinct strategies for kappa and lambda light chain variation. PNAS Vol. 103 No. 43 pp. 15963-8.
The Chothia reference is number 20 in this paper.
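The within-category vs between-category rule described above can be sketched as a small lookup. The function classify_substitution and the table EXAMPLE_CATEGORIES are hypothetical, and the table is a small illustrative subset that is NOT taken from the cited paper; a real implementation would load the full three-category table from that reference.

```python
# Illustrative subset only. Assumption: these assignments are placeholders,
# not the classification table from the cited paper.
EXAMPLE_CATEGORIES = {
    "I": "hydrophobic", "L": "hydrophobic", "V": "hydrophobic",
    "D": "hydrophilic", "E": "hydrophilic", "K": "hydrophilic",
    "G": "neutral", "S": "neutral",
}


def classify_substitution(before, after, categories=EXAMPLE_CATEGORIES):
    """Classify an amino acid change by category membership.

    Raises KeyError if either residue is missing from the table.
    """
    if before == after:
        return "silent"  # no amino acid change at all
    if categories[before] == categories[after]:
        return "conservative"  # change stays within one category
    return "non-conservative"  # change crosses categories


print(classify_substitution("I", "L"))
print(classify_substitution("I", "D"))
```

Passing the table as a parameter keeps the rule separate from the classification, so other amino acid property schemes could be swapped in.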
Did not implement Stats for Clones as part of the project, closing this issue.
We have a list of Stats that we have implemented for v1 of the Stats API for rearrangements.
Do we need all of these for Clones?
What else do we need for Clones?