ga4gh / va-spec

An information model for representing variant annotations.

Therapeutic Response Annotation Definition and Scope #31

Open mbrush opened 5 years ago

mbrush commented 5 years ago

Initial thoughts and proposals are outlined below, based on info collected in the requirements doc here, and related tickets.


Definition: A statement about the utility of a variant as a predictor of how patients with a particular condition may respond to a particular therapeutic intervention for that condition.

Scope: Minimally this should allow description of the general therapeutic efficacy of the treatment (e.g. 'sensitivity' vs 'resistance'). This is the primary use case to support data from knowledgebases like CIViC (e.g. this assertion that "BRAF V600E predicts sensitivity to [Trametinib and Dabrafenib] in Melanoma"). But considering the notion of 'response to treatment' more broadly, there are other things we might capture in this (or a related) VA type, e.g. :

Key questions to resolve for us are which of these categories of statements do we support in our efforts, and for those supported, how to lump or split them into separate VA types.

In issue #9 @malachig nicely lays out considerations around scoping to cover these different cases - and benefits of using a common structure and semantics for representing them, but not necessarily using a single VA type in doing so. There are pros and cons to lumping vs splitting VA types here, but as Malachi points out "researchers working on identifying positive predictors or response are generally quite distinct from those working on predictors or adverse response" . . so having separate VA types for these categories of statements could be more immediately intuitive to these two communities.

Below is an initial list of the types of information/elements necessary to represent the core statement made in Therapeutic Response annotations (e.g. statements like "Somatic Variant X predicts/confers sensitivity to treatment of Condition Y with Drug Z")

Straw man proposals for how we might model the diversity of response categories as a single/collapsed VA type, vs a split into two VA types, are outlined below.

mbrush commented 5 years ago

A few initial questions/issues to discuss:

  1. Variant origin - guessing this is typically somatic, but can be germline?

  2. How broad is the scope of this VA type (see above), and what other VA types will be needed to cover out-of-scope statements? Settling this will inform the most pragmatic modeling approach.

  3. Modeling: should we capture specific response in predicate, or in a qualifier? May depend on what we decide about scope above.

  4. Data Examples: We have plenty of examples of general therapeutic effect annotations (sensitivity/resistance) from sources like CIViC, OncoKB, MyCancerGenome. But no examples of the other categories of therapeutic response (e.g. side effects, pharmaco-toxicity, more specific beneficial responses). These will be important for deciding modeling approach and patterns.

ahwagner commented 5 years ago

1) Yes, typically somatic, but occasionally germline.

2) When I think of this annotation type, it is always of statements related to clinical outcome: drug resistance/sensitivity, adverse response, etc. The details of what causes this adverse response/what the adverse response is are highly variable, and are not formalized in the VICC KBs. I would argue for a more conservative scope at first, and broaden/create another VA type if we feel that we're inadequately meeting the needs of a specific annotation type.

Also, I think that we should require a condition/indication of (1..1); worth discussing further on call tomorrow.

mbrush commented 5 years ago

Adding straw man here for what things might look like if we split into 2 or 3 closely aligned VA types, vs lump everything into a single annotation type. Both approaches should be capable of representing basic therapeutic effect (sensitivity vs resistance), and also more granular characterizations - including: (1) how a tumor/patient shows sensitivity to a drug (e.g. apoptotic death of tumor cells); (2) side effects that can result from the treatment (e.g. neutropenia, heart palpitations); and (3) aspects of the pharmacokinetics or toxicity of a drug (e.g. 'intermediate metabolizer'). . . (if we decide all three are in scope for VA efforts).

Split Model Straw Man

Creates 2-3 similarly structured VA types: 'Variant Therapeutic Response' for assertions about general sensitivity/resistance, 'Variant Adverse Response' for assertions about side effects, and likely a third 'Variant-PGx-Response' for assertions related to drug metabolism and toxicity. In each, a relatively high-level predicate is paired with the predictedResponseQualifier that allows more precise characterization of the nature of the response, if desired.

1. Variant Therapeutic Response: def = A statement about a variant as a predictor of the general efficacy of a treatment in patients with a particular disease (i.e. whether it predicts sensitivity or resistance to the treatment)

2. Variant Adverse Response (aka Variant Side Effect):

3. Variant PGx Response:


Lumped Model Straw Man

Note that because we structure each split annotation in a similar way above, it would be relatively easy to collapse if we chose to do so.
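To illustrate why collapsing would be straightforward, here is a minimal Python sketch. It assumes the split types share the same fields, as proposed above. The class names, the `response_category` discriminator, the category labels, and `to_lumped` are hypothetical and not part of any agreed model; only the three VA type names and the idea of a `predictedResponseQualifier` come from the straw man.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical category labels, one per split VA type named in the straw man.
SPLIT_TYPE_TO_CATEGORY = {
    "VariantTherapeuticResponse": "therapeutic_response",
    "VariantAdverseResponse": "adverse_response",
    "VariantPGxResponse": "pgx_response",
}


@dataclass(frozen=True)
class SplitAnnotation:
    """A record of one of the split VA types (illustrative fields only)."""
    va_type: str                                   # a key of SPLIT_TYPE_TO_CATEGORY
    variant: str
    predicate: str                                 # relatively high-level predicate
    therapy: str
    predicted_response_qualifier: Optional[str] = None


@dataclass(frozen=True)
class LumpedAnnotation:
    """A single collapsed VA type; the category replaces the type name."""
    response_category: str
    variant: str
    predicate: str
    therapy: str
    predicted_response_qualifier: Optional[str] = None


def to_lumped(a: SplitAnnotation) -> LumpedAnnotation:
    """Collapse a split record by turning its type into a discriminator field."""
    return LumpedAnnotation(
        response_category=SPLIT_TYPE_TO_CATEGORY[a.va_type],
        variant=a.variant,
        predicate=a.predicate,
        therapy=a.therapy,
        predicted_response_qualifier=a.predicted_response_qualifier,
    )
```

Because every other field is copied unchanged, the only information the lumped form has to add is which category of response the statement is about.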


My initial feeling on this is to split:

Of the three proposed VA types, I would prioritize generic Therapeutic Response annotation. It seems simplest, most immediately/broadly useful, most relevant to drivers (e.g. VICC), and what our initial requirements analysis was based on.

larrybabb commented 5 years ago

My only experience with VA types related to pharmacogenetics is the CPIC / PharmGKB work that is the most well-known standard for establishing knowledge of gene-to-drug associations. I am not well-versed in the origins and use of this data but I think it is focused strictly on germline knowledge of gene haplotype & diplotype annotations which can be used to assess patients/subjects to determine the predicted phenotype of a patient in the context of a drug or class of drugs. My understanding is that they can capture haplotype annotations which can predict the level of "function" of the gene's protein as well as the diplotype annotations which can be used to predict things like drug metabolism, drug efficacy (responsiveness thru resistance), drug toxicity, etc.

@rrfreimuth is well-versed in the PGx CPIC world and should be included in this discussion.

It might be good to clarify if we want to differentiate these "somatic" or "cancer" related therapeutic response annotations from the germline world standards and vocabulary established by CPIC.

If we want to coordinate these annotations with the somewhat mature work that has been developed in the germline community we may want to start by looking at PharmVar, understand the scope of annotations, history and baseline standards they have developed. This way we may avoid inventing an alternate standard for overlapping or similar data sets.

Other helpful background

NOTE: you may find it informative to check out Tables S2, S3 and S4 listed in the Tables and figures included in the supplement section of the example above.

dsonkin commented 5 years ago

Sorry, if it was already discussed, model should also support combination of variants and also a concept of wild type status.
For example: "Variant X AND Variant T predicts/confers sensitivity to treatment of Condition Y with Drug Z" "Variant X AND WT_KRAS predicts/confers sensitivity to treatment of Condition Y with Drug Z"
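A minimal Python sketch of how such a combination might be represented, assuming a simple logical AND over components where each component is either a variant that must be present or a gene that must be wild type. `ProfileComponent`, `MolecularProfile`, and `matches` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class ProfileComponent:
    """One term of the AND expression (hypothetical structure)."""
    name: str                 # a variant label, or a gene symbol when wild_type is True
    wild_type: bool = False   # True means "this gene must be wild type"


@dataclass(frozen=True)
class MolecularProfile:
    """All components must hold at once (logical AND)."""
    components: Tuple[ProfileComponent, ...]

    def matches(self, variants: FrozenSet[str], wild_type_genes: FrozenSet[str]) -> bool:
        # Wild-type components are checked against the wild-type genes,
        # all other components against the observed variants.
        return all(
            (c.name in wild_type_genes) if c.wild_type else (c.name in variants)
            for c in self.components
        )


# "Variant X AND WT_KRAS", using the placeholder names from the example above.
profile = MolecularProfile((
    ProfileComponent("Variant X"),
    ProfileComponent("KRAS", wild_type=True),
))
```

An annotation could then take a `MolecularProfile` as its subject in place of a single variant, which keeps the rest of the statement unchanged.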

arpaddanos commented 5 years ago

In CIViC drug response (ignoring pharmacogenomic right now) we have 6 conditions. (supports,does not support) + (sensitivity, reduced sensitivity, resistance). I have guidelines that I use and that I have encouraged those I train to use. "supports + sensitivity" is used if a variant shows sensitivity in comparison to a wt control, or in a case study if a patient with a variant responds (although in this case one lacks the wt control, but often a response + variant can be meaningful if there is not much else to go on so I use supports sensitivity here). L858R is an example here. "Supports + resistance" means that a variant shows resistance compared to controls that lack that variant. That can be in the background of a sensitizing mutation (T790M with L858R) or on a wt background that usually would respond to a non-variant-targeted drug like chemotherapy:

https://civicdb.org/search/evidence/3a55c849-bf04-42bb-a0f3-d4f629f635aa

"Does not support" statements become a bit more subtle but IMO very useful. "Does not support + sensitivity" can be used for example when usually variants in a given gene are associated with inducing sensitivity to a drug, but then one finds a variant that does not seem to induce this sensitivity. (Here one can get that finding without control. Like a case study of a patient with an EGFR variant that did not respond to first line erlotinib would merit making a case study level EID for "does not support sensitivity" for that variant in NSCLC with erlotinib). (EID=evidence item - the fundamental curation unit in CIViC)

"Does not support resistance" has a similar thinking - when say one has a wt background that responds to a drug, but one has found some variants in a gene that seem to be associated with resistance to that drug. Now one encounters a new variant in that gene and one naturally asks will this one induce resistance? And if one finds patients with that variant that do respond, then one makes does not support resistance EIDs for that variant disease and drug combination. We have made a bunch of these in CIViC for SNPs found on BCR-ABL fusions.

Finally there is an interesting case of "does not support reduced sensitivity". This is a perfect EID type for noninferiority studies. If one has a study that shows that (in the context of some variant) drug Y is not inferior to drug X, then for that variant one can make the EID does not support reduced sensitivity for drug Y.
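The six conditions described above are the cross product of two small value sets. A minimal Python sketch follows; the enum names `Direction` and `Significance` and the helper `all_conditions` are illustrative labels, not CIViC's actual schema.

```python
from enum import Enum
from itertools import product


class Direction(Enum):
    SUPPORTS = "supports"
    DOES_NOT_SUPPORT = "does not support"


class Significance(Enum):
    SENSITIVITY = "sensitivity"
    REDUCED_SENSITIVITY = "reduced sensitivity"
    RESISTANCE = "resistance"


def all_conditions() -> list:
    """Enumerate every (direction, significance) pair as a readable label."""
    return [f"{d.value} {s.value}" for d, s in product(Direction, Significance)]
```

Keeping the two axes separate means a negative finding is recorded by flipping the direction, without adding a new significance term.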

DavidTamborero commented 5 years ago

(A) +1 to @arpaddanos that it is important that the model contemplates negative evidence. In the CGI, we used the 'no responsive' label for those cases (as opposed to 'responsive' and 'resistance'), but the CIViC model of adding 'support/does not support' plus the term is indeed more flexible

(B) another potential issue is whether the term itself should include the level of evidence; I've seen people referring to e.g. 'sensitivity' vs 'response' when referring to pre-clinical vs clinical supporting data, respectively. My approach is to use the same term (e.g. 'response') and just state different levels of evidence (pre-clinical, case report, early clinical trial, late clinical trial, guidelines etc)

(C) I do not have strong opinions on whether it is necessary to split into different models depending on the drug 'context', since this may come down to technical considerations I'm not that familiar with. However, I have to say that these three contexts are quite similar and I would vote to use a single one (especially when the pathogenic model in cancer vs germline conditions --which are much more divergent in my opinion-- has been fused!)
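The approach in (B), one response term with the evidence level held in its own field, might look like the following Python sketch. The evidence levels are the ones listed in (B); the class names, field names, and the `same_effect` helper are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class EvidenceLevel(Enum):
    PRECLINICAL = "pre-clinical"
    CASE_REPORT = "case report"
    EARLY_CLINICAL_TRIAL = "early clinical trial"
    LATE_CLINICAL_TRIAL = "late clinical trial"
    GUIDELINES = "guidelines"


@dataclass(frozen=True)
class ResponseEvidence:
    variant: str
    therapy: str
    response_term: str            # the same term is used regardless of setting
    evidence_level: EvidenceLevel


def same_effect(a: ResponseEvidence, b: ResponseEvidence) -> bool:
    """Two records assert the same effect if they differ only in evidence level."""
    return (a.variant, a.therapy, a.response_term) == (b.variant, b.therapy, b.response_term)
```

With this layout, a pre-clinical record and a late clinical trial record for the same variant and drug carry the same term and can be grouped without translating between 'sensitivity' and 'response'.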

ahwagner commented 5 years ago

(A) +1 to @arpaddanos and @DavidTamborero on (Support / Does not support) for describing evidence for a VTR annotation. The value set Arpad proposed {predicts_sensitivity_to, predicts_resistance_to, predicts_reduced_sensitivity} is also good, but we need to clearly define the distinction between "resistance" (the reduced efficacy of a therapy for a disease compared to reference sequence) and "reduced sensitivity" (the reduced efficacy of a therapy for a disease when compared to another therapy, in the presence of the associated variant).

(B) I agree with David's approach; removing an inferred evidence level from the predicate makes recording of these variants less error prone (terminology-naïve users will be less likely to incorrectly choose sensitivity vs. response as they can seem like apparent synonyms).

(C) I think that a split model is good. In particular, for PGx / VAR (these might be combinable)? there isn't necessarily a disease context. For VTR I would argue there must be a diseaseContextQualifier field, as the predicate describes the response of the disease to an indicated therapy. Splitting VTR from the other contexts will help in enforcing this in the data model. VTR, similarly, wouldn't particularly benefit from a required predictedResponseQualifier (and in my experience, isn't modeled or captured in existing data), though it would be necessary for PGx / VAR. This is because the response is usually strongly tied to the disease. My mindset is on cancers, though; perhaps there are diseases where there are multiple modes of "therapy", e.g. psychological disorders that are potentially treatable by multiple different molecular pathways. However, at that point we'd be modeling the mechanism of action for drugs, which I think is out of scope for VTR; most clinicians simply want to know if patients with a specific variant and disease respond relatively well or poorly to indicated therapies for the disease.

(D) @dsonkin mentioned how multiple associated variants are required for these annotations. I think that's in the scope of variant modeling, and should be addressed by VR. To his point, though, it's a common variant "type" and should be captured either as a single "compound variant" / "molecular profile", or as multiple atomic variants grouped at the VA level.
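The cardinality argument in (C) can be sketched as a validation rule in Python. This assumes a disease context is required for VTR and a predicted response qualifier is required for PGx / VAR, as argued above. The field names follow `diseaseContextQualifier` and `predictedResponseQualifier` from this thread; the `Annotation` class, the short type codes, and `validation_errors` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Annotation:
    va_type: str                                   # "VTR", "VAR", or "PGx" (illustrative codes)
    variant: str
    predicate: str
    therapy: str
    disease_context_qualifier: Optional[str] = None
    predicted_response_qualifier: Optional[str] = None


def validation_errors(a: Annotation) -> List[str]:
    """Return the cardinality violations for one annotation (empty list if valid)."""
    errors = []
    if a.va_type == "VTR" and not a.disease_context_qualifier:
        errors.append("VTR requires a diseaseContextQualifier")
    if a.va_type in ("VAR", "PGx") and not a.predicted_response_qualifier:
        errors.append(f"{a.va_type} requires a predictedResponseQualifier")
    return errors
```

In a split model each rule would live on its own type, which is the enforcement benefit described in (C); in a lumped model the same checks have to branch on the type, as this function does.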

ahwagner commented 5 years ago

Points we want to capture:

@dsonkin suggested that when modeling the therapy object, we want to consider capturing recommended usage and toxicity (as applicable) information.

I think there is a separate issue for the associated with / confers / predicts / etc. part of the predicate, but wanted to note that an "N/A" predicate should be added for when an assertion describes the variant as not informing clinical action.

mbrush commented 5 years ago

On the March 6 call we made great progress in defining scope, and a pragmatic split between general therapeutic response and more pharmacogenomic/side effect-related annotations. Some specific outcomes are below.

Outcomes:


For next call: Evaluate/refine proposed split model for the VTR annotation above, specifically:

arpaddanos commented 5 years ago

@DavidTamborero "My approach is to use the same term (e.g. 'response') and just state different levels of evidence (pre-clinical, case report, early clinical trial, late clinical trial, guidelines etc)". This makes sense to me, and is also essentially exactly what we do in CIViC. We use sensitivity/response as a single term and label each evidence item (EID) with very similar evidence levels as the ones you outlined. Dienstmann is one who also made those distinctions between sensitivity and response that you mention. And in attempting to adopt some of his proposed guidelines (PMID 24768039) we came to the decision to just adopt a combined term sensitivity/response (30311370) in an update to the predictive/therapeutic data structure. Also @ahwagner mentioned further defining when these terms are used (preclinical vs clinical) and that would be a good idea. For me the ideas on how these things can be implemented just evolved while curating. A priori I found it hard to anticipate how they might be implemented. Actually Dienstmann intended reduced sensitivity to only be used in preclinical settings if I recall correctly. But then it turned out "does not support reduced sensitivity" worked remarkably well for clinical noninferiority when I came across that example while curating.

arpaddanos commented 5 years ago

Some examples one could look at:

  1. EGFR L858R NSCLC erlotinib
  2. EGFR L861Q NSCLC erlotinib
  3. BCR-ABL1 T315I CML dasatinib
  4. BCR-ABL1 M351T CML dasatinib
  5. EGFR L858R NSCLC erlotinib vs gefitinib

mbrush commented 5 years ago

Outcomes/Open Questions from March 20 Call:

dsonkin commented 5 years ago

In most cases variant in question would be predicting reduced efficacy relative to gene without such variant. For example reduction of sensitivity to ABL1 inhibitor (imatinib, nilotinib, etc.) in CML with BCR-ABL1 fusion with variant in comparison to CML with BCR-ABL1 fusion without variant.

mbrush commented 5 years ago

Thanks @dsonkin. For "predicts reduced sensitivity" assertions then, we are asserting that the drug still shows a therapeutic effect, but it is not as significant as when the variant is absent.

To be clear here, the 'absence of the variant' could mean a WT gene, or that some other variation affects the gene - just not the annotated one (e.g. comparing sensitivity of a BCR-ABL fusion + M351T vs sensitivity of BCR-ABL without this additional variant). The point is that there is some other WT or variant state where sensitivity has been demonstrated, and this new genetic state shows a relative reduction in sensitivity to the same drug for the same disease (i.e. still some sensitivity, but not as much as without that particular variant).

If this is the case, should the model include a qualifier to capture this comparator (i.e. the genetic state that the effect is reduced relative to)?

It would be nice to see some examples of such annotations. A search in CIViC reveals this list, which could be a starting point for an expert to find some good examples. (e.g. this one).

arpaddanos commented 5 years ago

The original impetus for reduced sensitivity comes from Dienstmann. At the first CIViC curation workshop he did talk some about this. I did not fully follow the subtleties at the time but those ideas I believe are also laid out in this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5528527/ and the section discussing reduced sensitivity:

In order to give unbiased and more detailed information with regard to the magnitude of the biomarker‐related drug effects, we expanded the typical binary classification of responsive and resistant. In the advanced clinical setting, effect is outlined as responsive, resistant or not responsive (when an expected responsive effect is not observed). In preclinical models, biomarker–drug associations are graded as sensitive, reduced sensitivity or resistant. Examples are seen in Figure 3 and Table 3.

In Table 3, one has the following as an example for this type:

EGFR | Exon 20 | Insertion | Lung | Reduced sensitivity | erlotinib, afatinib | Emerging | Preclinical | 21764376

where the pubmed ID is the last field. Looking at that reference, it is a review that discusses resistance of the variant to the drugs in NSCLC. From the abstract:

Preclinical models have shown that the most prevalent EGFR exon 20 insertion mutated proteins are resistant to clinically achievable doses of reversible (gefitinib, erlotinib) and irreversible (neratinib, afatinib, PF00299804) EGFR TKIs. Growing clinical experience with patients whose tumours harbour EGFR exon 20 insertions corresponds with the preclinical data; very few patients have had responses to EGFR TKIs.

So in this case reduced sensitivity does not seem strongly distinguished from resistance.

In CIViC, we have been using it in multiple ways. For previously mentioned non-inferiority it is a comparison of two different drugs for the same variant disease (this can support or not support reduced sensitivity depending on outcome). But we have instances also where we use it for a variant that seems to be less sensitive to a given drug in a given cancer type in comparison to the standard (likely approved) variant targeted by that drug in that cancer type. The way the statement is being used should be quite apparent from reading the EID. But with this data type it can require reading the EID, it is not just fully baked into the structure of the fields in the structured data model.

DavidTamborero commented 5 years ago

Sorry for jumping in, but since I worked with Rodrigo for developing the biomarkers database of the CGI: reduced sensitivity in his original model refers to a drug resistance biomarker described in the pre-clinical setting (i.e. resistance and reduced sensitivity are equivalent terms but the latter used to distinguish experimental vs clinical studies) Note that we actually removed that term in CGI

hope it helps br d

mbrush commented 5 years ago

Hi all. Seems like we still need to resolve the issue of what is meant by 'reduced sensitivity to' in our predicate value set. You all have provided some very nice insights and examples that suggest a few different ways this term is being used.

  1. patients show reduced sensitivity to drug x compared to drug y, in the context of the same variant and disease.
  2. patients show reduced sensitivity to drug x when they have variant 1 compared to variant 2 (which often builds on variant 1 in some way, e.g. BCR-ABL vs BCR-ABL + M351T), in the context of the same disease.
  3. same meaning as 'resistance', but in a pre-clinical setting (Dienstmann)

We need to decide which relationships we want to capture/distinguish, and whether to create separate predicate terms for each of them or to lump some together. We also need to decide if/how to capture the 'comparator' for these types of annotations (e.g. an additional qualifier to capture the treatment or variation relative to which a reduced response is observed).

@arpaddanos @ahwagner @DavidTamborero @dsonkin @javild please confirm that I have summarized things correctly, based on all of your helpful comments above. And share your thoughts here or on the next VA call. We need to resolve this soon. Thank you!
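One way to picture the 'comparator' qualifier is the Python sketch below. It covers usages 1 and 2 from the list above (usage 3 needs no comparator). All names here (`ComparatorKind`, `Comparator`, `ReducedSensitivityStatement`, `describe`) are hypothetical, and "drug x" / "drug y" are the placeholders from the list, not real annotations.

```python
from dataclasses import dataclass
from enum import Enum


class ComparatorKind(Enum):
    THERAPY = "therapy"              # usage 1: drug x compared to drug y
    GENETIC_STATE = "genetic_state"  # usage 2: variant state compared to another state


@dataclass(frozen=True)
class Comparator:
    kind: ComparatorKind
    value: str


@dataclass(frozen=True)
class ReducedSensitivityStatement:
    variant: str
    therapy: str
    disease: str
    comparator: Comparator           # what the reduction is measured relative to


def describe(s: ReducedSensitivityStatement) -> str:
    """Render the statement so that the comparator is explicit."""
    if s.comparator.kind is ComparatorKind.THERAPY:
        return (f"{s.variant}: reduced sensitivity to {s.therapy} "
                f"compared to {s.comparator.value} in {s.disease}")
    return (f"{s.variant}: reduced sensitivity to {s.therapy} in {s.disease} "
            f"compared to {s.comparator.value}")
```

Tagging the comparator with its kind lets both usages share one predicate while remaining distinguishable; separate predicate terms would be the alternative.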

DavidTamborero commented 5 years ago

Regarding point 3, my current thought for the data model we're developing here is that I would not use different terms to distinguish the same effect in a clinical vs preclinical setting (I'd rather use the same effect term, as the setting info is already captured in an additional field)

dsonkin commented 5 years ago

I think we should not have different ways of interpreting 'reduced sensitivity' in clinical vs pre-clinical settings. Based on that I would recommend to remove point 3.

mbrush commented 5 years ago

I agree with this @DavidTamborero and @dsonkin. But still leaves questions about points 1 and 2 - are one, both, or neither of these things important to capture with our model?

GideonGiao commented 5 years ago

I am not sure if I can join the call today, so just two comments. From my perspective being involved in cancer/genomics-research and also being part of a medical informatics initiative:

Number 1 is in my opinion not possible yet, because it implies that there are clinical trials or other experiments where these drugs are compared directly.

Number 2 - does this imply linking an attribute like 'reduced sensitivity' to several mutations by a kind of logical AND. This I think could be interesting. Otherwise I think the comparison of 'reduced sensitivity' is always to patients without the variation, which of course does not imply the gene (locus) has no other variants. This kind of uncertainty is hard to reflect in an annotation.

I have to admit I didn't read the complete thread so my apologies in advance for not being 100% into the topic.

DavidTamborero commented 5 years ago

i agree with gideon, I do not know any context in which the (1) would be useful

the (2) is the one that makes more sense to me, and also as Gideon says it can be used in the context of having versus not having the mutation (the response to the drug is lower in the presence of the variant, but there is still response --so it's not a biomarker of resistance--)

hope it helps d

mbrush commented 5 years ago

I did a quick spot check of a few examples from the results of a search in CIViC for records where 'reduced sensitivity' was reported (link), and found what seem to be examples of both scenario 1 and 2:

Scenario 1 (reduced response for drug 1 vs drug 2, i.e. non-inferiority?):

  1. https://civicdb.org/events/genes/19/summary/variants/33/summary/evidence/6183/summary#evidence (erlotinib vs gefitinib)
  2. https://civicdb.org/events/genes/19/summary/variants/133/summary/evidence/6184/summary#evidence (erlotinib vs gefitinib)
  3. https://civicdb.org/events/genes/20/summary/variants/875/summary/evidence/7064/summary#evidence (chemotherapy plus ramucirumab vs chemotherapy plus trastuzumab)

Scenario 2 (reduced response for variant 1 vs no variant 1):

  1. https://civicdb.org/events/genes/4/summary/variants/1029/summary/evidence/4394/summary#evidence (BCR-ABL vs BCR-ABL + M351T)
  2. https://civicdb.org/events/genes/29/summary/variants/509/summary/evidence/4139/summary#evidence (KIT exon 9 mutations vs KIT exon 11 mutations or KIT WT . . . two comparators?)
  3. https://civicdb.org/events/genes/4/summary/variants/1023/summary/evidence/4314/summary#evidence (BCR-ABL vs BCR-ABL + G250E)

I would want an expert to confirm these interpretations, but if my understanding of these is correct, then 'reduced sensitivity' can mean either scenario 1 or 2 in CIViC. We should consider if we want to make a recommendation to formally distinguish between the two.

mbrush commented 5 years ago

Consensus on the 5-22-19 call was that the focus of 'reduced sensitivity' interpretations should be on scenario 2. The handful of records in CIViC that describe scenario 1 are probably cases that should not be treated in this way.

Action Items:

ahwagner commented 5 years ago

The CIViC group discussed this at length today. We agree that the use of reduced sensitivity should be limited to the use case presented in scenario 2. There are a sizeable number of preclinical studies in queue that are of similar structure to the publications in scenario 1. We are going to curate these using an alternate strategy in concert with the VA group decision here. The existing evidence in CIViC will likely be revised to statements supporting sensitivity to each drug as substitutes.

mbrush commented 5 years ago

Genomics England value set for drug response classification terms here may inform this model. @javild can you dig up some example data using these to share so we can better understand usage of these terms in the Genomics England data?

My take on their relevance for the generic Therapeutic Response VA type we are prioritizing:

mbrush commented 4 years ago

When we come back to revisit expanding our modeling here to cover other categories of therapeutic response more explicitly, the PharmGKB website FAQ (and other pages there) has some good info and links, e.g. w.r.t. consideration of the disease as a component of the statement:

"A pharmacogenetic test is a type of genetic test. A pharmacogenetic test attempts to predict how a person will respond to a drug. It might be a test that has nothing to do with a disease at all, but can provide information about how quickly the body breaks down the drug or if the patient has a risk of a bad reaction to a drug. This information can help the patient’s doctor to change the dosage of the drug or even use a different drug for that patient.

Pharmacogenetic tests can sometimes relates to the person's disease profile, e.g. for the cancer drug Herceptin, a drug that works on tumors that have a particular biomarker, a pharmacogenetic test is performed to assess if the tumor will be attacked by the Herceptin drug."

https://www.pharmgkb.org/page/faqs#how-is-a-pharmacogenetic-test-different-from-a-genetic-test