cbizon closed 9 years ago
My vote (FWIW) is model assertions.
There seems to be a lot of pressure building on the GenomeConnect and case-level repository front. We should also recognize that the EHR WG is starting to ramp up, and they will most definitely need a way to represent case data. The phenotype WG has been attempting to work with ClinVar to find a way to get case-level phenotypic info into the submission form to support the metabolic disorders clinical domain WG (not an easy task to do in isolation). So, I really feel that we need to take a stab at Case-Patient-Indication-Phenotype-Results (Findings and Interpretation) and show how it links into the Allele model. It would also provide value to the IoM Action Collaborative, which is about to embark on an effort to define a structured representation of genotypes for pharmacogenetic test results and how they map to star alleles. That effort supports a pilot project where labs will send this structured data to provider EHR systems so that it can be used for clinical decision support. This seems like a highly visible and very valuable place to influence and expose our allele model work (it should definitely help us validate that we produced something worthy of consideration in the final solution).
So I would like to add 8) middle in: lab test results for genetic sequencing tests that contain structured indication, phenotypic observations provided by requester or lab, indication findings (which include indication related genotypes with allele specific assertion for the related indication), and (potentially) incidental findings (which includes genotypes and their allele specific assertions for unrelated indications).
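To make item 8 a bit more concrete, here is a rough Python sketch of how such a test result might hang together. Every class, field, and function name below is a placeholder invented for illustration; none of it comes from the allele model or any agreed ClinGen design, and the values in the example are dummies. It only shows the shape: a structured indication, phenotypic observations, indication findings whose allele-specific assertions relate to that indication, and optional incidental findings whose assertions relate to other indications.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AlleleAssertion:
    allele_id: str   # hypothetical reference to an allele in the allele model
    indication: str  # the indication this assertion is about
    assertion: str   # free text here; a real model would likely use coded values


@dataclass
class Genotype:
    label: str
    assertions: List[AlleleAssertion] = field(default_factory=list)


@dataclass
class SequencingTestResult:
    indication: str  # structured indication for the test
    phenotypic_observations: List[str] = field(default_factory=list)
    indication_findings: List[Genotype] = field(default_factory=list)
    incidental_findings: Optional[List[Genotype]] = None  # "potentially" present


def misfiled_assertions(result: SequencingTestResult) -> List[AlleleAssertion]:
    """Return assertions that sit in the wrong bucket.

    Indication findings should only carry assertions about the test's
    indication; incidental findings should only carry assertions about
    other indications.
    """
    misfiled = []
    for genotype in result.indication_findings:
        misfiled += [a for a in genotype.assertions
                     if a.indication != result.indication]
    for genotype in result.incidental_findings or []:
        misfiled += [a for a in genotype.assertions
                     if a.indication == result.indication]
    return misfiled


# Dummy example; all identifiers and values are placeholders.
example = SequencingTestResult(
    indication="indication-A",
    phenotypic_observations=["phenotype-1", "phenotype-2"],
    indication_findings=[
        Genotype("genotype-1", [
            AlleleAssertion("allele:placeholder-1", "indication-A", "placeholder"),
        ]),
    ],
    incidental_findings=[
        Genotype("genotype-2", [
            AlleleAssertion("allele:placeholder-2", "indication-B", "placeholder"),
        ]),
    ],
)
```

The check in `misfiled_assertions` is just one way to express the indication-related versus unrelated split described above; whether that split belongs in the model or in validation logic is an open design question.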
High risk, high reward! I vote for this FWIW. There's urgency here, so it may make sense to do the assertion stuff first. It would be nice to at least do a basic draft of this model and then dive into assertions in detail. This way we could show a roadmap to external groups and such.
From: cbizon
Date: Thursday, March 5, 2015 3:18 PM
Subject: [clingen-data-model] Next Steps for the DMWG (#46)
With the release of the 0.1 version of the allele data model imminent, we need to open a discussion on the next effort for the work group. We know that there will already be effort put towards supporting the ClinGenDB's use of the model, which will be the first priority, but what else should we be doing?
1) Polishing / documentation of allele 0.1
2) Creation of a reference implementation for allele 0.1
3) allele 0.2: incorporation of structural variants, other stuff
4) bottom up: model population data (allele frequency information) as a simple model that would make use of the allele model
5) bottom up: individual data model (genotypes, family information, phenotyping)
6) bottom up: model assertions - this would probably involve e.g. also doing 4
7) top down: broad modeling effort, trying to get everything in at a lower detail level
Others to consider?
Whatever we do next, my sense is that it should connect directly to one of the current IT demands of ClinGen. The big ones I'm aware of right now (apart from the allele registry) are:
Getting involved with any of these has issues of one kind or another:
The conclusion I reach in thinking about this is that we need more strategic thinking about which working groups have which dependencies on each other, and what goals each needs to clear to meet certain milestones. I'm gonna guess this won't happen overnight. Given that, it might make sense for us to divide and conquer again, with each of us engaging in one (or more) of the working groups involved in the above activities and getting to understand what their modeling requirements really are, so we can ramp up quickly when there's a project that's ready to be built (though my sense is that the case level database is going to be the first thing to show up as a priority).
I can't think of anything that has not already been mentioned atm. I almost think we can't get away from doing 1, 2 and 3 (from Chris's list above), even if it's only 25-50% of our committed time. I'm a bit torn between assertions and GenomeConnect, because I like the idea of extending the model into more of the EHR-type realm, but I think the GC opportunity would give us the ability to show that the model holds up using clinical data.
In the end I would cast my vote as a yea for assertions, because it seems to be the most related to what we've done, and I think the group works well with an agreed-upon goal.
I can see the interest in doing assertions next, not least because we've brushed against it in the past. One of the concerns I have is that it's really a moving target now. The general curation working groups are in the middle of their process, and they haven't (to my knowledge) made contact with the clinical domain working groups, which is likely to change the model and process even further.
There's every reason for us to be engaged with these processes, I'm just not sure whether we'll be able to produce a data model before their development work progresses further. It might be premature to focus the entire group on creating an assertion model right now.
While the curation groups' plans are not yet final, I hope that you are overestimating the degree to which they are moving (though I have not been on the calls, so take this for what it is worth...)
If we started assertions, I think it would be entirely valid to treat the published ACMG guidelines as the primary use case, while keeping in mind the ideas that
A) Those guidelines are not the only ones that could be used in allele curation
B) Assertions will also be on entities other than allele-pathogenicity
I'd be very surprised if the groups ended up very far from a model constructed in that way.
One other point: on the way to assertions, we're actually going to need a (scaled back) individual/case.