Clinical-Genomics / scout

VCF visualization interface
https://clinical-genomics.github.io/scout
BSD 3-Clause "New" or "Revised" License
150 stars, 46 forks

Update GT calls for cancer variants #1245

Closed: moonso closed this issue 8 months ago

moonso commented 5 years ago

It would be nice to get input from both @hassanfa and @bjhall on this one

moonso commented 5 years ago

We have a fairly big problem: the cancer view currently uses the same template as the rare disease variants. If we want to update and change many things, these templates will need to diverge, and that has to be done intelligently so we don't end up maintaining duplicated code. There are already a number of similar variant templates that do very similar things. What's your input on this, @dnil, @hassanfa, @bjhall, @northwestwitch?

dnil commented 5 years ago

My 5¢:

The three obvious main routes are: 1) keep running with "if cancer" clauses in the views, moving the domain-specific parts into subroutines / macros as we go along; 2) refactor; or 3) split the views entirely, and similarly aim to wrap the common components in a central place.

The advantage of (1) would be that we can keep going with small increments. I'd say this is the default state.
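To make route (1) concrete, here is a minimal sketch in plain Python rather than in Scout's actual templates. All function names, dictionary keys, and the `"cancer"` track label are hypothetical and only illustrate the idea of one shared renderer with a single track-specific branch and the domain-specific parts pushed into small helpers.

```python
# Hypothetical sketch: none of these names are taken from Scout itself.

def common_fields(variant):
    """Fields shown for every track."""
    return [variant["chrom"], str(variant["pos"]), variant["ref"], variant["alt"]]


def cancer_fields(variant):
    """Fields only the cancer view needs, e.g. tumor allele frequency."""
    return ["AF={:.2f}".format(variant["tumor_af"])]


def rare_disease_fields(variant):
    """Fields only the rare disease view needs, e.g. the genotype call."""
    return ["GT=" + variant["genotype"]]


def render_row(variant, track):
    """Shared row renderer with one track-specific branch ("if cancer")."""
    fields = common_fields(variant)
    if track == "cancer":
        fields += cancer_fields(variant)
    else:
        fields += rare_disease_fields(variant)
    return " | ".join(fields)
```

With this shape, each new cancer-only column is a small change inside `cancer_fields`, which is what keeps the increments small.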

I'd advise against a refactor two days before midsummer, but we could plan it. If we think we will have sufficient time to do it, and have the opportunity to get @bjhall and @hassanfa directly involved, this could be the fastest and most productive approach.

hassanfa commented 5 years ago

I have not properly familiarized myself with Scout yet, but these are my thoughts:

  1. Our tracks are clear: Cancer/Rare-Disease. IMHO it is easier to use "if cancer" for the cancer-specific ones.
  2. AF/VF/AD/VD and GT: we might need to refactor these, because they are not cancer-specific tags.
  3. For variant callers, how about enforcing an INFO field tag in the VCF? If that tag exists, pick the callers up and add them. Again, this would mean refactoring.
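As a rough illustration of point 3, the sketch below parses a VCF INFO column and picks up a caller list only when the tag is present. The tag name `CALLERS` and both function names are assumptions made for this example; they are not an existing VCF, BALSAMIC, or Scout convention.

```python
def parse_info(info_column):
    """Parse a VCF INFO column into a dict; flag entries map to True."""
    info = {}
    if info_column == ".":
        return info
    for entry in info_column.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


def get_callers(info, tag="CALLERS"):
    """Return the callers listed under `tag`, or [] when the tag is absent.

    `CALLERS` is a made-up tag name used only for this sketch.
    """
    value = info.get(tag)
    if value is None or value is True:
        return []
    return value.split(",")
```

Because a missing tag simply yields an empty list, VCFs that do not carry the tag would load unchanged.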

I'd be delighted to join, after midsummer though 😄

moonso commented 5 years ago

OK, nice thoughts. It seems like we go with the "if cancer" approach until it gets too much; we can deal with potential problems later. @hassanfa, do you have a small VCF with non-sensitive data that includes the annotations described in these issues? The more the better. Could you mail me that so I can start working on it?

hassanfa commented 5 years ago

A VCF example of BALSAMIC's output can be found here https://github.com/hassanfa/VCFmerge/blob/master/tests/data_result/

The final file is: https://github.com/hassanfa/VCFmerge/blob/master/tests/data_result/mutect_vardict_strelka_concant.merged.sort.annotate.pair_orient.depth.vcf

szilvajuhos commented 5 years ago

I made some changes that ignore the genotype for cancer samples. I managed to load a Strelka somatic SNVs file with no genotypes, but I am not sure this is the way forward: https://github.com/Clinical-Genomics/scout/compare/master...szilvajuhos:ignore_genotype
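For readers unfamiliar with the problem: Strelka somatic SNV records carry per-sample fields such as `DP:FDP:SDP:SUBDP:AU:CU:GU:TU` and no `GT` key. The sketch below is not taken from the linked branch; it is a hypothetical illustration, with made-up function names and track labels, of tolerating a missing genotype only for cancer samples.

```python
def parse_sample(format_column, sample_column):
    """Pair the FORMAT keys with one sample's colon-separated values."""
    keys = format_column.split(":")
    values = sample_column.split(":")
    return dict(zip(keys, values))


def genotype_call(sample, track):
    """Return the GT value, tolerating its absence for cancer samples."""
    gt = sample.get("GT")
    if gt is not None:
        return gt
    if track == "cancer":
        # No genotype in the record; depth and allele counts remain available.
        return None
    raise ValueError("GT is required for non-cancer samples")
```

Returning `None` rather than inventing a genotype string keeps the absence explicit, so downstream code has to decide how to display it.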

dnil commented 8 months ago

OK, as far as I know this is no longer something we have issues with, but feel very free to reopen if I am delusional here! 😸