broadinstitute / seqr

web-based analysis tool for rare disease genomics
GNU Affero General Public License v3.0

gCNV new data tags in seqr #2360

Closed Glemiret closed 2 years ago

Glemiret commented 2 years ago

This was discussed at our analysts meeting on Dec 16. We are available to discuss this with you if needed.

For the next gCNV exome data upload:

  1. Annotate each CNV variant with the following tags in seqr:

    • “Identical call in previous callset” if feature 38 is true: (identical_in_round1 - TRUE if the variant has an identical call in the round 1 callset, with the exact same coordinates, for the same sample and type)
    • “Overlapping call in previous callset” if feature 40 is true (any_ovl_in_round1 - TRUE if the variant has at least 1 bp overlapped call in the round 1 callset, for the same sample and type). If possible, can you add a hover over text (or some other means) to see what the previous coordinates were?
    • “No overlap in previous callset” if feature 41 is true (no_ovl_in_round1 - TRUE if the variant has no overlapped call in the round 1 callset, for the same sample and type)
  2. Allow feature 41 (i.e. CNV with no overlap in previous callset) to be used as a filter in seqr search in all CMG cohorts that have exome CNV calls. We recommend that this filter option appear as “New” in the “annotations” section, with hover-over text saying “CNV with no overlap in previous callset”.

  3. Add a tag for variants in previous callset that are not present in the current callset and name them “Not in current callset, but present in previous callset”.
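
As a rough illustration of the tag rules in item 1, here is a minimal sketch that maps the three per-sample flags to a tag label. The flag names (`identical_in_round1`, `any_ovl_in_round1`, `no_ovl_in_round1`) come from the request above; the function name, the dict-shaped input, and the precedence (identical checked before overlapping, since an identical call also overlaps) are assumptions for illustration and not seqr's actual implementation.

```python
from typing import Optional


def previous_callset_tag(call: dict) -> Optional[str]:
    """Return the tag label for one sample's CNV call, or None if no flag is set.

    `call` is a hypothetical dict holding the per-sample callset flags.
    """
    # Assumed precedence: an identical call also overlaps, so check it first.
    if call.get("identical_in_round1"):
        return "Identical call in previous callset"
    if call.get("any_ovl_in_round1"):
        return "Overlapping call in previous callset"
    if call.get("no_ovl_in_round1"):
        return "No overlap in previous callset"
    return None
```

For example, a call with both `identical_in_round1` and `any_ovl_in_round1` set would get the "Identical" label under this assumed precedence.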

hanars commented 2 years ago

For part 1, when you say "annotate", what do you actually mean? How should this be shown to the user?

For part 3, I'm not really sure what this means. These variants will not show up in search results anymore, so I assume you are referring to wanting to somehow update the already saved variant to indicate that it has been removed? This would involve completely overhauling how we sync saved variants, and I would want there to be analogous behavior for SNPs. I think this part is out of scope for this ticket. If this is important, please make a separate ticket for it, and indicate whether or not this feature should block loading gCNV for projects (it will probably add a week or two).

Glemiret commented 2 years ago

For part 1, we meant by "annotate" to add a tag next to the variant on the variant page.

Part 3 is not essential

hanars commented 2 years ago

I'm sorry, but can you be more specific? The word "tag" means lots of things and I don't know what it means to be "next to the variant". Comparing to what's already on the variant page generally works best when asking for new features. So something like "add a tag like the red constraint tag shown under genes and add it under the coordinates and over the seqr/pubmed/gnomad links" or maybe "add a tag like the yellow warning triangle shown on defragged variants next to the genotype and add it next to the consequence and number of exons".

Glemiret commented 2 years ago

We did not really mind where the tag is. If you prefer to have a specific location suggested, we can go with: add the tag under the genomic coordinates and under the seqr/pubmed/gnomad links right to the "classify". Thanks

hanars commented 2 years ago

And to clarify: for every single variant, you want it to have a tag that clearly shows whether it was previously identically called, part of an overlapping previous call, or a new call? You don't want to just have the absence of a tag mean something (i.e. don't show a tag for new calls and therefore assume anything without a tag is new)?

Glemiret commented 2 years ago

I'll ask the analysts

hanars commented 2 years ago

You also still have not provided any information about what the tag looks like. Like I said previously, the word "tag" can mean a lot of different things to different people. When submitting new feature requests, you need to provide details of what you want it to look like, ideally comparing it to a "tag" that already exists. So "like the tags we show for gene lists" or "like the tag we show for clinvar pathogenicity" or "like the tag we show for defragged variants".

Glemiret commented 2 years ago

You can make it like the tags we show for gene lists. Since the full text will not fit in the tag (e.g. "Identical call in previous callset"), the tag can show only the first word (Identical, Overlapping, or No overlap), with the rest shown when you hover over the tag.

Glemiret commented 2 years ago

Anne would like to keep the tag for new CNVs, if that is okay with you.

hanars commented 2 years ago

Would it be okay to show the tag next to the genotype instead? As I think about this, whether a call is new or not is actually sample-specific, so it makes more sense to show it there. Alternatively, we could figure out a logic for showing it on the entire variant (i.e. if it's new for any sample say "new", otherwise if it's identical for any sample say "identical"... or would you want it to be identical for every sample to say "identical"?)

hanars commented 2 years ago

For searching, do we want it to return SVs that are new calls for any sample? Or would we want to specifically filter for new calls for the specific samples in that family?

Glemiret commented 2 years ago

I discussed with Anne. Regarding your first comment: great idea to make the tag sample-specific and to put the tag below each individual's genotype. Can you add a tag for each sample even if the tag is the same for all family members? I think it is easier to have the same display across different families, and this tag will only be for CNVs, not SNVs, so it should not be annoying.

Regarding the search filter, we planned to search for new CNV calls only in affected individuals. We thought we would do a project-based search (include all families) selecting for example:

hanars commented 2 years ago

So the tag is not necessarily the same for all family members; in the data we receive, it is per-sample. So it's quite easy to do.

And that makes sense for search, thanks!
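
The search behavior discussed above (return CNVs that are new calls in affected individuals) could be sketched roughly as follows. This is only an illustration under stated assumptions: the variant and genotype dict shapes, the `affected` key, and both function names are hypothetical, and only the `no_ovl_in_round1` flag name comes from the original request. It is not how seqr's search is actually implemented.

```python
from typing import Iterable, List


def has_new_call_in_affected(genotypes: Iterable[dict]) -> bool:
    """True if any affected individual has a call with no overlap in the previous callset."""
    return any(
        gt.get("affected") and gt.get("no_ovl_in_round1")
        for gt in genotypes
    )


def filter_new_cnvs(variants: Iterable[dict]) -> List[dict]:
    """Keep only variants that are new calls in at least one affected individual."""
    return [
        v for v in variants
        if has_new_call_in_affected(v.get("genotypes", []))
    ]
```

Under this sketch, a CNV that is new only in an unaffected family member would not be returned.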

Glemiret commented 2 years ago

Perfect, thank you!

hanars commented 2 years ago

These have been added and will be rolled out as we reload all the projects. If you have any issues with how it behaves or looks, please feel free to submit another issue.