Open jerome-f opened 1 year ago
Hi,
I am not sure I understand what your question is. The issue title mentions "multiple masks", but in the text you are talking about multiple annotations per variant/gene pair.
Can you give a detailed example of the use-case you have (e.g. annotations for the variant/gene pair and set of masks you want to evaluate)?
Cheers, Joelle
Hi Joelle,
The use case in the above scenario would be:
Annotations:
rsid1 gene1 missense
rsid1 gene1 splice_variant
rsid1 gene1 conserved
rsid2 gene1 plof
rsid1 gene2 splice_variant
rsid2 gene2 missense
setlist:
gene1 rsid1,rsid2
gene2 rsid1,rsid2
maskdef:
plof plof
plof_splice plof,splice
plof_miss plof,missense
splice splice
conserved conserved
The masks plof_splice and plof_miss will conflict, as the same variant can have two different consequences for different genes. (I am not certain how frequently this happens in dbNSFP.)
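To see which variant/gene pairs are affected, a small check along these lines could be run on the annotation file before handing it to regenie. This is only a sketch: it assumes a whitespace-delimited file with variant, gene and annotation in the first three columns, and the function name `find_multi_annotated` is made up for illustration (it is not part of regenie).

```python
from collections import defaultdict

# Example annotation rows from this thread: variant, gene, annotation.
ROWS = """\
rsid1 gene1 missense
rsid1 gene1 splice_variant
rsid1 gene1 conserved
rsid2 gene1 plof
rsid1 gene2 splice_variant
rsid2 gene2 missense
"""


def find_multi_annotated(lines):
    """Return {(variant, gene): [annotations]} for pairs with more than
    one distinct annotation. Hypothetical helper, not a regenie function."""
    seen = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        variant, gene, annotation = fields[:3]
        if annotation not in seen[(variant, gene)]:
            seen[(variant, gene)].append(annotation)
    return {pair: anns for pair, anns in seen.items() if len(anns) > 1}


conflicts = find_multi_annotated(ROWS.splitlines())
for (variant, gene), anns in conflicts.items():
    print(variant, gene, ",".join(anns))
```

On the example above, only the rsid1/gene1 pair is reported; rsid1 having different annotations in gene1 and gene2 is not flagged.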
Hi,
The same variant can have different annotations for different genes, but not different annotations for the same gene. I think a workaround would be to use the most deleterious annotation, so that you have a single annotation per variant for each gene. Alternatively, perhaps the 4-column annotation file format (designed initially for protein domains) could be useful here? It allows for different annotations for the same variant in a gene across domains.
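The first workaround can be done as a pre-processing step outside regenie. The following is a minimal sketch, assuming a whitespace-delimited annotation file with variant, gene and annotation in the first three columns. The function name `collapse_annotations` and the severity ordering in `HIERARCHY` are assumptions for illustration only; the ordering should be replaced with whatever ranking fits your annotations.

```python
# Assumed severity ordering, most deleterious first (illustrative only).
HIERARCHY = ["plof", "splice_variant", "missense", "conserved"]

# Example annotation rows given earlier in this thread.
ROWS = """\
rsid1 gene1 missense
rsid1 gene1 splice_variant
rsid1 gene1 conserved
rsid2 gene1 plof
rsid1 gene2 splice_variant
rsid2 gene2 missense
"""


def collapse_annotations(lines, hierarchy):
    """Keep only the highest-ranked annotation for each (variant, gene) pair.
    Hypothetical helper, not part of regenie."""
    rank = {name: i for i, name in enumerate(hierarchy)}
    best = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        variant, gene, annotation = fields[:3]
        if annotation not in rank:
            raise ValueError(f"annotation not in hierarchy: {annotation}")
        key = (variant, gene)
        # A lower rank index means a more deleterious annotation.
        if key not in best or rank[annotation] < rank[best[key]]:
            best[key] = annotation
    return [f"{v} {g} {a}" for (v, g), a in best.items()]


collapsed = collapse_annotations(ROWS.splitlines(), HIERARCHY)
print("\n".join(collapsed))
```

The output has one line per variant/gene pair (pairs stay in order of first appearance) and can be written back out as the annotation file.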
Hi, I recently noticed that regenie allows variants to be part of more than one gene set, but does not accept multiple annotations for the same variant-gene pair, i.e. something like:
rsid1 gene1 missense
rsid1 gene1 splice_variant
rsid1 gene1 conserved
I understand that having duplicate annotations for the same variant-gene pair would lead to double counting. But this could be handled with a user-specified hierarchy, e.g. plof > missense > synonymous: when combining masks, regenie could select one annotation per variant-gene pair based on the provided hierarchy. I can see multiple use cases where this feature might come in handy.