broadinstitute / seqr

web-based analysis tool for rare disease genomics
GNU Affero General Public License v3.0

Variant search using multiple gene lists #2765

Open KaitlynSparks opened 2 years ago

KaitlynSparks commented 2 years ago

Would it be possible to implement a feature that allows a seqr user to search for variants that are in at least one of several gene lists? The current interface allows searching for variants in only one gene list.

hanars commented 2 years ago

Our teams create large master lists for this purpose (containing all the genes that you would want in any single search, often with several permutations of similar lists), which has been an acceptable workaround until now. I will keep this issue open as a future enhancement.

KaitlynSparks commented 2 years ago

Understood, thank you. We will do the same for now.

david-ma commented 2 years ago

Hi Hanars & Kaitlyn,

At the MCRI we have a ticket for this idea too:

Allow users to combine Gene Lists on the fly, at the search page.

This can be done at a higher level, by creating a custom gene list. However that would:

I would use something similar to this chit + dropdown design:

[Image: Screen Shot 2022-06-01 at 12 51 49 pm]

Difficulties:

I think the higher confidence level would be used if a gene had different confidence levels in different lists.

hanars commented 2 years ago

@david-ma there would also be a question of how to handle the case where some of the lists are panel app lists and some are not. I think in that case the solution would be to show the panel app UI but put the non-panel app genes in the grey no confidence box. I agree about the UI you propose. If your group made these changes we would upstream them.
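
The combination rule discussed above (take the union of the selected lists, keep the higher confidence level when a gene appears more than once, and put genes from lists without confidence levels in the "no confidence" group) could be sketched roughly as follows. This is only an illustration, not seqr code: the `Confidence` enum, its red/amber/green ordering, and the `combine_gene_lists` helper are all hypothetical names assumed for the example.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Hypothetical confidence ordering; NONE stands for the grey 'no confidence' box."""
    NONE = 0
    RED = 1
    AMBER = 2
    GREEN = 3


def combine_gene_lists(gene_lists):
    """Union several gene lists, keeping the highest confidence seen per gene.

    Each entry is either a dict mapping gene symbol -> Confidence (a list that
    carries confidence levels) or a plain iterable of gene symbols (a list
    without confidence levels, treated as Confidence.NONE).
    """
    combined = {}
    for gene_list in gene_lists:
        if isinstance(gene_list, dict):
            items = gene_list.items()
        else:
            items = ((gene, Confidence.NONE) for gene in gene_list)
        for gene, confidence in items:
            # Keep whichever confidence level is higher across lists.
            combined[gene] = max(combined.get(gene, Confidence.NONE), confidence)
    return combined


# Example: two lists with confidence levels plus one plain custom list.
panel_a = {"BRCA1": Confidence.GREEN, "TTN": Confidence.AMBER}
panel_b = {"TTN": Confidence.GREEN, "MYH7": Confidence.RED}
custom = ["TTN", "ACTA1"]

combined = combine_gene_lists([panel_a, panel_b, custom])
```

Here `TTN` ends up with the higher of its levels, and `ACTA1`, which only appears in the plain list, lands in the no-confidence group. A search would then match variants in any key of `combined`.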

AlessandroLas commented 9 months ago

> Our teams create large master lists for this purpose (containing all the genes that you would want in any single search, often with several permutations of similar lists), which has been an acceptable workaround until now. I will keep this issue open as a future enhancement

Hi Hana,

As you suggested, we created many gene lists (~13000) based on HPO phenotypes (from hpo.jax.org; genes_to_phenotype.txt table). The problem now is that loading these lists from the postgres database slows down the seqr "Variant Search" page. How did you manage this? Moreover, at my institute we would also like to search for variants using two or more gene lists.

Thanks for your time. Alessandro

hanars commented 9 months ago

Hi Alessandro,

The suggestion I gave was to solve a particular use case your group had. Our group does not run into this issue at all, as that's not really how we use gene lists. We only add relevant gene lists for particular disease areas and do not create so many sublists, so the max any project has is 800 gene lists, and almost every project has fewer than 100.

As I mentioned above, our team would be willing to incorporate the ability to search in multiple gene lists, using the UI agreed upon in the comments above. However, our team does not have the bandwidth to implement this ourselves at this time, as it is low priority for our seqr instance since, like I said, it's not really how we use gene lists. If another group were to submit the code change to support this, we would be happy to incorporate it into the main seqr instance and make it available to all users.

AlessandroLas commented 9 months ago

Thank you Hana.