broadinstitute / genetic-prevalence-estimator

https://genie.broadinstitute.org/
BSD 3-Clause "New" or "Revised" License

Custom list transcript selection #153

Open sambaxter opened 1 year ago

sambaxter commented 1 year ago

I noticed while making one of my custom lists that the tool doesn't always seem to pick the canonical transcript.

I made this list for SLC13A5 and it picked ENST00000293800.6, instead of ENST00000433363.2, which is the canonical. I think we should have it default to the canonical unless the user picks otherwise.

nawatts commented 1 year ago

I wasn't able to reproduce this. It should automatically select the Ensembl canonical transcript (or MANE Select transcript if one is available) when the gene is changed.

https://github.com/broadinstitute/genetic-prevalence-estimator/blob/9940f68a90117ac7c9485c8123fef13d06ce3f17/frontend/src/components/CreateVariantListPage/TranscriptInput.tsx#L66-L106

https://github.com/broadinstitute/genetic-prevalence-estimator/blob/9940f68a90117ac7c9485c8123fef13d06ce3f17/frontend/src/components/CreateVariantListPage/TranscriptInput.tsx#L145-L151
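
For readers without the linked source open, the selection rule described in this comment can be sketched roughly as follows. This is a minimal Python illustration, not the frontend's TypeScript code: the `Gene` shape, its field names, and the `default_transcript` helper are hypothetical and were chosen only for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Gene:
    # Hypothetical shape; the field names are assumptions for illustration.
    symbol: str
    transcript_ids: List[str] = field(default_factory=list)
    canonical_transcript_id: Optional[str] = None
    mane_select_transcript_id: Optional[str] = None


def default_transcript(gene: Gene) -> Optional[str]:
    """Prefer MANE Select when available, otherwise the Ensembl canonical transcript."""
    if gene.mane_select_transcript_id:
        return gene.mane_select_transcript_id
    return gene.canonical_transcript_id
```

With placeholder IDs, a gene that has only a canonical transcript would default to it, while a gene that also has a MANE Select transcript would default to the MANE Select one.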

sambaxter commented 1 year ago

It happens when you don't write in the name and just upload a list of variants. It happened again yesterday when I did this for ABCC8 using the attached file.


--

Samantha Baxter, MS, CGC

Associate Director, Genetic and Genomic Data Sharing

Licensed Genetic Counselor


11-17414635-A-T 11-17414656-A-G 11-17415880-C-T 11-17415881-G-A 11-17415905-G-A 11-17415959-C-T 11-17416718-C-T 11-17416719-C-T 11-17416807-AG-A 11-17416824-T-C 11-17417156-C-A 11-17417157-C-T 11-17417158-G-A 11-17417205-C-T 11-17417399-C-T 11-17417419-C-T 11-17417434-GAGA-G 11-17418468-G-A 11-17418527-C-G 11-17418587-G-T 11-17418602-C-T 11-17418738-A-G 11-17418753-GT-G 11-17418861-C-T 11-17419269-CA-C 11-17419956-TC-T 11-17419984-CAT-C 11-17419989-C-T 11-17424217-C-T 11-17424218-G-A 11-17424293-G-A 11-17426169-A-T 11-17426170-CAG-C 11-17426176-A-C 11-17426217-C-T 11-17428336-C-T 11-17428447-GCAGTTCCTGGCTGCAGGGGT-G 11-17428470-G-GCAGTTCCT 11-17428472-G-GC 11-17428490-C-T 11-17428597-G-T 11-17428605-G-A 11-17428617-C-A 11-17428677-C-T 11-17428685-C-T 11-17428948-G-GCACGAGATAGGCCCTGGGGTGGCTCTGTGGCTTT 11-17428964-G-A 11-17429937-A-G 11-17429962-G-A 11-17432064-C-T 11-17432139-A-AG 11-17434248-G-A 11-17434263-G-A 11-17434939-A-T 11-17436143-A-ACGGGG 11-17448594-A-T 11-17448617-G-C 11-17448667-G-T 11-17448703-T-A 11-17449412-A-G 11-17449417-G-A 11-17449510-C-T 11-17449953-C-G 11-17452359-A-G 11-17452386-G-A 11-17452420-CAA-C 11-17452431-A-AGAGGGAGAGGGAGGC 11-17453787-GA-G 11-17464266-C-A 11-17464266-C-T 11-17464321-G-A 11-17464396-C-T 11-17464431-T-A 11-17464767-G-C 11-17470105-C-T 11-17470110-T-TGAGCTGATTGGTGTCGATGGCAACCAGATTA 11-17470187-GACAGGTGC-G 11-17474664-A-G 11-17474703-GC-G 11-17482066-AG-A 11-17482100-C-T 11-17482118-C-T 11-17483146-GC-G 11-17483210-G-A 11-17485004-A-T 11-17485024-CCCAT-C 11-17485062-G-A 11-17491678-C-T 11-17491729-C-T 11-17496502-C-T 11-17496503-G-A 11-17496570-C-T 11-17498175-C-T 11-17498262-A-T 11-17498269-G-A 11-17498305-C-G 11-17417450-C-T 11-17483263-T-C

nawatts commented 1 year ago

Ooh, for custom lists. I was looking at recommended lists 🤦.

🤔 In that case, it uses the first transcript in the Hail table. https://github.com/broadinstitute/genetic-prevalence-estimator/blob/9940f68a90117ac7c9485c8123fef13d06ce3f17/worker/src/worker/tasks.py#L379

I think the intent here was to mimic the gnomAD browser's transcript selection, but I'll have to look into how those transcripts are being sorted.
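
To illustrate why "first transcript" depends entirely on how the data was sorted upstream, here is a small sketch in plain Python. It does not reproduce the linked `tasks.py` code or use Hail; the record layout, the placeholder transcript IDs, and the `first_transcript` helper are all assumptions made up for the example.

```python
from typing import Any, Dict, List, Optional


def first_transcript(consequences: List[Dict[str, Any]]) -> Optional[str]:
    """Return the transcript ID of the first consequence, or None if there are none."""
    if not consequences:
        return None
    return consequences[0]["transcript_id"]


# Made-up records: the same variant annotated on two transcripts.
canonical_first = [
    {"transcript_id": "TX_CANONICAL", "canonical": True},
    {"transcript_id": "TX_OTHER", "canonical": False},
]
# Same records, but in the opposite order.
other_first = list(reversed(canonical_first))

print(first_transcript(canonical_first))
print(first_transcript(other_first))
```

Taking the first element is only equivalent to picking the canonical transcript if the pipeline guarantees that the canonical transcript sorts first.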

sambaxter commented 1 year ago

That's fair. It's just weird because in gnomAD the canonical is always listed first, so I think people will make assumptions. If we can't fix it, I think we could make the "select transcript" button a little more obvious and tweak the subtext: "If a transcript is unselected, the variant list will show variants' most severe consequence in any transcript, which may not be the canonical transcript. This will not impact frequencies but could impact the HGVS nomenclature displayed. If a transcript is selected, the variant list will show variants' consequences in that transcript."



nawatts commented 1 year ago

Yeah, I think something is off with the transcript sorting in the data pipeline. Looking at a few variants, I would expect the canonical transcript to be selected since they have the same VEP consequence in the canonical transcript as they do in ENST00000293800.6. When there's a tie in consequence like that, the canonical transcript should be preferred.

Also agree with adding that note to the form.
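
The tie-breaking rule described in this comment (most severe consequence first, canonical transcript preferred on a tie) could be expressed as a sort key along these lines. This is only a sketch under stated assumptions, not the data pipeline's actual code: the record fields, the `sort_consequences` helper, and the shortened consequence ranking are hypothetical, and the ranking is an illustrative subset of VEP terms rather than the full list.

```python
from typing import Any, Dict, List

# Illustrative subset of VEP consequence terms, most severe first (assumption:
# a real pipeline would use the complete ordered list).
CONSEQUENCE_RANK = {
    term: rank
    for rank, term in enumerate(
        [
            "stop_gained",
            "frameshift_variant",
            "missense_variant",
            "synonymous_variant",
            "intron_variant",
        ]
    )
}


def sort_consequences(consequences: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort by consequence severity, then put canonical transcripts first on ties."""
    return sorted(
        consequences,
        key=lambda c: (
            # Unknown terms sort after every ranked term.
            CONSEQUENCE_RANK.get(c["major_consequence"], len(CONSEQUENCE_RANK)),
            # False sorts before True, so canonical records win ties.
            not c["canonical"],
        ),
    )
```

With this key, a non-canonical transcript still sorts first when its consequence is strictly more severe, which matches the "most severe consequence in any transcript" behavior mentioned earlier in the thread.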