iobio / clin.iobio

Clin.iobio - Workflow and reporting for iobio variant analysis pipeline

Come up with a consistent set of genes #391

Open AlistairNWard opened 3 years ago

AlistairNWard commented 3 years ago

Currently, iobio uses a combination of RefSeq and GENCODE, while Mosaic uses Ensembl. We need to use HGNC.

mvelinder commented 2 years ago

Steps to reproduce: run clin.iobio with the demo data, jump to the Select phenotypes step, and manually add the gene ARPC3P3. Then go to the Review variants step, which gives an error about the gene not being recognized. I found the gene by scrolling through the dropdown of possible genes, so it was supposedly a gene name that iobio already recognized. I'm sure there are others like this too; I just happened to bump into this one randomly while testing clin.

In gene.iobio, the dropdown suggests the gene name and lets you select it. The gene then shows up in the left menu but is not analyzed, and when you click its name in the left menu you get the "unable to find transcripts" error.

This is the same issue as above, caused by not having a comprehensive gene name set; it was just another random example.

AlistairNWard commented 2 years ago

Yeah, this is a bad error message: the gene isn't unknown, it just doesn't have any transcripts. The second problem is that we allow the gene to be entered and only show an error when you subsequently click on the gene name. We need a better way of handling all this in general. If we don't have transcripts, we can't analyze the gene, but we should probably keep such gene names in the list, with an icon and tooltip explaining that they can't be analyzed without known transcripts.
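
As a rough illustration of the icon-and-tooltip idea, here is a minimal Python sketch. It is not iobio code: `GeneOption`, `build_gene_options`, the tooltip wording, and the `GENE_A`/`TX_A1` placeholders are all hypothetical, and it assumes a gene-to-transcripts lookup is available when the list is built.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class GeneOption:
    """One entry in a gene dropdown (hypothetical structure, not iobio's)."""
    name: str
    analyzable: bool
    tooltip: Optional[str] = None


def build_gene_options(gene_names: List[str],
                       transcripts_by_gene: Dict[str, List[str]]) -> List[GeneOption]:
    """Keep every gene name in the list, but flag the ones with no transcripts."""
    options = []
    for name in gene_names:
        if transcripts_by_gene.get(name):
            options.append(GeneOption(name, True))
        else:
            # Missing key or empty list: keep the gene, mark it as not analyzable.
            options.append(GeneOption(
                name, False,
                "No known transcripts; this gene cannot be analyzed."))
    return options


# GENE_A and TX_A1 are made-up placeholders; ARPC3P3 has no entry in the lookup.
for option in build_gene_options(["GENE_A", "ARPC3P3"], {"GENE_A": ["TX_A1"]}):
    print(option.name, option.analyzable, option.tooltip)
```

Flagging the gene when the list is built, rather than when it is clicked, would surface the problem at selection time instead of at the later Review variants step.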

If there are other sources of transcripts, that would also be a good thing to look at.

mvelinder commented 2 years ago

Not sure about the transcript source in iobio, but there is an Ensembl transcript for this gene here: https://uswest.ensembl.org/Homo_sapiens/Transcript/Summary?db=core;g=ENSG00000256745;r=11:94188449-94188997;t=ENST00000397300

AlistairNWard commented 2 years ago

A solid 1 exon transcript! How often do we update the gff3 file we pull transcripts from?
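
To check how many listed genes have no transcripts in whatever gff3 file is currently in use, something like the Python sketch below could help. It relies only on the standard GFF3 layout (nine tab-separated columns, `key=value` attributes in the ninth, child features pointing at their gene through `Parent`). The function names, the set of gene-level feature types, and the sample lines are assumptions for illustration, not taken from the iobio pipeline.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def parse_attributes(field: str) -> Dict[str, str]:
    """Parse the ninth GFF3 column ('key=value;key=value') into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def genes_without_transcripts(
        lines: Iterable[str],
        gene_types: Tuple[str, ...] = ("gene", "pseudogene", "ncRNA_gene"),
) -> Set[str]:
    """Return names of gene-level features that no other feature lists as Parent.

    gene_types is an assumption; adjust it to the feature types your file uses.
    """
    gene_names = {}              # gene ID -> display name
    children = defaultdict(int)  # parent ID -> number of child features
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        attrs = parse_attributes(cols[8])
        if cols[2] in gene_types and "ID" in attrs:
            gene_names[attrs["ID"]] = attrs.get("Name", attrs["ID"])
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                children[parent] += 1
    return {name for gid, name in gene_names.items() if children[gid] == 0}


# Made-up sample lines: GENE_A has one child transcript, GENE_B has none.
sample = [
    "##gff-version 3",
    "chr1\texample\tgene\t100\t900\t.\t+\t.\tID=gene:G1;Name=GENE_A",
    "chr1\texample\tmRNA\t100\t900\t.\t+\t.\tID=transcript:T1;Parent=gene:G1",
    "chr1\texample\tpseudogene\t2000\t2500\t.\t-\t.\tID=gene:G2;Name=GENE_B",
]
print(genes_without_transcripts(sample))
```

Running this against both the current file and a newer release would show whether genes like ARPC3P3 gain transcripts after an update.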