IGS / gEAR

The gEAR Portal was created as a data archive and viewer for gene expression data including microarrays, bulk RNA-Seq, single-cell RNA-Seq and more.
https://umgear.org
GNU Affero General Public License v3.0

gEAR staging: Ortholog mapping issue with chicken datasets #793

Closed JPReceveur closed 1 month ago

JPReceveur commented 1 month ago

Ran into an issue displaying a profile. From the error messages, it seems to be related to ortholog mapping for these particular datasets.

Steps to reproduce: choose the chick collection and either a gene cart (cochlear hair cells adult) or a single gene (e.g. atoh1).

[Screenshot 2024-07-19 at 1 33 29 PM]

JPReceveur commented 1 month ago

Tagging as low priority. I haven't been able to reproduce this with other chicken datasets, so it might just be a dataset-specific issue.

adkinsrs commented 1 month ago

I've found datasets where the gene ideally should have been found, since the dataset is for the correct organism, but the gene was not in the dataset for some reason.

jorvis commented 1 month ago

Seems like what is needed here is just a better error message then?

beamilon commented 1 month ago

I went through all the datasets in profiles and the error appears in every single chick dataset. So I would change this to a high-priority ticket.

jorvis commented 1 month ago

@adkinsrs Is this a difference in how the code is handling feature mapping in v2? We will not always have mapping between organisms, and in v1 we also don't have orthology files for chicken, since they're not part of the alliance of genomes datasets.

But it seems in v1 rather than putting up an error we just aren't mapping and instead only showing the genes which have the same names. That should be happening here too.

adkinsrs commented 1 month ago

Generally, if the input gene does not have an ortholog to map to, I use the original gene, which will most likely be filtered out of the dataset's gene list. This is a different error entirely, where ortholog mapping did not return a list of genes (it should always return the map or the original), meaning there was probably an error in the script that I need to correct for and handle.
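
A minimal sketch of the intended behavior described above, not the actual gEAR code. The function names, the dict-based ortholog map, and the example gene symbols are all hypothetical and only illustrate the idea that mapping always returns either the ortholog or the original gene, and that unmatched genes drop out later when filtering against the dataset.

```python
def map_to_orthologs(genes, ortholog_map):
    """Return {input gene: mapped gene}.

    Hypothetical helper: a gene with no ortholog entry maps to itself,
    so the result always has one entry per input gene.
    """
    return {gene: ortholog_map.get(gene, gene) for gene in genes}


def filter_to_dataset(mapped, dataset_genes):
    """Keep only the mapped genes that are present in the dataset."""
    dataset_set = set(dataset_genes)
    return {src: dst for src, dst in mapped.items() if dst in dataset_set}


# Made-up example data, for illustration only.
ortholog_map = {"Atoh1": "ATOH1"}
mapped = map_to_orthologs(["Atoh1", "Pou4f3"], ortholog_map)
kept = filter_to_dataset(mapped, ["ATOH1", "SOX2"])
```

Here `Pou4f3` has no entry in the map, so it is passed through unchanged and then filtered out because the example dataset does not contain it.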

adkinsrs commented 1 month ago

Should be fixed now. If a mapping file is not found, the API will catch the FileNotFoundError and handle gracefully, using the original gene symbols passed in.
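
Roughly, the fallback looks like the sketch below. This is an illustration under assumptions, not the code that was committed: the function names are hypothetical, and the mapping file is assumed to be a JSON object of `{source symbol: target symbol}`, which may not match the real file format.

```python
import json


def load_ortholog_map(path):
    """Load a mapping file (assumed JSON format, hypothetical).

    open() raises FileNotFoundError if the file does not exist.
    """
    with open(path) as fh:
        return json.load(fh)


def map_gene_symbols(genes, mapping_path):
    """Map gene symbols through an ortholog file if one is available.

    If the mapping file is missing, fall back to the original gene
    symbols instead of surfacing an error to the caller.
    """
    try:
        ortholog_map = load_ortholog_map(mapping_path)
    except FileNotFoundError:
        # No mapping between these organisms: use the symbols as given.
        return list(genes)
    # Genes without an entry in the map are also passed through unchanged.
    return [ortholog_map.get(gene, gene) for gene in genes]
```

With this shape, a request against an organism pair that has no orthology file still returns a gene list, so only genes whose names already match the dataset are shown.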