KechrisLab / multiMiR

Development repository for the multiMiR database's R API

Entrez IDs in results table match to multiple genes #55

Open mattbcvs opened 5 months ago

mattbcvs commented 5 months ago

Hi there,

I'm reporting what looks like an error: several gene symbols in my predicted targets match the same Entrez ID.

Entrez ID 801 matches CALM1 (as expected: https://www.ncbi.nlm.nih.gov/gene/?term=801%5Buid%5D), but it also matches CALM2 (seems an error, since NCBI lists CALM2 under 805: https://www.ncbi.nlm.nih.gov/gene/?term=805%5Buid%5D) and CALM3 (seems an error, since NCBI lists CALM3 under 808: https://www.ncbi.nlm.nih.gov/gene/808).

Seems a possible bug to fix?

Thanks, Matt

smahaffey commented 4 months ago

@mattbcvs Thank you. Yes, I have struggled with IDs in each update. At this point each source may be using a different version of the IDs, some of which are quite out of date. As a result, some IDs are the product of several older IDs being merged, and some IDs no longer exist in current annotations.
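To make the merged versus missing distinction concrete, here is a minimal sketch of how an outdated ID could be classified against a current annotation. It is not the multiMiR update code. The function name `resolve_entrez`, the `current_ids` set, the `replaced_by` mapping, and every ID in the example are hypothetical placeholders; in practice the mapping would have to come from whatever ID history the annotation provider publishes.

```python
def resolve_entrez(entrez_id, current_ids, replaced_by):
    """Classify an Entrez ID against a current annotation.

    current_ids: set of IDs present in the current annotation (assumed input).
    replaced_by: dict mapping a retired ID to its successor, or to None when
                 the ID was withdrawn without a replacement (assumed input).
    Returns (status, resolved_id), where status is one of
    "current", "replaced", "discontinued", or "unknown".
    """
    seen = set()
    eid = entrez_id
    # Follow the chain of replacements until a current ID is reached.
    while eid not in current_ids:
        if eid in seen or eid not in replaced_by:
            return ("unknown", None)  # cycle in the history, or no record at all
        seen.add(eid)
        successor = replaced_by[eid]
        if successor is None:
            return ("discontinued", None)
        eid = successor
    status = "current" if eid == entrez_id else "replaced"
    return (status, eid)


# Made-up IDs, for illustration only.
current_ids = {"100", "200"}
replaced_by = {"90": "100", "80": "90", "70": None}

results = {eid: resolve_entrez(eid, current_ids, replaced_by)
           for eid in ["100", "80", "70", "60"]}
print(results)
```

IDs that come back as "replaced" could be rewritten by a script, while "discontinued" and "unknown" are the ones likely to need manual review.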

With the next version, once each source is loaded and the sources that aren't being updated are copied over, I will attempt to find and resolve these types of errors.

If anyone is willing to help beta test and find some of these instances, it would be extremely helpful. As you did here, the most useful reports name specific genes that match but should not. We can track those down and then look for similar instances to resolve.
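For anyone who wants to look for such instances in their own results, a minimal sketch follows. It assumes the results table has been exported to a list of records with `target_entrez` and `target_symbol` fields; those field names and the helper `find_ambiguous_entrez` are assumptions for illustration, and the sample rows only mirror the CALM case reported above plus one unrelated row.

```python
from collections import defaultdict


def find_ambiguous_entrez(rows, id_key="target_entrez", symbol_key="target_symbol"):
    """Return {entrez_id: sorted symbols} for IDs paired with more than one symbol."""
    symbols_by_id = defaultdict(set)
    for row in rows:
        entrez_id = row.get(id_key)
        symbol = row.get(symbol_key)
        if not entrez_id or not symbol:
            continue  # skip rows with a missing ID or symbol
        symbols_by_id[str(entrez_id)].add(symbol)
    return {eid: sorted(symbols)
            for eid, symbols in symbols_by_id.items()
            if len(symbols) > 1}


# Illustrative rows only, not real query output.
rows = [
    {"target_entrez": "801", "target_symbol": "CALM1"},
    {"target_entrez": "801", "target_symbol": "CALM2"},
    {"target_entrez": "801", "target_symbol": "CALM3"},
    {"target_entrez": "7157", "target_symbol": "TP53"},
]

ambiguous = find_ambiguous_entrez(rows)
print(ambiguous)  # {'801': ['CALM1', 'CALM2', 'CALM3']}
```

Posting the flagged ID together with its list of symbols gives enough detail to trace which source the mismatch came from.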

I have been able to develop scripts for the cases where the changes are straightforward, but there may be many more that require manual review before the appropriate changes can be made.