intermine / pombemine


Overlapping features are reported that aren't overlapping #37

Closed ValWood closed 2 years ago

ValWood commented 2 years ago
[Screenshot, 2022-02-24 at 15:51:31]

cdc2 (SPBC11B10.09) doesn't overlap with SPBC11B10.03, SPBC11B10.04c or SPBC11B10.05c.

(Also, the reported overlapping feature lengths are odd.) I suspect this one is a bug, so this might be the wrong tracker.

ValWood commented 2 years ago

You can see the actual organization in our JBrowse here: https://www.pombase.org/gene/SPBC11B10.09

rachellyne commented 2 years ago

@ValWood It's not saying it overlaps the gene; it's saying it overlaps the 10kb downstream or upstream region. The length is the length of that region.

ValWood commented 2 years ago

What is the region though? It isn't an annotated feature. I just can't figure out what it refers to?

ValWood commented 2 years ago

The length is the length of that region.

Do you mean the length is the distance from the feature on the gene summary page?

kimrutherford commented 2 years ago

What is the region though? It isn't an annotated feature. I just can't figure out what it refers to?

There is an automatically created feature in the database for each 10kb upstream and downstream of each gene so that you can make queries like "find the features within 10kb upstream of the genes in my query".

I don't know why the "Overlapping features length" would be 12228 rather than 10000 though.
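As a rough illustration of the kind of query described above, here is a minimal Python sketch of an upstream flanking region and an overlap test. It is not taken from the InterMine code base: the `Feature` type, the helper names, and all coordinates are hypothetical, and the only value from this thread is the 10kb flank size.

```python
from dataclasses import dataclass

FLANK = 10_000  # flank size mentioned in this thread (10kb)


@dataclass(frozen=True)
class Feature:
    name: str
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    strand: int = 1  # +1 or -1


def upstream_region(gene: Feature, flank: int = FLANK) -> Feature:
    """Return the region of `flank` bases immediately upstream of `gene`.

    The gene itself is excluded. Only the chromosome start is clipped;
    the chromosome end is ignored in this simplified sketch.
    """
    if gene.strand >= 0:
        start, end = max(1, gene.start - flank), gene.start - 1
    else:
        start, end = gene.end + 1, gene.end + flank
    return Feature(f"{gene.name} upstream", start, end, gene.strand)


def overlaps(a: Feature, b: Feature) -> bool:
    """True if the two inclusive intervals share at least one base."""
    return a.start <= b.end and b.start <= a.end


# Made-up coordinates, for illustration only.
gene = Feature("geneA", 50_001, 52_000, strand=1)
neighbour = Feature("geneB", 45_000, 46_000, strand=1)
region = upstream_region(gene)

print(overlaps(neighbour, gene))    # neighbour does not touch the gene itself
print(overlaps(neighbour, region))  # but it does fall inside the upstream flank
```

In this sketch a neighbouring feature can be reported as overlapping the flanking region without overlapping the gene, which matches the behaviour described earlier in the thread.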

rachellyne commented 2 years ago

This is the same as #36. It's a post-process step, i.e. these features are calculated after the build. They can simply be removed if you don't want them. I will look into why we are getting duplicates with an odd length, though.

ValWood commented 2 years ago

Once it is displayed correctly I'll have a look and decide whether and how to configure it to make it useful. I might need some input on how it is used and what the numbers refer to.

rachellyne commented 2 years ago

I have remembered the detail: we have two regions for each flank. One is the length of the region, and the other is the length of the region + the gene. This allows searches on overlaps to either include or not include the gene. My memory was a bit hazy, as we used this a lot back in the modMine days.
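The two variants can be sketched in Python as follows. This is an illustrative assumption about the arithmetic, not InterMine's implementation: the function names and gene coordinates are hypothetical, and the gene span of 2228 bases was chosen only because it would yield the 12228 seen earlier in the thread. The real cdc2 coordinates are not given here.

```python
FLANK = 10_000  # flank size mentioned in this thread (10kb)


def downstream_regions(gene_start: int, gene_end: int, flank: int = FLANK):
    """Return (excluding_gene, including_gene) for a plus-strand gene.

    Coordinates are 1-based and inclusive. Each region is a (start, end) tuple.
    """
    excluding_gene = (gene_end + 1, gene_end + flank)
    including_gene = (gene_start, gene_end + flank)
    return excluding_gene, including_gene


def length(region) -> int:
    """Length of an inclusive (start, end) interval."""
    start, end = region
    return end - start + 1


# Hypothetical gene spanning 2228 bases (assumed, not from PomBase).
excluding, including = downstream_regions(100_001, 102_228)

print(length(excluding))  # flank only
print(length(including))  # flank plus gene
```

Under these assumptions the first region has length 10000 and the second 12228, so a reported length above 10000 would be consistent with the "region + the gene" variant.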

danielabutano commented 2 years ago

Resolved