Closed pnrobinson closed 2 weeks ago
For instance, these are the one-based inclusive numbers we get from a typical publication:
- Suppressor domain: 1-223
- IP3 binding: 226-578
- Regulatory/Coupling: 605-2217
- Channel: 2227-2758
If we use UCSC-style coordinates, do we then need to pass, say, 225-578 for the IP3 binding region, given that Region currently uses a zero-based scheme? It also seems that we now need to do this:
from gpsea.analysis.predicate.genotype import VariantPredicates
from gpsea.model.genome import Region
region_pred = VariantPredicates.region(region=Region(start=225, end=578), tx_id=...)
But is there any advantage in exposing the Region class to users? Could we do this instead:
from gpsea.analysis.predicate.genotype import VariantPredicates
region_pred = VariantPredicates.region(start=226, end=578, tx_id=...)
@pnrobinson yes, there is very little advantage in exposing Region, and we should indeed go for 1-based coordinates, since they are less mind-boggling.
The documentation should have a fuller example. Also, we should make this one-based, because 100% of the time users will have a publication, table, or figure with one-based notation. In any case, the documentation needs to say what numbering scheme we expect.
See
https://monarch-initiative.github.io/gpsea/stable/apidocs/gpsea.analysis.predicate.genotype.html#gpsea.analysis.predicate.genotype.VariantPredicates.region
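To make the coordinate question concrete, here is a minimal sketch of the conversion being discussed. It assumes, as the thread suggests, that Region is zero-based and half-open, so a one-based inclusive range from a publication keeps its end and has its start reduced by one. The names ZeroBasedSpan and one_based_to_zero_based are hypothetical and not part of gpsea; the domain boundaries are the ones listed at the top of this issue.

```python
from typing import NamedTuple


class ZeroBasedSpan(NamedTuple):
    """Hypothetical helper type: a zero-based, half-open span [start, end)."""
    start: int
    end: int


def one_based_to_zero_based(start: int, end: int) -> ZeroBasedSpan:
    """Convert a one-based inclusive range to a zero-based half-open span.

    Only the start shifts by one; the end is numerically unchanged.
    """
    if start < 1 or end < start:
        raise ValueError(f"invalid one-based inclusive range: {start}-{end}")
    return ZeroBasedSpan(start - 1, end)


# One-based inclusive domain boundaries as quoted from the publication.
DOMAINS = {
    "Suppressor": (1, 223),
    "IP3 binding": (226, 578),
    "Regulatory/Coupling": (605, 2217),
    "Channel": (2227, 2758),
}

converted = {
    name: one_based_to_zero_based(start, end)
    for name, (start, end) in DOMAINS.items()
}

for name, span in converted.items():
    # The span length (end - start) equals the number of residues in the domain.
    print(f"{name}: start={span.start}, end={span.end}, length={span.end - span.start}")
```

Under this assumption the IP3 binding domain 226-578 becomes start=225, end=578, which matches the values used in the Region call above. A one-based API would do this subtraction internally so that users can paste the published numbers directly.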