GenomicMedLab / cool-seq-tool

https://coolseqtool.readthedocs.io
MIT License

Validate exon offsets in transcript_to_genomic_coordinates() #184

Open jsstevenson opened 1 year ago

jsstevenson commented 1 year ago

Not sure if this was intentional, but exon offsets are currently not validated, which can lead to some counterintuitive results (note the negative genomic start position below):

In [9]: await cst.transcript_to_genomic_coordinates(
    transcript="NM_002529.3",
    exon_start=1,
    exon_start_offset=-1000000000000
)
Out[9]: GenomicDataResponse(genomic_data=GenomicData(gene='NTRK1', chr='NC_000001.11', start=-999843138854, end=None, exon_start=1, exon_start_offset=-1000000000000, exon_end=None, exon_end_offset=None, transcript='NM_002529.3', strand=1), warnings=[], service_meta=ServiceMeta(name='cool_seq_tool', version='0.1.14-dev0', response_datetime=datetime.datetime(2023, 8, 7, 14, 24, 28, 760682), url='https://github.com/GenomicMedLab/cool-seq-tool'))

github-actions[bot] commented 4 months ago

This issue is stale because it has been open 45 days with no activity. Please make a comment for triaging or closing the issue.

korikuzma commented 2 months ago

@jarbesfeld can you comment on how we should handle this? The example above should be resolved in #361, but it is still an issue if the user specifies an exon that is not the first or last.

For example:

await mapper.tx_segment_to_genomic(
  transcript="NM_002529.3",
  exon_start=2,
  exon_start_offset=-100000000
)

will return

{
    "gene": "NTRK1",
    "genomic_ac": "NC_000001.11",
    "tx_ac": "NM_002529.3",
    "seg_start": {
        "exon_ord": 1,
        "offset": -100000000,
        "genomic_location": {
            "type": "SequenceLocation",
            "sequenceReference": {
                "type": "SequenceReference",
                "refgetAccession": "SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO"
            },
            "start": 56864353
        }
    }
}

jarbesfeld commented 2 months ago

@korikuzma Since the offset can't be that large, I would return an error message saying "Invalid input: The exon_start_offset cannot exceed the difference between the start position of exon 2 and the end position of exon 1"
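
A minimal sketch of the check described above, not the cool-seq-tool implementation. The function name `validate_exon_start_offset` and its parameters are hypothetical, and it assumes a positive-strand transcript and a negative (upstream) offset. The exon 2 start used in the usage lines is inferred from the example output (56864353 minus the offset of -100000000); the exon 1 end is a made-up placeholder.

```python
from typing import Optional


def validate_exon_start_offset(
    offset: int,
    prev_exon_end: int,
    exon_start: int,
    exon_number: int = 2,
) -> Optional[str]:
    """Return an error message if a negative start offset is too large.

    Hypothetical helper: assumes a positive-strand transcript, where a
    negative offset moves the start upstream into the preceding intron.
    """
    # Size of the gap between the previous exon's end and this exon's start.
    max_upstream = exon_start - prev_exon_end
    if offset < 0 and -offset > max_upstream:
        return (
            "Invalid input: The exon_start_offset cannot exceed the "
            f"difference between the start position of exon {exon_number} "
            f"and the end position of exon {exon_number - 1}"
        )
    return None


# Exon 2 start inferred from the example output; exon 1 end is a placeholder.
EXON_2_START = 56864353 + 100000000  # 156864353
EXON_1_END = 156863000  # hypothetical value, for illustration only

too_large = validate_exon_start_offset(-100000000, EXON_1_END, EXON_2_START)
acceptable = validate_exon_start_offset(-10, EXON_1_END, EXON_2_START)
```

With these inputs, `too_large` holds the error message and `acceptable` is `None`. A negative-strand transcript and the `exon_end_offset` case would need the mirrored comparison.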

korikuzma commented 2 months ago

@jarbesfeld what's the priority for this?

jarbesfeld commented 2 months ago

@korikuzma I would say medium priority