wurmlab / afra

Genome Annotation for the Masses
http://afra.sbcs.qmul.ac.uk
Apache License 2.0

`mergeTranscript` duplicates sub-features if a single transcript is passed to it multiple times. #82

Closed hargup closed 9 years ago

hargup commented 9 years ago
{
    "end": 46,
    "start": 1,
    "strand": 1,
    "subfeatures": [
        {
            "data": {
                "end": 46,
                "start": 1,
                "strand": 1,
                "type": "CDS"
            }
        },
        {
            "data": {
                "end": 46,
                "start": 1,
                "strand": 1,
                "type": "exon"
            }
        }
    ],
    "type": "transcript"
}

When a transcript created from the above JSON object is passed to `EditTrack.mergeTranscript` as `mergeTranscript(refSeq, [transcript, transcript])`, we get a transcript corresponding to the following JSON object. Notice that it now has two identical exons and two identical CDS subfeatures.

{
    "end": 46,
    "start": 1,
    "strand": 1,
    "subfeatures": [
        {
            "data": {
                "end": 34,
                "start": 1,
                "strand": 1,
                "type": "CDS"
            }
        },
        {
            "data": {
                "end": 34,
                "start": 1,
                "strand": 1,
                "type": "CDS"
            }
        },
        {
            "data": {
                "end": 46,
                "start": 1,
                "strand": 1,
                "type": "exon"
            }
        },
        {
            "data": {
                "end": 46,
                "start": 1,
                "strand": 1,
                "type": "exon"
            }
        }
    ],
    "type": "transcript"
}
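
To make the duplication concrete, here is a minimal sketch that filters exact duplicates out of the merged result. `dedupeSubfeatures` is a hypothetical helper and not part of Afra. It assumes the plain-object shape shown in the JSON above, and it only hides this one symptom rather than fixing `mergeTranscript` itself.

```javascript
// Hypothetical helper, not part of Afra: drop subfeatures that are exact
// duplicates (same type, start, end and strand) of one seen earlier.
function dedupeSubfeatures(transcript) {
  const seen = new Set();
  const subfeatures = transcript.subfeatures.filter(({ data }) => {
    const key = [data.type, data.start, data.end, data.strand].join(":");
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
  // Return a copy so the input transcript is left untouched.
  return { ...transcript, subfeatures };
}

// The merged transcript reported above, as a plain object.
const merged = {
  start: 1, end: 46, strand: 1, type: "transcript",
  subfeatures: [
    { data: { start: 1, end: 34, strand: 1, type: "CDS" } },
    { data: { start: 1, end: 34, strand: 1, type: "CDS" } },
    { data: { start: 1, end: 46, strand: 1, type: "exon" } },
    { data: { start: 1, end: 46, strand: 1, type: "exon" } },
  ],
};

const cleaned = dedupeSubfeatures(merged);
console.log(cleaned.subfeatures.map((f) => f.data.type)); // one CDS and one exon remain
```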

yeban commented 9 years ago

@hargup This one is critical. Please could you address this in a PR right away?

yeban commented 9 years ago

I had a look. The bug can manifest itself in several ways, not just as duplicated subfeatures when the same transcript is passed more than once (which will never be the case outside test suites). For instance, if two overlapping exons are merged, Afra will still treat the overlap junction as a splice point when it shouldn't.
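
As a rough illustration of the expected behaviour for overlapping exons, the sketch below collapses them into a single interval so that no junction is left inside the overlap. `mergeOverlappingExons` is a hypothetical helper, not Afra's actual code. The second and third exons are invented coordinates, and treating exons as overlapping whenever `start <= end` of the previous one is an assumption about the coordinate convention.

```javascript
// Hypothetical helper, not Afra's implementation: merge exon intervals that
// overlap (assumed here to mean next.start <= current.end).
function mergeOverlappingExons(exons) {
  // Sort a copy by start (then end) so overlapping exons become neighbours.
  const sorted = [...exons].sort((a, b) => a.start - b.start || a.end - b.end);
  const merged = [];
  for (const exon of sorted) {
    const last = merged[merged.length - 1];
    if (last && exon.start <= last.end) {
      // Overlap: extend the previous exon instead of adding a new one.
      last.end = Math.max(last.end, exon.end);
    } else {
      merged.push({ ...exon });
    }
  }
  return merged;
}

// Illustrative coordinates only: the first two exons overlap.
const exons = [
  { start: 1, end: 46, strand: 1, type: "exon" },
  { start: 30, end: 80, strand: 1, type: "exon" },
  { start: 120, end: 150, strand: 1, type: "exon" },
];

const result = mergeOverlappingExons(exons);
console.log(result.map((e) => [e.start, e.end]));
// [[1, 80], [120, 150]]: the only remaining junction lies between 80 and 120
```

With the overlap collapsed, splice points can be derived from the gaps between consecutive merged exons rather than from every input exon boundary.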

yeban commented 9 years ago

Thanks a bunch for reporting this issue. It was an important catch.