Magdoll / SQANTI2

SQANTI2 is now replaced by SQANTI3. Please go to: https://github.com/ConesaLab/SQANTI3

Isoforms mapped to a scaffold without any reference gene are classified as antisense #53

Open shenli-js opened 4 years ago

shenli-js commented 4 years ago

Hi Magdoll! I am using SQANTI2 to analyse my PacBio data. In my results, some isoforms were classified as antisense, but the associated_gene column is empty. I read the code and found this at lines 1049-1074:

    if len(isoforms_hit.genes) == 0:
        # completely no overlap with any genes on the same strand
        # check if it is anti-sense to a known gene, otherwise it's genic_intron or intergenic
        if len(isoforms_hit.AS_genes) == 0 and trec.chrom in junctions_by_chr:
            # no hit even on opp strand
            # see if it is completely contained within a junction
            da_pairs = junctions_by_chr[trec.chrom]['da_pairs']
            i = bisect.bisect_left(da_pairs, (trec.txStart, trec.txEnd))
            while i < len(da_pairs) and da_pairs[i][0] <= trec.txStart:
                if da_pairs[i][0] <= trec.txStart <= trec.txStart <= da_pairs[i][1]:
                    isoforms_hit.str_class = "genic_intron"
                    break
                i += 1
        else:
            # hits one or more genes on the opposite strand
            isoforms_hit.str_class = "antisense"
            isoforms_hit.genes = ["novelGene_{g}_AS".format(g=g) for g in isoforms_hit.AS_genes]
    else:
        # overlaps with one or more genes on the same strand
        if trec.exonCount >= 2:
            # multi-exon and has a same strand gene hit, must be NNC
            isoforms_hit.str_class = "novel_not_in_catalog"
            isoforms_hit.subtype = "at_least_one_novel_splicesite"
        else:
            # single exon, must be genic
            isoforms_hit.str_class = "genic"

The genome assembly I am working with is incomplete. An isoform is always classified as antisense when it maps to a scaffold that has no reference genes.

I updated SQANTI2, but the issue is not fixed.
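
To make the reported behaviour concrete, here is a minimal sketch of the branch logic quoted above. It is not SQANTI2's code: `original_branch` and `sketch_branch` are hypothetical names, the intron search is simplified to a linear scan, the `"intergenic"` default is an assumption taken from the comment in the quoted code, and `"no_antisense_hit"` is only a placeholder label. The point is that when a scaffold is absent from `junctions_by_chr`, the combined condition is false even though `AS_genes` is empty, so the `else` branch runs and produces `antisense` with an empty gene list.

```python
def original_branch(as_genes, chrom, junctions_by_chr):
    """Mirror only the if/else shape of the quoted code (intron search omitted)."""
    if len(as_genes) == 0 and chrom in junctions_by_chr:
        return "no_antisense_hit", []  # placeholder label, not a SQANTI2 category
    # Reached when AS_genes is non-empty OR the chromosome has no junctions
    return "antisense", ["novelGene_{g}_AS".format(g=g) for g in as_genes]


def sketch_branch(as_genes, chrom, tx_start, tx_end, junctions_by_chr):
    """Hypothetical restructuring: test for antisense hits first."""
    if len(as_genes) > 0:
        return "antisense", ["novelGene_{g}_AS".format(g=g) for g in as_genes]
    # Assumption: "intergenic" is the default when nothing else matches
    str_class = "intergenic"
    # .get() makes a scaffold without annotated junctions yield an empty list
    da_pairs = junctions_by_chr.get(chrom, {}).get("da_pairs", [])
    for donor, acceptor in da_pairs:
        # Isoform lies completely inside one donor-acceptor interval
        if donor <= tx_start and tx_end <= acceptor:
            str_class = "genic_intron"
            break
    return str_class, []


# Toy data: only chr1 has annotated junctions; scaffold_12 has none
junctions = {"chr1": {"da_pairs": [(50, 1000)]}}
print(original_branch([], "scaffold_12", junctions))
print(sketch_branch([], "scaffold_12", 100, 500, junctions))
```

With this toy input, the first call returns `antisense` with an empty gene list, which matches the empty associated_gene column, while the second returns `intergenic`. Whether `intergenic` is the right class for such scaffolds is a decision for the maintainers.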

Magdoll commented 4 years ago

Hi @shenli-js, I'm not sure what you are requesting here. So some of your annotation is incomplete, as some PacBio transcripts are being annotated as on the opposite strand of known genes? -Liz