LuyiTian / FLAMES

Full-length transcriptome splicing and mutation analysis
GNU General Public License v3.0

Missing mitochondrial transcripts in isoform_annotated.gff3 #40

Open koehlek99 opened 1 year ago

koehlek99 commented 1 year ago

Hi,

First, thanks a lot for developing FLAMES!

I have one question about the configuration parameters and a problem regarding some missing genes/transcripts in the final FLAMES output and would really appreciate some help.

i) Is there any further documentation for the isoform parameters that can be set in the config file? I have a rough idea of what some of them do (MAX_DIS, MAX_TS_DIST, Min_sup_cnt, strand_specific), but I would really appreciate more detail on how the others affect the isoform identification step.

ii) I also noticed that some of the chromosomes/regions in the gene annotation reference I provided are missing from the final FLAMES output. I'm using a slightly adapted GTF and FASTA file that contains some pathogen sequences in addition to the human genes. Although reads map to those pathogen genes, not a single transcript isoform for them is written to isoform_annotated.gff3 or transcript_assembly.fa. No mitochondrial transcripts are detected either. I checked the number of reads mapping to those regions in align2genome.bam with samtools idxstats align2genome.bam, and at least for the mitochondrial genes, a lot of reads are mapping.

[Image: output of samtools idxstats align2genome.bam]

However, only the following seqnames are included in isoform_annotated.gff3:
['1', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '2', '20', '21', '22', '3', '4', '5', '6', '7', '8', '9', 'GL000191.1', 'GL000192.1', 'GL000194.1', 'GL000195.1', 'GL000218.1', 'GL000219.1', 'GL000223.1', 'X', 'Y']
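
For anyone who wants to reproduce this comparison, here is a minimal sketch of how the check could be scripted. It assumes the standard tab-separated `samtools idxstats` layout (reference name, length, mapped reads, unmapped reads) and a GFF3 file with the seqname in the first column. The function names and the toy input are made up for illustration and are not part of FLAMES.

```python
def mapped_reads_per_contig(idxstats_lines):
    """Parse `samtools idxstats` lines into {contig: mapped_read_count}."""
    counts = {}
    for line in idxstats_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines and the "*" row that holds unplaced reads.
        if len(fields) < 4 or fields[0] == "*":
            continue
        counts[fields[0]] = int(fields[2])
    return counts


def gff3_seqnames(gff3_lines):
    """Collect the set of seqnames (column 1) from GFF3 lines."""
    names = set()
    for line in gff3_lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def contigs_missing_from_gff3(idxstats_lines, gff3_lines, min_reads=1):
    """Return contigs with at least `min_reads` mapped reads but no GFF3 entry."""
    counts = mapped_reads_per_contig(idxstats_lines)
    present = gff3_seqnames(gff3_lines)
    return {
        name: n
        for name, n in counts.items()
        if n >= min_reads and name not in present
    }


if __name__ == "__main__":
    # Toy data only; the lengths and read counts are invented for the demo.
    idxstats = ["1\t1000\t250\t0\n", "MT\t100\t500\t0\n", "*\t0\t0\t12\n"]
    gff3 = ["##gff-version 3\n", "1\t.\tgene\t10\t200\t.\t+\t.\tID=gene1\n"]
    print(contigs_missing_from_gff3(idxstats, gff3))
```

With real data, the two arguments could be open file handles, for example the saved output of `samtools idxstats align2genome.bam` and `isoform_annotated.gff3`. Raising `min_reads` filters out contigs with only a handful of mapped reads.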

Are the missing contigs filtered out because of the parameters specified in the configuration, or is something else happening here? It would be great to have information about those genes and transcripts as well.

Thanks a lot!

Best, Kristin