Hello @gpertea
I'm trying to build a de novo transcriptome assembly. I used minimap2 to align long, oriented, full-length reads with the following command:
minimap2 -ax splice -uf -t ${task.cpus} -G 2000000 ${index} ${fastq_file}
I set -G to 2 Mb because the species has huge introns and that value was suggested in the literature. I kept only primary alignments with good quality and then ran StringTie as follows:
I did this for the reads from multiple samples (tissues), so I used StringTie --merge to build the consensus:
My issue is that two different genes are merged together because one transcript spans an intron of more than 1 Mb.
I checked one read supporting the alignment and it wasn't a chimeric read. The alignment was supported by multiple reads. Below I show a zoom-in at both ends of the transcript:
Because my species has huge introns, I want to keep the minimap2 parameters. Is there a parameter in StringTie that sets a distance threshold, so that transcripts are not clustered into a single gene when their start coordinates are very far apart?
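In the meantime, here is a minimal sketch of how the affected transcripts could be flagged in the merged GTF for manual inspection. It is not a StringTie feature. It assumes a standard nine-column GTF with exon records carrying a transcript_id attribute; the function names and the 1 Mb cutoff are my own hypothetical choices.

```python
from collections import defaultdict


def parse_attr(attr_field, key):
    """Return the value of `key` from a GTF attribute column, or None."""
    for part in attr_field.strip().split(";"):
        part = part.strip()
        if part.startswith(key + " "):
            return part[len(key) + 1:].strip().strip('"')
    return None


def long_intron_transcripts(gtf_lines, max_intron=1_000_000):
    """Map transcript_id -> longest intron length (bp) for transcripts
    whose longest intron exceeds `max_intron` (hypothetical cutoff)."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue  # only exon records define intron boundaries
        tid = parse_attr(cols[8], "transcript_id")
        if tid is None:
            continue
        exons[tid].append((int(cols[3]), int(cols[4])))

    flagged = {}
    for tid, intervals in exons.items():
        intervals.sort()
        longest = 0
        # GTF coordinates are 1-based and inclusive, so the intron between
        # consecutive exons has length next_start - previous_end - 1.
        for (_, prev_end), (next_start, _) in zip(intervals, intervals[1:]):
            longest = max(longest, next_start - prev_end - 1)
        if longest > max_intron:
            flagged[tid] = longest
    return flagged
```

The function accepts any iterable of lines, so an open file handle on the merged GTF can be passed directly. The returned transcript IDs can then be checked by hand or excluded before deciding which gene models to keep.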
Best, Salvador