Closed splaisan closed 1 year ago
You can make a GTF file (with AGAT) so that exons of different isoforms get the same gene_id. Then, using a sed and an awk command, you remove all features except exons, as well as all ID/Parent attributes. Finally, you can use the agat_sp_extract_sequences.pl command you mentioned.
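The filtering step described above could be sketched roughly as follows. This is a Python stand-in for the sed/awk commands, not the exact commands meant in the reply. It assumes the GTF produced by AGAT carries the extra attributes in the form `ID "..."; Parent "...";` in column 9; the function name `exons_without_ids` is hypothetical.

```python
import re

# Assumption: the attributes to drop look like  ID "exon-1";  or  Parent "mRNA-1";
# The pattern is case-sensitive, so gene_id and transcript_id are left alone.
_ID_PARENT = re.compile(r'\s*\b(?:ID|Parent)\s+"[^"]*"\s*;')


def exons_without_ids(gtf_lines):
    """Keep only exon features and strip ID/Parent attributes from column 9."""
    kept = []
    for line in gtf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # A GTF record has 9 tab-separated columns; column 3 is the feature type.
        if len(cols) != 9 or cols[2] != "exon":
            continue
        cols[8] = _ID_PARENT.sub("", cols[8]).strip()
        kept.append("\t".join(cols))
    return kept
```

The returned lines can be written to a new GTF file, so that the remaining exons are tied together only through their shared gene_id before running the extraction step.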
Thanks for the great tool. I would like to extract the FASTA sequence of the longest artificial concatenated gene model, obtained by merging all exons of a gene and resolving overlapping exons by merging them too.
The best I could get so far with
agat_sp_extract_sequences.pl // -t exon --merge
are full transcripts, but I get several transcripts for the same gene, whereas I want only one (even if it is not coding because of frameshifts). Can you please help me with a strategy to achieve my goal? Thanks.