Is your feature request related to a problem? Please describe.
Currently, agat_sp_extract_sequences.pl (and possibly other scripts) does not support multicistronic transcripts. Although many GTF/GFF tools lack this support, studies increasingly indicate the existence of translated ORFs positioned upstream, downstream, etc. of canonical coding sequences.
Describe the solution you'd like
When running agat_sp_extract_sequences.pl, I would like it to handle multiple CDSs defined per transcript/mRNA feature. To start off, the tool could use CDS IDs rather than transcript IDs as FASTA headers (see this issue). Currently, I think the tool either ignores or merges multicistronic CDSs that share a transcript ID.
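To make the request concrete, here is a minimal Python sketch of the behaviour I have in mind. It is not AGAT's implementation: the toy annotation, the genome string, the function names, and the `>cds_id transcript=...` header layout are all made up for illustration, and it only handles plus-strand features. It relies on the GFF3 convention that the segments of one discontinuous CDS share the same ID, so grouping by CDS ID instead of by Parent keeps the two ORFs of one mRNA apart.

```python
from collections import defaultdict

# Hypothetical toy annotation: one mRNA (tx1) carrying two ORFs,
# a single-segment uORF and a two-segment main CDS.
GFF3_ROWS = [
    ("chr1", ".", "mRNA", "1", "30", ".", "+", ".", "ID=tx1"),
    ("chr1", ".", "CDS", "1", "6", ".", "+", "0", "ID=cds_uorf;Parent=tx1"),
    ("chr1", ".", "CDS", "10", "15", ".", "+", "0", "ID=cds_main;Parent=tx1"),
    ("chr1", ".", "CDS", "19", "21", ".", "+", "0", "ID=cds_main;Parent=tx1"),
]
GFF3 = "\n".join("\t".join(row) for row in GFF3_ROWS)

# Hypothetical 30 bp genome matching the coordinates above.
GENOME = {"chr1": "ATGTAACCCATGGCCGTATGATTTTTTTTT"}


def parse_attributes(field):
    """Turn 'ID=x;Parent=y' into {'ID': 'x', 'Parent': 'y'}."""
    return dict(pair.split("=", 1) for pair in field.strip().split(";") if pair)


def extract_cds_by_id(gff_text, genome):
    """Group CDS segments by their ID (not by Parent) and splice each group.

    Assumes plus-strand features only; minus-strand handling is omitted.
    Returns {cds_id: (parent_transcript_id, spliced_sequence)}.
    """
    segments = defaultdict(list)
    parents = {}
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if cols[2] != "CDS":
            continue
        attrs = parse_attributes(cols[8])
        cds_id = attrs["ID"]
        segments[cds_id].append((cols[0], int(cols[3]), int(cols[4])))
        parents[cds_id] = attrs.get("Parent", "")

    records = {}
    for cds_id, segs in segments.items():
        segs.sort(key=lambda seg: seg[1])  # order segments by start coordinate
        # GFF3 coordinates are 1-based and inclusive.
        seq = "".join(genome[chrom][start - 1:end] for chrom, start, end in segs)
        records[cds_id] = (parents[cds_id], seq)
    return records


def to_fasta(records):
    """One FASTA entry per CDS ID, with the parent transcript kept in the header."""
    return "".join(
        f">{cds_id} transcript={parent}\n{seq}\n"
        for cds_id, (parent, seq) in records.items()
    )


if __name__ == "__main__":
    print(to_fasta(extract_cds_by_id(GFF3, GENOME)), end="")
```

With this grouping, both ORFs get their own FASTA entry while the header still records that they come from the same transcript.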
Describe alternatives you've considered
Today, it's possible to define a unique mRNA feature for each CDS, similar to the solution described here. It's a hacky solution that fails to show that multiple CDSs are from the same transcript.