500K Gene Models with Many Short Sequences: Valid AGAT Output or Command Error?

NBISweden / AGAT

Another Gtf/Gff Analysis Toolkit

GNU General Public License v3.0

This concerns a recently assembled de novo plant genome. I used AGAT's feature extraction tool to extract the gene models predicted by AUGUSTUS. The repeat-masked genome is 2.6 Gb, and the FASTA file produced by the extraction is ~600 MB and contains 500K gene models. The command I used is shown below. I would like to know whether this is the right command to use, because my output file contains far too many short sequences.

agat_sp_extract_sequences.pl \
--gff /output_file.gff \
--fasta /media/masked.fasta \
--output /out.fasta \
-t gene --split
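
To put a number on "too many short sequences", the lengths in the output FASTA can be tabulated first. This is only a minimal sketch, assuming a standard FASTA file whose sequences may wrap across several lines. The helper names `fasta_lengths` and `count_short` and the 300 bp default cutoff are illustrative choices, not part of AGAT.

```shell
# Print "<sequence id><TAB><length>" for every record in a FASTA file.
# Hypothetical helper, not an AGAT command.
fasta_lengths() {
    awk '
        /^>/ {
            if (id != "") print id "\t" len   # flush the previous record
            id = substr($1, 2)                 # first header word, without ">"
            len = 0
            next
        }
        { len += length($0) }                  # sequences may span several lines
        END { if (id != "") print id "\t" len }
    ' "$1"
}

# Count records shorter than a cutoff in bp (assumed default: 300).
count_short() {
    fasta_lengths "$1" | awk -F '\t' -v min="${2:-300}" \
        '$2 < min { n++ } END { print n + 0 }'
}
```

For example, `fasta_lengths /out.fasta | sort -k2,2n | head` lists the shortest records, and `count_short /out.fasta 300` reports how many fall under 300 bp. Comparing those IDs against the gene features in the GFF should show whether the short sequences match the predicted gene coordinates or come from how the features were extracted.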

500K Gene Models with Many Short Sequences: Valid AGAT Output or Command Error? #495