Open Vijithkumar2020 opened 1 month ago
Have you checked the help? https://nbisweden.github.io/AGAT/tools/agat_sp_extract_sequences/#briefly-in-pictures
I guess the --split is useless.
Then, if you want to extract everything from the start of the gene to the end of the gene (so the sequence contains UTRs + exons + introns), -t gene is correct.
If you want to check what is in your file before using agat_sp_extract_sequences.pl, for example to be sure the GFF really had 500K genes as input, run agat_sq_stat_basic.pl prior to your analysis.
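As an independent cross-check of what the GFF contains, here is a minimal Python sketch that tallies the feature types in column 3 and collects the length of every `gene` feature. It is not how AGAT computes its statistics; the function name `summarize_gff`, the toy records, and the 300 bp "short" cutoff are all assumptions made up for illustration.

```python
from collections import Counter
import io


def summarize_gff(handle):
    """Count feature types (GFF column 3) and collect gene lengths.

    Hypothetical helper for a quick sanity check; not part of AGAT.
    """
    type_counts = Counter()
    gene_lengths = []
    for line in handle:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; stop parsing features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed 9-column feature line
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        type_counts[ftype] += 1
        if ftype == "gene":
            # GFF coordinates are 1-based and inclusive
            gene_lengths.append(end - start + 1)
    return type_counts, gene_lengths


# Toy input (made-up records) standing in for a real annotation file.
example = io.StringIO(
    "##gff-version 3\n"
    "chr1\tAUGUSTUS\tgene\t100\t599\t.\t+\t.\tID=g1\n"
    "chr1\tAUGUSTUS\tmRNA\t100\t599\t.\t+\t.\tID=g1.t1;Parent=g1\n"
    "chr1\tAUGUSTUS\texon\t100\t299\t.\t+\t.\tParent=g1.t1\n"
    "chr1\tAUGUSTUS\texon\t400\t599\t.\t+\t.\tParent=g1.t1\n"
    "chr1\tAUGUSTUS\tgene\t1000\t1089\t.\t-\t.\tID=g2\n"
    "chr1\tAUGUSTUS\tmRNA\t1000\t1089\t.\t-\t.\tID=g2.t1;Parent=g2\n"
)

counts, lengths = summarize_gff(example)
SHORT_CUTOFF = 300  # assumed threshold, adjust to taste
n_short = sum(1 for length in lengths if length < SHORT_CUTOFF)

print("feature counts:", dict(counts))
print("genes:", counts["gene"], "short genes:", n_short)
```

For a real file, replace the `io.StringIO` toy input with `open("your_annotation.gff")`. If the gene count or the share of short genes already looks off at this stage, the short sequences come from the gene predictions themselves rather than from the extraction step.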
This is regarding a de novo genome of a plant that was assembled recently. I used AGAT's feature extraction tool to get the gene models predicted by AUGUSTUS. The repeat-masked genome is 2.6 Gb in size, and the FASTA file produced by AGAT's feature extraction was ~600 Mb, comprising 500K gene models. The following command was used for AGAT's feature extraction. I would just like to know if this is the right command to use, as my output file contains way too many short sequences.