The GFF3 format (Generic Feature Format Version 3) is one of the standard formats for describing and representing genomic features. It is a highly flexible, 9-column format that biologists can easily manipulate. This flexibility, however, also makes the format very easy to break. We developed GFF3toolkit to identify common problems in GFF3 files; fix 30 of these common problems; sort GFF3 files (which can help with downstream processing programs and custom parsing); merge two GFF3 files into a single, non-redundant GFF3 file; and generate FASTA files from a GFF3 file for many use cases (e.g. feature types beyond mRNA).
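To make the 9-column layout concrete, here is a minimal Python sketch that splits one GFF3 feature line into its fields. The column names follow the GFF3 specification; the helper name `parse_gff3_line` and the sample line are illustrative assumptions and are not part of GFF3toolkit.

```python
from urllib.parse import unquote

# Column names as defined by the GFF3 specification.
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Split one tab-delimited GFF3 feature line into a dict (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 columns, found {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Column 9 holds semicolon-separated tag=value pairs with URL-style escapes.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if pair and pair != ".":
            tag, _, value = pair.partition("=")
            attributes[unquote(tag)] = unquote(value)
    record["attributes"] = attributes
    return record


# Made-up example line, for illustration only.
example = "scaffold1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=demo"
print(parse_gff3_line(example)["attributes"]["ID"])
```

A line with a missing or extra tab fails the column-count check immediately, which is the simplest way a hand-edited file "breaks the format".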
pip install gff3tool
(If the wheel package is missing, run pip install wheel to install it.)
pip install git+https://github.com/NAL-i5K/GFF3toolkit.git
gff3_QC
- Detection of GFF format errors (~50 types of errors).
gff3_QC -g example_file/example.gff3 -f example_file/reference.fa -o error.txt -s statistic.txt
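The kind of problem a QC pass looks for can be illustrated with a small check. The Python sketch below flags only two basic format problems (a wrong column count, and a start coordinate greater than the end). It is an illustration under that assumption; it does not reproduce the actual checks, error codes, or report format of gff3_QC, and the function name `find_basic_errors` is hypothetical.

```python
def find_basic_errors(lines):
    """Return (line_number, message) tuples for two simple GFF3 format problems."""
    errors = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines, directives (##...) and comments (#...).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            errors.append((number, f"expected 9 columns, found {len(fields)}"))
            continue
        start, end = fields[3], fields[4]
        if not (start.isdigit() and end.isdigit()):
            errors.append((number, "start and end must be unsigned integers"))
        elif int(start) > int(end):
            errors.append((number, "start is greater than end"))
    return errors


# Made-up input, for illustration only.
sample = [
    "##gff-version 3",
    "scaffold1\texample\tgene\t900\t100\t.\t+\t.\tID=gene1",
    "scaffold1\texample\tmRNA\t100\t900\t.\t+\t.",
]
for number, message in find_basic_errors(sample):
    print(f"line {number}: {message}")
```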
gff3_fix
- Correct GFF3 errors detected by gff3_QC (30 types of errors).
gff3_fix -qc_r error.txt -g example_file/example.gff3 -og corrected.gff3
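Because gff3_fix reads the report written by gff3_QC, the two commands are naturally run back to back. The following Python sketch only assembles the two command lines from the examples, using the same flags. The helper names `qc_then_fix_commands` and `run_qc_then_fix` are hypothetical, and actually running the commands assumes GFF3toolkit is installed and on the PATH.

```python
import subprocess


def qc_then_fix_commands(gff, fasta, report="error.txt",
                         stats="statistic.txt", fixed="corrected.gff3"):
    """Build the gff3_QC and gff3_fix argument lists (hypothetical helper)."""
    qc = ["gff3_QC", "-g", gff, "-f", fasta, "-o", report, "-s", stats]
    # gff3_fix takes the QC report through -qc_r, so both share `report`.
    fix = ["gff3_fix", "-qc_r", report, "-g", gff, "-og", fixed]
    return qc, fix


def run_qc_then_fix(gff, fasta):
    """Run both steps in order; stop if either command exits with an error."""
    for command in qc_then_fix_commands(gff, fasta):
        subprocess.run(command, check=True)
```

Calling `run_qc_then_fix("example_file/example.gff3", "example_file/reference.fa")` would run the two example commands in sequence.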
gff3_merge
- Merge two GFF3 files into a single, non-redundant GFF3 file.
gff3_merge -g1 example_file/new_models.gff3 -g2 example_file/reference.gff3 -f example_file/reference.fa -og merged.gff -r merged_report.txt
gff3_merge -g1 example_file/new_models_w_replace.gff3 -g2 example_file/reference.gff3 -f example_file/reference.fa -og merged.gff -r merged_report.txt -noAuto
gff3_sort
- Sort a GFF3 file by scaffold, by coordinates on each scaffold, and by parent-child feature relationships.
gff3_sort -g example_file/example.gff3 -og example-sorted.gff3
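The three sort criteria can be sketched in Python as follows. This is a simplified illustration, not the gff3_sort implementation: it assumes each feature is a dict with `seqid`, `start`, `end`, `id`, and at most one `parent`, that parent links contain no cycles, and that scaffolds are compared as plain strings. The function name `sort_features` is hypothetical.

```python
from collections import defaultdict


def sort_features(features):
    """Order features by scaffold, then start, keeping children after their parent."""
    children = defaultdict(list)
    roots = []
    known_ids = {f["id"] for f in features if f.get("id")}
    for feature in features:
        parent = feature.get("parent")
        if parent in known_ids:
            children[parent].append(feature)
        else:
            # No parent (or parent missing from the input): treat as top level.
            roots.append(feature)

    def by_position(feature):
        return (feature["seqid"], feature["start"], feature["end"])

    ordered = []

    def emit(feature):
        # Write the feature, then its children in coordinate order.
        ordered.append(feature)
        for child in sorted(children.get(feature.get("id"), []), key=by_position):
            emit(child)

    for root in sorted(roots, key=by_position):
        emit(root)
    return ordered


# Made-up features, for illustration only.
features = [
    {"id": "exon1", "parent": "mrna1", "seqid": "scaffold1", "start": 100, "end": 200},
    {"id": "gene2", "parent": None, "seqid": "scaffold1", "start": 50, "end": 80},
    {"id": "mrna1", "parent": "gene1", "seqid": "scaffold1", "start": 100, "end": 900},
    {"id": "gene1", "parent": None, "seqid": "scaffold1", "start": 100, "end": 900},
]
print([f["id"] for f in sort_features(features)])
```

Emitting each parent before its children is what lets a downstream parser read the file in a single pass without looking ahead for a feature's parent.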
gff3_to_fasta
- Generate FASTA files from a GFF3 file.
gff3_to_fasta -g example_file/example.gff3 -f example_file/reference.fa -st all -d simple -o test_sequences
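The core idea behind turning a feature into a FASTA record can be sketched in Python: slice the reference sequence using the feature's 1-based, inclusive coordinates and reverse-complement it for minus-strand features. This is a simplified illustration that handles a single contiguous feature only; it does not reproduce the behavior of the gff3_to_fasta options shown in the example command, and the function name `feature_to_fasta` and the toy genome are assumptions.

```python
# Translation table for complementing DNA; other characters are left unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def feature_to_fasta(genome, seqid, start, end, strand, name, width=60):
    """Return a FASTA record for one feature (1-based, inclusive coordinates)."""
    sequence = genome[seqid][start - 1:end]
    if strand == "-":
        # Minus-strand features are reported as the reverse complement.
        sequence = sequence.translate(_COMPLEMENT)[::-1]
    # Wrap the sequence at `width` characters per line.
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


# Toy reference sequence, for illustration only.
genome = {"scaffold1": "ATGCCCGGGTAA"}
print(feature_to_fasta(genome, "scaffold1", 1, 6, "+", "feat_plus"), end="")
print(feature_to_fasta(genome, "scaffold1", 1, 6, "-", "feat_minus"), end="")
```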