The GFF files were downloaded with curl, using the FTP links provided.
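The exact curl invocation is not recorded here. As a rough sketch only, the download step could be wrapped in Python as below. The function names, the choice of curl options (--silent, --show-error, --output) and the example.org URL in the usage note are illustrative assumptions, not the commands that were actually run.

```python
import subprocess
from pathlib import Path
from urllib.parse import urlparse


def build_curl_command(url, dest_dir="."):
    """Return the curl argument list that saves `url` into `dest_dir`.

    Hypothetical helper: the options below are an assumption, not the
    invocation used for the original downloads.
    """
    # Name the local file after the last path component of the URL.
    filename = Path(urlparse(url).path).name
    if not filename:
        raise ValueError(f"URL has no file name component: {url}")
    destination = Path(dest_dir) / filename
    # --silent hides the progress meter; --show-error still reports failures.
    return ["curl", "--silent", "--show-error",
            "--output", str(destination), url]


def download_all(urls, dest_dir="."):
    """Run curl once per URL; raises CalledProcessError if curl exits non-zero."""
    for url in urls:
        subprocess.run(build_curl_command(url, dest_dir), check=True)
```

Calling download_all(["ftp://example.org/genomes/sample.gff"], "data") would then save the file as data/sample.gff, provided the data directory already exists and the placeholder URL is replaced with a real FTP link.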
The GFF files were then converted to FASTA files using the script /storage1/data19/Scripts/python_scripts/Convert_GFF_to_Fasta.py, which copies every line that follows the ##FASTA directive into a new file:
import sys

input_gff_file = sys.argv[1]
output_fasta_file = input_gff_file.replace(".gff", ".fasta")

# Becomes True once the "##FASTA" directive has been seen.
in_fasta_section = False
with open(input_gff_file, "r") as gff, open(output_fasta_file, "w") as fasta:
    for line in gff:
        if "##FASTA" in line:
            # Skip the directive itself; the sequence records follow it.
            in_fasta_section = True
            continue
        if in_fasta_section:
            fasta.write(line)
To confirm the accuracy of the resulting FASTA files, a custom Python script was used to count the number of scaffolds and the number of nucleotides in each file.
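The counting script itself is not reproduced here. A minimal sketch of such a check is given below, assuming that each scaffold corresponds to one FASTA header line (a line starting with ">") and that every other non-blank line contains only sequence characters. The function name count_fasta is hypothetical.

```python
def count_fasta(path):
    """Return (number of scaffolds, total number of nucleotides) in a FASTA file.

    Illustrative sketch: assumes one ">" header line per scaffold.
    """
    n_scaffolds = 0
    n_nucleotides = 0
    with open(path, "r") as fasta:
        for line in fasta:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                n_scaffolds += 1  # each header line starts a new scaffold
            else:
                n_nucleotides += len(line)  # sequence characters, newline excluded
    return n_scaffolds, n_nucleotides
```

The two counts returned for each converted file can then be compared against the scaffold count and assembly size reported by the data source, if those figures are available.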