Closed by yesimon 6 years ago
Thanks for catching this! This happened because, when writing the `.fasta` data files, I determined whether a genome is segmented based on whether its sequences have a segment name, but when writing the `.py` files, I determined it based on whether the number of segment names is >1. Genomes in these datasets have sequences with a segment name but only one segment (e.g., all sequences are labeled "segment X"); therefore, I was treating them as segmented when writing the `.fasta` files but not when writing the `.py` files. I fixed the script that generates these files so that it decides consistently whether a genome is segmented, and reran it for these 19 datasets.
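For anyone hitting a similar mismatch, here is a rough sketch of what "consistent" could look like: a single helper that both the `.fasta` writer and the `.py` writer call. This is not the actual generator script. The names `is_segmented` and `fasta_paths`, the lowercase per-segment file naming, and the choice of the ">1 distinct segment names" rule are all assumptions made for illustration.

```python
from typing import Iterable, List, Optional


def is_segmented(segment_names: Iterable[Optional[str]]) -> bool:
    """Return True if the sequences span more than one distinct segment.

    Assumption: the ">1 distinct segment names" rule is used here only for
    illustration; the real script may have settled on the other rule.
    """
    distinct = {name for name in segment_names if name is not None}
    return len(distinct) > 1


def fasta_paths(dataset: str, segment_names: Iterable[Optional[str]]) -> List[str]:
    """Hypothetical path layout: one file per segment, or a single file."""
    names = list(segment_names)
    if is_segmented(names):
        distinct = sorted({n for n in names if n is not None})
        # Per-segment files live in a directory named after the dataset.
        return [f"{dataset}/{seg.lower()}.fasta" for seg in distinct]
    # Unsegmented (or single-segment) genomes get one flat file.
    return [f"{dataset}.fasta"]
```

Because both writers would go through `is_segmented`, a genome whose sequences are all labeled "segment X" gets the same answer in both places, whichever rule is chosen.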
These human host genomes have `virus.fasta` as their fasta path, but their actual path is `virus/[0-9a-z]+.fasta`