Multi-genome reference and query FASTAs for many-to-many queries

When comparing many small genomes, creating an individual file for each genome is often impractical. For example, IMG/VR v4.1 contains 5,576,197 viral genomes. Creating that many files is not feasible on most file systems, particularly in HPC environments, where shared file systems such as NFS and Lustre are usually deployed.

For this situation, how about accepting a single FASTA file containing all contigs, plus a query.txt and a reference.txt that map each genome to its contigs, one genome per line, structured something like this?

{genome_name}\t{contig_1},{contig_2},{contig_3}....\n
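
To make the proposal concrete, here is a rough Python sketch of how such a mapping file could be read and used to group contigs from one combined FASTA by genome. This is only an illustration of the proposed format, not how FastANI works today: the function names (`parse_genome_map`, `read_fasta`, `group_sequences`) are hypothetical, and it assumes contig IDs are the first whitespace-delimited token of each FASTA header.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_genome_map(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Parse lines of the form '{genome_name}\\t{contig_1},{contig_2},...'."""
    genome_to_contigs: Dict[str, List[str]] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        genome, contigs = line.split("\t", 1)
        genome_to_contigs[genome] = [c for c in contigs.split(",") if c]
    return genome_to_contigs


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (contig_id, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            parts = line[1:].split()
            # Assumption: the contig ID is the first token of the header.
            header = parts[0] if parts else ""
            chunks = []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def group_sequences(
    fasta_lines: Iterable[str], genome_to_contigs: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Group contig sequences by genome; contigs not in the map are ignored."""
    contig_to_genome = {
        contig: genome
        for genome, contigs in genome_to_contigs.items()
        for contig in contigs
    }
    grouped: Dict[str, List[str]] = defaultdict(list)
    for contig_id, seq in read_fasta(fasta_lines):
        genome = contig_to_genome.get(contig_id)
        if genome is not None:
            grouped[genome].append(seq)
    return dict(grouped)
```

In practice the inputs would be open file handles, e.g. `group_sequences(open("all_contigs.fasta"), parse_genome_map(open("query.txt")))`, where `all_contigs.fasta` is a placeholder name. The whole data set then lives in three files instead of millions.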


Multi-genome reference and query FASTAs for many-to-many queries #123