How is the index loaded into memory when performing multiple queries?

Hi,

I have a quick question. Say I create an index for the uniref30_2103_db database with 3 splits: mmseqs createindex uniref30_2103_db tmp --split 3, and then run 50 queries (in a single .fasta file) against it using the colabfold_search.sh script provided at https://colabfold.mmseqs.com. Will each of the three partial indexes be loaded into memory ~50 times? Assume my RAM cannot hold more than one partial index at a time and that I don't use colabfold_envdb.

In other words, I'm wondering whether mmseqs works like 1)

for query in queries_in_fasta:
    for partial_index_file in indices:
        search(query, partial_index_file)

or 2)

for partial_index_file in indices:
    for query in queries_in_fasta:
        search(query, partial_index_file)

In the first case, I guess each partial index would be loaded from storage into RAM num_of_queries times, which is slow, whereas in the second case each one would be loaded just once.
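To make the difference concrete, here is a toy Python sketch that counts index loads for the two loop orders. It is only a model of my assumption that RAM holds exactly one partial index at a time; count_loads and its parameters are names I made up for illustration, and it says nothing about how mmseqs is actually implemented.

```python
def count_loads(order, num_queries=50, num_splits=3):
    """Count how often a partial index must be (re)loaded from storage.

    Toy model: RAM is a one-slot cache, so switching to a different
    partial index always costs one load. Hypothetical helper, not MMseqs2 code.
    """
    if order == "query_outer":    # case 1: loop over queries, then indices
        pairs = ((q, s) for q in range(num_queries) for s in range(num_splits))
    elif order == "index_outer":  # case 2: loop over indices, then queries
        pairs = ((q, s) for s in range(num_splits) for q in range(num_queries))
    else:
        raise ValueError(f"unknown order: {order}")

    resident = None  # partial index currently held in RAM
    loads = 0
    for _query, split in pairs:
        if split != resident:  # needed index is not in RAM: load it
            resident = split
            loads += 1
    return loads


print(count_loads("query_outer"))  # 150 loads in total, i.e. 50 per partial index
print(count_loads("index_outer"))  # 3 loads in total, i.e. 1 per partial index
```

So under this model, case 1 costs num_of_queries times as many loads as case 2.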

Thanks
