If num_docs is not required, there are many optimizations we can run.
For instance, if we sort by doc ID, we can often search only one split and abort the search early within that split.
If we sort by -date, we can also order the splits by their max_date and stop searching as soon as we have a guarantee that no remaining doc can enter the top K.
We can hardcode these optimizations for the moment, and revisit this if someone comes up with a good formalism for a proper distributed execution plan abstraction.
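
To make the -date case concrete, here is a minimal sketch of the early-stop idea, not the actual implementation. The `Split` class, its `dates` field (standing in for a real per-split search), and `top_k_by_date_desc` are hypothetical names for illustration. It also assumes that a doc whose date merely ties the current K-th best does not displace it.

```python
import heapq
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Split:
    # Hypothetical stand-in for a split: only the fields the sketch needs.
    split_id: str
    max_date: int
    dates: List[int]  # dates of matching docs; replaces a real search call


def top_k_by_date_desc(splits: List[Split], k: int) -> Tuple[List[int], List[str]]:
    """Return the k most recent dates and the ids of the splits actually searched."""
    heap: List[int] = []  # min-heap holding the k largest dates seen so far
    searched: List[str] = []
    # Visit splits from most recent max_date to oldest.
    for split in sorted(splits, key=lambda s: s.max_date, reverse=True):
        # Once the top K is full and this split's newest doc cannot beat the
        # current K-th best, no later split can either (they are older still).
        if len(heap) == k and split.max_date <= heap[0]:
            break
        searched.append(split.split_id)
        for date in split.dates:
            if len(heap) < k:
                heapq.heappush(heap, date)
            elif date > heap[0]:
                heapq.heapreplace(heap, date)
    return sorted(heap, reverse=True), searched


splits = [
    Split("a", 100, [100, 95, 90]),
    Split("b", 92, [92, 80]),
    Split("c", 85, [85, 70]),
]
top, searched = top_k_by_date_desc(splits, k=2)
```

With K = 2, only split `a` is searched, because `b` and `c` have a max_date below the current second-best hit. Note that this only works when the total hit count is not needed, since the skipped splits are never counted.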
We've added some optimizations to search fewer splits (though generally more than one) when num_docs isn't requested.
We also added the same optimizations when sorting by date/-date/doc_id.