Multi-fidelity zero-shot HPO

Main points

The naive approach is to pick $K$ configurations out of $N$ such that the average loss over the $D$ datasets is minimized:

$\min_{S \subseteq [N]} \sum_{n \in S} \sum_{d \in [D]} L(x_n \mid \mathcal{D}_d)$ s.t. $|S| = K$

Although the paper says this is approximated by the greedy algorithm, we can achieve optimality in $O(DN^2)$ using dynamic programming (dp[d][i][j] := the minimum loss sum over the first $d$ datasets when picking $i$ configurations from the first $j$ configurations). The dynamic programming itself could be a new algorithm, but it would probably only amount to a workshop paper.

The proposal is to speed up the algorithm above by combining it with the HyperBand algorithm, but I have not yet figured out how.
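A minimal sketch of this dynamic program, assuming the losses are available as an $N \times D$ matrix `loss` with `loss[n][d]` $= L(x_n \mid \mathcal{D}_d)$ (the matrix name and function signature are illustrative, not from the note). Since the objective is additive over datasets, the $d$ dimension of dp[d][i][j] collapses into a precomputed per-configuration total, leaving a 2-D table:

```python
import math

def optimal_portfolio(loss, K):
    """Minimum summed loss when picking K of N configurations.

    loss: N x D matrix, loss[n][d] = L(x_n | D_d).
    """
    N = len(loss)
    # c[n]: total loss of configuration n summed over all D datasets
    # (this precomputation absorbs the d dimension of dp[d][i][j]).
    c = [sum(row) for row in loss]
    # dp[i][j]: minimum loss sum when picking i configurations
    # from the first j configurations.
    dp = [[math.inf] * (N + 1) for _ in range(K + 1)]
    for j in range(N + 1):
        dp[0][j] = 0.0  # picking nothing costs nothing
    for i in range(1, K + 1):
        for j in range(1, N + 1):
            # Either skip configuration j-1, or take it.
            dp[i][j] = min(dp[i][j - 1], dp[i - 1][j - 1] + c[j - 1])
    return dp[K][N]
```

Note that because this objective is additive over the chosen configurations, sorting by $c_n$ and taking the $K$ smallest would reach the same optimum; the DP form is shown here because it mirrors the recurrence in the note.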
Experiments used self-made setups.