Loewdin orthonormalization for tall-skinny matrices on multi-GPU systems. This is a proxy app for wavefunction-based density functional theory (DFT) solvers.
Massimiliano Lupo Pasini, Bruno Turcksin, Wenjun Ge, Jean-Luc Fattebert, "A parallel strategy for density functional theory computations on accelerated nodes", Parallel Computing, Volume 100, 2020, 102703, https://doi.org/10.1016/j.parco.2020.102703.
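
For readers unfamiliar with the method: given a tall-skinny matrix A with linearly independent columns, Loewdin orthonormalization forms the small Gram matrix S = A^T A and returns Q = A S^(-1/2), so that Q^T Q = I. The following is a minimal single-process sketch in pure Python, not the multi-GPU implementation of this proxy app. It assumes a real-valued A stored as a list of rows, and it approximates S^(-1/2) with a coupled Newton-Schulz iteration, chosen here for illustration because it needs only matrix products. All function names (matmul, transpose, identity, inverse_sqrt, loewdin_orthonormalize) are hypothetical and do not refer to this code base.

```python
import math


def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]


def transpose(X):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def identity(n):
    """n-by-n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def inverse_sqrt(S, max_iter=100, tol=1e-13):
    """Approximate S^(-1/2) for a symmetric positive definite S.

    Assumption: a coupled Newton-Schulz iteration is used for illustration;
    an eigendecomposition of S would work equally well.
    """
    n = len(S)
    # Scale by the Frobenius norm so the eigenvalues of S / c lie in (0, 1],
    # which is sufficient for the iteration to converge.
    c = math.sqrt(sum(v * v for row in S for v in row))
    Y = [[v / c for v in row] for row in S]  # tends to (S / c)^(1/2)
    Z = identity(n)                          # tends to (S / c)^(-1/2)
    eye = identity(n)
    for _ in range(max_iter):
        ZY = matmul(Z, Y)
        # Z * Y approaches the identity as the iteration converges.
        err = max(abs(ZY[i][j] - eye[i][j]) for i in range(n) for j in range(n))
        if err < tol:
            break
        T = [[(3.0 * eye[i][j] - ZY[i][j]) / 2.0 for j in range(n)] for i in range(n)]
        Y = matmul(Y, T)
        Z = matmul(T, Z)
    # Undo the scaling: S^(-1/2) = (S / c)^(-1/2) / sqrt(c).
    scale = math.sqrt(c)
    return [[v / scale for v in row] for row in Z]


def loewdin_orthonormalize(A):
    """Return Q = A * S^(-1/2) with S = A^T A, so that Q^T Q is the identity."""
    S = matmul(transpose(A), A)  # small Gram matrix (columns x columns)
    return matmul(A, inverse_sqrt(S))


# Usage: a 4 x 2 matrix with linearly independent columns.
A = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 2.0]]
Q = loewdin_orthonormalize(A)
QtQ = matmul(transpose(Q), Q)  # expected to be close to the 2 x 2 identity
```

Only the Gram matrix S and the final product touch the tall dimension of A; the iteration itself works on the small square matrix. In a distributed setting, one would presumably split the rows of A across devices and sum the local contributions to S, but that part is not shown here.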