I've had a request (from Yoshinobu Nakamura) for a block CG solver, specifically a block multi-shift CG solver. This is for RHMC simulations where multiple pseudofermions are used (via nroots). The acceleration comes from the block solver reducing the total iteration count. If we implemented a block solver, it could be combined with a multi-rhs dslash for both improved convergence and higher raw GFLOPS.
A block solver for BiCGstab is described here (http://arxiv.org/abs/1104.0737), with CG and the additional orthogonalization required for stability described in the references therein.
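To make the request concrete, here is a minimal sketch of plain block CG (the O'Leary-style recurrence) in pure Python. It is only an illustration under several assumptions: the matrix is real, symmetric positive definite and stored densely as a list of rows; there are no shifts; and the extra orthogonalization needed for stability is left out, so it can break down if the block residual becomes rank deficient. All function names (`block_cg`, `solve_small`, etc.) are made up for this example and are not part of any existing interface.

```python
def matmul(a, b):
    """Dense product of two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add_scaled(x, y, sign=1.0):
    """Return x + sign * y, elementwise."""
    return [[xi + sign * yi for xi, yi in zip(rx, ry)] for rx, ry in zip(x, y)]


def solve_small(m, rhs):
    """Solve m z = rhs for a small square m (Gauss-Jordan, partial pivoting)."""
    n = len(m)
    aug = [list(m[i]) + list(rhs[i]) for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [[aug[i][n + j] / aug[i][i] for j in range(len(rhs[0]))]
            for i in range(n)]


def block_cg(a, b, tol=1e-10, max_iter=100):
    """Solve a x = b for all columns of b at once (hypothetical helper).

    Assumes a is symmetric positive definite. The scalar alpha and beta of
    ordinary CG become small s-by-s matrices, where s is the number of
    right-hand sides. No stabilizing orthogonalization is applied.
    """
    n, s = len(b), len(b[0])
    x = [[0.0] * s for _ in range(n)]
    r = [list(row) for row in b]          # block residual (x0 = 0)
    p = [list(row) for row in b]          # block search direction
    rtr = matmul(transpose(r), r)
    for k in range(max_iter):
        # Stop once the largest column residual norm is below tol.
        if max(rtr[j][j] for j in range(s)) ** 0.5 < tol:
            return x, k
        q = matmul(a, p)                  # one multi-rhs matrix application
        alpha = solve_small(matmul(transpose(p), q), rtr)
        x = add_scaled(x, matmul(p, alpha))
        r = add_scaled(r, matmul(q, alpha), -1.0)
        rtr_new = matmul(transpose(r), r)
        beta = solve_small(rtr, rtr_new)
        p = add_scaled(r, matmul(p, beta))
        rtr = rtr_new
    return x, max_iter


# Toy usage: 6x6 tridiagonal SPD matrix with two right-hand sides.
a = [[4.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
      for j in range(6)] for i in range(6)]
b = [[float(i + 1), 1.0 if i == 0 else 0.0] for i in range(6)]
x, iters = block_cg(a, b)
print("block iterations:", iters)
```

The point of interest is that each iteration applies the matrix to all right-hand sides at once (the `matmul(a, p)` line), which is where a multi-rhs dslash would slot in. The small solves for `alpha` and `beta` are also where the stability problems show up in practice, hence the orthogonalization discussed in the references above.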