Currently, we only implement do_step functions with separate in and out parameters, and we reuse this implementation with in=out for in-place computations (the usual case). For some problems and compilers (gcc), a dedicated inout implementation that relies on the += operation (e.g. x += dx*dt) would give ~30% better performance. Unfortunately, this would require additional methods in the operations that provide such in-place computations.
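To make the difference concrete, here is a minimal sketch of the two variants for a single Euler-like update. The names state_type, euler_in_out and euler_inout are hypothetical and are not part of the library's operations; the sketch only illustrates the shape of the two calls, not the actual implementation or its performance.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: a plain vector of doubles stands in for the state type.
using state_type = std::vector<double>;

// Hypothetical operation in the current style: out = in + dt * dx.
// Passing the same object as in and out gives the in-place case.
void euler_in_out(const state_type& in, const state_type& dx,
                  double dt, state_type& out) {
    assert(in.size() == dx.size() && out.size() == in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = in[i] + dt * dx[i];
}

// Hypothetical dedicated inout operation: x += dt * dx.
// This is the kind of extra method the operations would have to provide.
void euler_inout(state_type& x, const state_type& dx, double dt) {
    assert(x.size() == dx.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        x[i] += dt * dx[i];
}
```

Calling euler_in_out(x, dx, dt, x) and euler_inout(x, dx, dt) should produce the same state; the second form simply states the in-place update directly.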