Closed by ddement 2 months ago
@trilinos/belos
@ddement #11837 is in progress to address GMRES orthogonalization in Belos. I'm wondering if you are running into the case that the PR is meant to address.
Maybe related: #9979?
@ddement Do you know if the 30x slowdown is in the velocity, continuity, or both phases? It's been quite a while since I ran this test case.
@jhux2
GMRES is used for the velocity solve. Continuity uses BiCGStab, which is a little slower (a couple of seconds) on this particular run, but nowhere near the 30x number.
I'll take a look at the other PRs to see if I think they're related - I would not be surprised if they are.
This issue has had no activity for 365 days and is marked for closure. It will be closed after an additional 30 days of inactivity.
If you would like to keep this issue open please add a comment and/or remove the MARKED_FOR_CLOSURE label.
If this issue should be kept open even with no activity beyond the time limits you can add the label DO_NOT_AUTOCLOSE.
If it is ok for this issue to be closed, feel free to go ahead and close it. Please do not add any comments or change any labels or otherwise touch this issue unless your intention is to reset the inactivity counter for an additional year.
This issue was closed due to inactivity for 395 days.
@srajama1
We have been doing a series of experiments in Nalu-Wind regarding the removal of UVM from CUDA runs. When Trilinos is built without UVM, most regression tests run slightly faster than with it. However, one case runs approximately 30x slower when UVM is not used. In particular, this slowdown has been traced primarily to the "BlockGmresSolMgr total solve time" and "ICGS[2]: Ortho (Norm)" timing lines from Belos. Several other regression tests exercise other Belos solvers, and none of them show similar regressions.
Unfortunately, the reproducer for this case is a Nalu-Wind regression test - we do not have a more minimal problem. The regression test in question is the "taylorGreenVortex_p3" test. We can assist with running and debugging as necessary. @jhux2 may also have experience with running this case.
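For readers unfamiliar with the "ICGS[2]: Ortho (Norm)" timer named above: ICGS stands for iterated classical Gram-Schmidt, and the "[2]" is assumed here to mean two orthogonalization passes. The sketch below is a plain-Python illustration of that technique only. It is not Belos code, and the function names `dot` and `icgs_orthonormalize` are hypothetical. It shows that every pass performs one inner product per basis vector, followed by a final norm, which is the kind of reduction work that timer would cover.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def icgs_orthonormalize(basis, v, passes=2):
    """Orthogonalize v against an orthonormal basis, then normalize it.

    Illustrative sketch of iterated classical Gram-Schmidt (ICGS).
    `passes=2` mirrors the assumed meaning of "ICGS[2]".
    Returns (unit vector, accumulated projection coefficients, norm).
    """
    w = list(v)
    coeffs = [0.0] * len(basis)
    for _ in range(passes):
        # Classical Gram-Schmidt: compute all projections from the same w
        # before subtracting any of them.
        h = [dot(q, w) for q in basis]
        for q, hj in zip(basis, h):
            w = [wi - hj * qi for wi, qi in zip(w, q)]
        # Later passes only add small corrections to the coefficients.
        coeffs = [c + hj for c, hj in zip(coeffs, h)]
    norm = math.sqrt(dot(w, w))
    if norm == 0.0:
        raise ValueError("v is linearly dependent on the basis")
    return [wi / norm for wi in w], coeffs, norm


if __name__ == "__main__":
    basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    q, coeffs, norm = icgs_orthonormalize(basis, [1.0, 2.0, 3.0])
    print(q, coeffs, norm)
```

In a GMRES iteration this step runs once per new Krylov vector, so its cost grows with the size of the basis. Whether that is what drives the 30x slowdown reported here is not established in this thread.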