SINTEF-AppliedCompSci / MRST

Official GitHub mirror of MRST
http://www.mrst.no

Questions regarding computation #3

Closed namjmnam closed 1 month ago

namjmnam commented 7 months ago

Hi. I've been using MRST on Octave and it is working well. I've noticed that more than 99% of the computation time is spent in the mldivide function (the "backslash" solver), i.e. the A\b call in ad-core/solvers/BackslashSolverAD.m.

Is there an option to run this with GPU with CUDA support?

So far, I've been looking into linear algebra libraries such as SuperLU and UMFPACK. In this case, A is a sparse matrix and likely singular (even though Octave does not print a warning about this, it appears to simply minimize the residual ||Ax-b||). At first it looked like mldivide is a series of dependent computations, so parallelizing it seemed impossible, but after looking further it seems plausible that it can be done, so GPU support might help. Any ideas on this?
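
For context, this is roughly how I inspected the system (my own sketch; A and b are the matrix and right-hand side passed to mldivide in BackslashSolverAD, and condest is Octave's condition-number estimator):

x = A \ b;                          % direct sparse solve, as in BackslashSolverAD
rel_res = norm(A*x - b) / norm(b);  % relative residual of the computed solution
fprintf('condest(A) = %g, relative residual = %g\n', condest(A), rel_res);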

Thank you

moyner commented 7 months ago

Hi,

I'm glad MRST is working for you. The linear solver is a common bottleneck when trying to run bigger simulations. MRST comes with built-in support for building accelerated linear solvers, and they work on Octave (I think you need version 6.4 or higher). These run in parallel on the CPU and use the AMGCL library. The iterative solver with the CPR preconditioner that is used by default is much faster than the direct solver used by BackslashSolverAD.

The main steps to get this working are as follows. First, run the following (you only need to do this once if it works):

mrstModule add linearsolvers
buildLinearSolvers()
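
If the build reports success, a quick way to double-check that the compiled backend is visible is something like this (just a sketch; the MEX gateway name amgcl_matlab is from memory, so treat it as an assumption):

if exist('amgcl_matlab', 'file') == 3   % 3 = compiled MEX file found on the path
    disp('Compiled AMGCL backend found')
else
    warning('AMGCL MEX file not found - the build may have failed')
end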

Then try to use it by getting a nonlinear solver set up:

nls = getNonLinearSolver(model, 'useMex', true)
[ws, states] = simulateScheduleAD(state0, model, schedule, 'NonLinearSolver', nls)

Hopefully this speeds things up. You can check which linear solver is used by looking at the nls.LinearSolver property.
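
For example (a minimal check):

disp(class(nls.LinearSolver))   % should print something like AMGCL_CPRSolverAD when the compiled solver is in use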

More information can be found in Chapter 6, "Faster Simulation with Optimized Automatic Differentiation and Compiled Linear Solvers", of the second MRST book. You can download the chapter for free at this link.

namjmnam commented 7 months ago

Thank you for the reply. I haven't looked into the second book yet, but I will definitely check it out.

I have some more questions though.

I once attempted to save the variables A and b to a .mat file, load them into a Python script to use its direct solver, and plug the solution back into MRST for comparison. This worked, but it was much slower. The problem was that when I tried an iterative solver instead, it was a lot faster than BackslashSolverAD, but the solution didn't converge and eventually the simulation failed. So I thought a direct solver was the only option.
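
For reference, the round trip looked roughly like this (a simplified sketch; system.mat and x_ext.mat are placeholder names and the Python side is omitted):

save('-v7', 'system.mat', 'A', 'b');        % -v7 so the file can be read from Python
x_ref = A \ b;                              % reference solution from mldivide
% ... solve system.mat externally, write the result back as x_ext.mat ...
load('x_ext.mat');                          % placeholder file containing x_ext
fprintf('relative difference: %g\n', norm(x_ext - x_ref) / norm(x_ref));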

Any thoughts on this?

Also, if this method from the second book uses an iterative solver, I think using a GPU is also a viable option, as there are many GPU implementations of iterative sparse solvers available. Any plans for future updates on this?

P.S. I accidentally closed the issue, so I reopened it.

moyner commented 7 months ago

Hi, can you check which iterative solver you get? It should be something like AMGCL_CPRSolverAD for the best performance. If you get the MATLAB-based CPR variant, that one does not give great performance in most cases.

With respect to GPU support, transferring matrices to the GPU can often be quite expensive relative to the compute time. The AMGCL library we are using for the C++ linear solvers does support GPU backends, but the binaries are not compiled with GPU support by default, since that would require the user to have CUDA installed. If you want to experiment with this, I think it would not be too hard to modify the source files to build with CUDA and use the GPU backend, but we do not currently have any plans to do so by default. Generally, reservoir linear systems perform fairly well on the CPU, and unless the equations are also assembled into the linear system on the GPU, I think you will not see a huge speedup on a GPU if you have a "typical" CPU and GPU in your machine.

namjmnam commented 7 months ago

Thanks for the advice. I've tried it and wanted to share some of my experience. At first, I struggled with compiling because mex.getCompilerConfigurations was not working, so I changed mrstDefaultMexFlags like this:

function tf = has_cpp_compiler()
   % tf = ~isempty(compiler_config());
   tf = true; % I had to force it to report that a compiler is set up
end
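
(In hindsight, a less blunt workaround would be to actually probe for a compiler instead of hard-coding the result; a sketch, assuming g++ is on the PATH:)

function tf = has_cpp_compiler()
   [status, ~] = system('g++ --version');   % probe for g++ on the PATH
   tf = (status == 0);                      % exit status 0 means the compiler ran
end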

It's probably some issue with Octave version 8.4 (which I'm using), but I haven't looked further into the problem, so I proceeded anyway. I used g++ for the compilation and it worked well.

As for the simulation, here is my code:

nls = AMGCL_CPRSolverAD('tolerance', 1e-4, ...
                        'maxIterations', 1000, ...
                        'blockSize', 3, ...
                        'relaxation', 'ilu0', ...
                        'coarsening', 'smoothed_aggregation', ...
                        'withCPR', true); % I simply used arguments that were provided to me at the time
[wellSols, states, report] = simulateScheduleAD(initState, model, schedule, 'LinearSolver', nls);
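
I compared the two solvers roughly like this (a sketch of what I did, not the exact code):

tic;
simulateScheduleAD(initState, model, schedule, 'LinearSolver', BackslashSolverAD());
t_direct = toc;
tic;
simulateScheduleAD(initState, model, schedule, 'LinearSolver', nls);
t_amgcl = toc;
fprintf('backslash: %.1f s, AMGCL-CPR: %.1f s\n', t_direct, t_amgcl);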

It was definitely using AMGCL_CPRSolverAD and not BackslashSolverAD, but in fact it was a lot slower in my case. Maybe it is an issue with the test example; there is more to examine. I will keep looking. Thanks for the help.