Closed ghost closed 6 years ago
Hi Duncan,
It is currently impossible in amgcl to do such an update operation. Whenever the system matrix changes, you have to rebuild the preconditioner. However,
Hi Denis,
Sorry for the extremely late reply. I ended up porting the majority of my code to c++ and then implemented your second suggestion. I took advantage of the non_copy constructor and changed the matrix externally. I took a while to find the threshold for number of iterations but, now it's working well and it's a nice solution.
Thanks for your suggestion, Duncan
Thanks for following up on this! Out of interest, how much faster is your current approach than just rebuilding amg preconditioner on each step?
So I found that it was very rare for the pre-conditioner to ever be reconstructed, in my test case the construction of the AMG was about 5 seconds and solution was about 0.5 seconds. I'll add a counter into my code to see how often it occurs.
I have a feeling the solving performance was only achieved because I also copied the previous solution and used it as an initial guess of the solution. I found reusing the initial guess would speed up the solution by a factor of 2-3X. I am also using a low relative tolerance 1e-4.
in my test case the construction of the AMG was about 5 seconds and solution was about 0.5 seconds
The fact that AMG construction takes 10 times more than solution seems strange to me. Are you using a GPU backend? Or does the solver makes just a few iterations until the required convergence? Another thing to check is if you are enabling compiler optimizations (compile in 'Release' mode). This usually has dramatic effect on performance of C++ code.
The solver takes 3-5 iterations typically, as more BC are added it increases but it was rare that it would ever surpass 10.
I am using the release mode of the MSVC compiler. The target platform is piz Daint (I see from your benchmarks that you are familiar with the system), these tests were done on a laptop so there could be other things which effected. I can update this in a week or so with data from running Daint.
I am using a pure CPU version. I would have thought that this approach would not work on a GPU as you copy everything over?
Same approach should work on a GPU, as reported in [1]. In fact, it should be more beneficial on a GPU, since the preconditioner setup is relatively more expensive (setup is performed on a CPU, while solution is done on a GPU).
[1] Demidov, D.E., and Shevchenko, D.V. "Modification of algebraic multigrid for effective GPGPU-based solution of nonstationary hydrodynamics problems." Journal of Computational Science 3.6 (2012): 460-462., https://doi.org/10.1016/j.jocs.2012.08.008
Thanks for the paper, I will be certain to cite it in my work.
I added some timers to my code and the AMG generation is 8.5 seconds while solution time varies from 0.05-0.3 seconds. I based the implementation on your shared memory benchmark, the differences being the tolerance, the use of SPAI-1 for relaxation and the matrix is not of uniform diffusivity. The problem size is 75**3. Not sure if this makes any more sense to you.
We have a some GPU hours on Daint, but mainly CPU hours. I could try and benchmark it with the GPU, do you have an example CMAKE file? I guess CUDA is well supported there, but I have heard nightmares about using openCL at CSCS.
Ok, I believe the the use of spai-1 explains the slow setup. In my experience, the smoother takes a long time to construct, and is not very effective, so I would try to switch to spai0, damped_jacobi, or ilu0 (in that order).
We have a some GPU hours on Daint, but mainly CPU hours. I could try and benchmark it with the GPU, do you have an example CMAKE file? I guess CUDA is well supported there, but I have heard nightmares about using openCL at CSCS.
If I recall correctly, I also failed to compile anything OpenCL-based on PizDaint, but CUDA backend, or VexCL backend with CUDA as its backend should work well. I don't currently have access to PizDaint, but this was the code (complete with CMakeLists.txt) we used to run the benchmarks there: https://github.com/ddemidov/amgcl_benchmarks
Ok changing from SPAI-1 to SPAI-0 puts the AMG generation at 0.5 seconds. I cannot really tell if the solution time is different as the variance between iterations is quite high. At the moment I would say it is identical.
Hi Denis,
My research involves modelling biological processes, for example blood vessels growing into a tissue. I use the AMGCL library for solving the diffusion of molecules within the tissue domain. The blood vessels act as a dirichlet BC for the Poisson, acting as a source and a sink for different molecules, so I have a process which roughly like:
So at each iteration the BC are changing as the vasculature network grows. I was wondering if you thought it would be feasible for me to "update" the preconditioner each iteration rather than recreating it. I currently do this for the matrix but this doesn't save as much time as if I could also modify the MG.
Cheers, Duncan