mipals opened this issue 1 year ago
Hi Mikkel,

I think this may be available with the current wrapper. It is possible to pass a keyword argument to `hfmm3d` that lets the routine know you are making a vectorized call. So something like:

```julia
vals = hfmm3d(eps, zk, sources; charges = charges, nd = 100)
```
If `sources` is 3 x n, then the routine expects `charges` to be an nd x n array (and throws an error otherwise).
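For reference, here is a minimal end-to-end sketch of such a vectorized call. The `pg = 1` flag (requesting potentials at the source points) and the `vals.pot` field follow my reading of the FMM3D.jl docs; treat them as assumptions here:

```julia
using FMM3D

n   = 1000                          # number of sources
nd  = 100                           # number of densities (right-hand sides)
eps = 1e-6                          # requested FMM tolerance
zk  = 1.0 + 0.0im                   # Helmholtz wavenumber

sources = rand(3, n)                # 3 x n source locations
charges = rand(ComplexF64, nd, n)   # nd x n charge strengths

# pg = 1 requests potentials at the source points (assumed flag)
vals = hfmm3d(eps, zk, sources; charges = charges, nd = nd, pg = 1)
size(vals.pot)                      # expected: (nd, n) for nd > 1
```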
Let me know if that works! The vectorized call is untested.
I am not sure what performance gains to expect; they may be platform-dependent. I would be interested to know how it goes.
Cheers, Travis
Regarding performance gains: we see a factor of 2+ for Helmholtz, and around a factor of 4 for Laplace. However, there is no benefit in cranking up `nd` indefinitely; the effect saturates after roughly 16-32 densities, and the larger you make `nd`, the more memory the code needs. So I would recommend calling it in batches of 16-32 densities, or whatever the memory on your system allows.
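Something like the following batching sketch (the helper name is illustrative, not part of the wrapper; `pg = 1` and `vals.pot` are assumptions as above):

```julia
using FMM3D

# Hypothetical helper (not part of FMM3D.jl): evaluate an nrhs x n set of
# charge vectors in batches of at most `batch` densities per hfmm3d call.
function hfmm3d_batched(eps, zk, sources, charges; batch = 32)
    nrhs, n = size(charges)
    pot = zeros(ComplexF64, nrhs, n)
    for lo in 1:batch:nrhs
        hi = min(lo + batch - 1, nrhs)
        nd = hi - lo + 1
        vals = hfmm3d(eps, zk, sources;
                      charges = charges[lo:hi, :], nd = nd, pg = 1)
        # vals.pot is a length-n vector when nd == 1, an nd x n matrix otherwise
        pot[lo:hi, :] .= reshape(vals.pot, nd, n)
    end
    return pot
end
```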
Regards, Manas
Hi again,
Thanks for the quick answers!
@askhamwhat Ahh, perfect! I tried looking into the code, but I did not realize that calling `hfmm3d` also included the vectorized version (the documentation mentioned something about a `_vec` variant, which I could not find anywhere). From the documentation it looks as if the charges need to be interleaved into one long vector, but given Julia's column-major storage I guess that is the same as an nd x n matrix. Additionally, the dipoles need to be interleaved component-wise and not dipole-wise (something that I missed the first time around).
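To illustrate the layout point, a small standalone demonstration (not wrapper code; the nd x 3 x n dipole shape is my assumption based on the Fortran `dipvec(nd,3,ns)` layout):

```julia
nd, n = 3, 4
charges = rand(ComplexF64, nd, n)   # nd x n: row j holds density j

# Column-major storage interleaves the nd densities per source point,
# which is the "one long vector" layout the Fortran documentation describes.
interleaved = vec(charges)
@assert interleaved[1:nd] == charges[:, 1]   # the nd densities of source 1

# Dipole strengths interleaved component-wise would then correspond to an
# nd x 3 x n array (assumption, from the Fortran dipvec(nd,3,ns) shape).
dipvecs = rand(ComplexF64, nd, 3, n)
```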
After having tried it out, there is one thing that annoys me slightly: if you give only one set of charges, the function returns a vector (n x 1), but with multiple charges it returns an nd x n matrix. I guess this has to do with the underlying Fortran implementation. Thankfully, this is something a simple transpose can fix 😄
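For instance, a tiny hypothetical helper that normalizes the output to n x nd columns regardless of `nd` (assuming the returned struct's `pot` field):

```julia
# Return potentials as an n x nd matrix, whether pot is a length-n
# vector (nd == 1) or an nd x n matrix (nd > 1).
pot_columns(pot::AbstractVector) = reshape(pot, :, 1)
pot_columns(pot::AbstractMatrix) = permutedims(pot)   # nd x n -> n x nd
```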
@mrachh Sounds about right. For a very crude implementation I got a 3-8x speedup (on an M1 Mac) for my specific application. However, not everything here can be attributed directly to the vectorized call, as the application includes additional computations that can be stacked/improved when calling a vectorized `hfmm3d`.
Cheers, Mikkel
Hey there,
I am trying to solve a problem with multiple right-hand sides. In short, this means that I need to evaluate `hfmm3d` for many different sets of charge strengths (or dipole strengths). I see that there is a vectorized version; however, it seems that there is no interface for it in Julia. Before I go and implement my own, I want to know whether there is a specific reason why there is no interface (other than that it takes time to implement), and whether there is any computational speed to be gained from the vectorized version (currently I have around 100 right-hand sides, but this could increase in the future).
Cheers, Mikkel