Issue opened by lonAlpha 7 years ago (status: open)
Hi, sorry for the delay in getting back to you over the holidays.
A calculation of this size should certainly not take 6 minutes. Likely culprits are that (a) the code was not built correctly with multithreading support, or (b) you are linking against a non-multithreaded BLAS library. It should be possible to figure this out by looking at the .log files produced by any SCUFF-EM code. If you'd like to post your .log file here or insert a relevant snippet, I'd be happy to take a look.
Hi, this is my log: scuff-scatter.log.txt. You may notice that it takes more than 2 minutes to assemble the BEM matrix. My CPU is an Intel i3-2350M. I run FEKO on the same laptop.
Hi, sorry for the delay again. From the log file, it's clear the code is using multithreading, so that's not the problem. Your CPU is significantly less powerful, and has less cache, than the machines I usually run on, so it's hard to do a direct comparison. 2.5 minutes to assemble the BEM matrix does seem slower than one would like, but not catastrophic. Note that subsequent matrix assemblies within the same run of the code (for example, calculations at different frequencies) will be faster due to caching of frequency-independent components.
What compiler are you using? I think you will get significantly improved performance with the Intel compilers.
I have a similar concern with scuff-rf. I tested "WireAntenna" (SquareCoil_79) from the example directory. It took 30 s without multithreading and 10 s with it (Intel i7-3770 @ 3.4 GHz). I always thought that solving the BEM system would be the most time-consuming task, but instead it is completely negligible. Assembling the BEM matrix takes nearly all of the time. Is this usual, or is something wrong here?
The profiler says:
    % cumulative      self                  self    total
 time    seconds   seconds        calls  ms/call  ms/call  name
35.42     366.82    366.82   2207189292     0.00     0.00  scuff::CFDIntegrand3D(...)
19.13     564.88    198.06                                 rule75genzmalik_evalError
14.05     710.41    145.54     51460075     0.00     0.00  scuff::TaylorDuffySum_FIPPI(...)
 5.59     768.34     57.92                                 __logl_internal
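To put the profile into numbers, here is a small sketch that sums the shares of the listed functions and works out the self time per call. The rows are copied from the flat profile above; the variable and function names (`PROFILE`, `summarize`) are made up for this illustration. The per-call figure also shows why the `ms/call` columns read 0.00: each call is far below the 0.01 ms resolution of those columns.

```python
# Rows copied from the flat profile in this thread:
# (name, % of total time, self seconds, number of calls or None if not reported)
PROFILE = [
    ("scuff::CFDIntegrand3D", 35.42, 366.82, 2207189292),
    ("rule75genzmalik_evalError", 19.13, 198.06, None),
    ("scuff::TaylorDuffySum_FIPPI", 14.05, 145.54, 51460075),
    ("__logl_internal", 5.59, 57.92, None),
]


def summarize(rows):
    """Return (summed % share, {name: self time per call in nanoseconds})."""
    total_pct = sum(pct for _, pct, _, _ in rows)
    per_call_ns = {
        name: self_s / calls * 1e9
        for name, _, self_s, calls in rows
        if calls  # skip rows where the profiler reported no call count
    }
    return total_pct, per_call_ns


total_pct, per_call_ns = summarize(PROFILE)
print(f"listed functions account for {total_pct:.2f}% of the run time")
for name, ns in per_call_ns.items():
    print(f"{name}: about {ns:.0f} ns of self time per call")
```

So the four listed entries cover roughly three quarters of the run time, and the most expensive one is a very cheap function that is simply called more than two billion times.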
I would indeed like to help speed up scuff-rf; is there something I can do? Is the above-mentioned "caching of frequency-independent components" also implemented in scuff-rf?
Thanks a lot for Scuff-EM, it's great indeed!
It is totally common for the BEM matrix assembly to dominate over the cost of solving the system. That is almost always the case, unless you are using a very slow BLAS/LAPACK installation.
I don't think 10 seconds is a particularly long time for this calculation! Note that you only need to form and factorize the BEM matrix once for a given geometry at a given frequency, and can then reuse that result to solve scattering problems involving any number of incident fields.
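The "factorize once, reuse for many incident fields" idea can be illustrated with a toy example. This is only a sketch in pure Python, not SCUFF-EM's actual code path (a real solver would call an optimized LAPACK routine); the 3x3 complex matrix `M` and the two right-hand sides are invented stand-ins for a BEM matrix and two incident-field vectors, and the helper names are hypothetical.

```python
def lu_factor(a):
    """LU-factorize a square matrix with partial pivoting.

    Returns (lu, piv): lu holds L (unit diagonal, below the diagonal) and U
    (on and above the diagonal); piv[i] is the original row now at position i.
    """
    n = len(a)
    lu = [list(row) for row in a]
    piv = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, piv


def lu_solve(lu, piv, b):
    """Solve A x = b using a factorization returned by lu_factor."""
    n = len(lu)
    x = [b[piv[i]] for i in range(n)]  # apply the row permutation to b
    for i in range(n):                 # forward substitution with L
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):       # back substitution with U
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


# Toy stand-in for a BEM matrix at one frequency (values are arbitrary).
M = [[4 + 1j, 1, 0], [1, 3, 1j], [0, 1j, 2]]
lu, piv = lu_factor(M)  # the expensive O(n^3) step, done once

# Two hypothetical incident-field vectors reuse the same factorization;
# each solve costs only O(n^2).
rhs_list = [[1, 0, 0], [0, 1j, 1]]
solutions = [lu_solve(lu, piv, b) for b in rhs_list]
```

Each additional right-hand side only needs the two triangular substitutions, which is why extra incident fields at the same frequency are cheap compared with forming and factorizing the matrix.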
Thanks for the fast answer! Good to know that I've done everything right. :-)
Hi, I find that the runtime is much longer than that of the commercial MoM software FEKO. Is there an easy way to improve this? Thanks. I use a sphere with 848 panels as a benchmark (attachment: feko_sphere.zip). On my laptop, FEKO takes less than 30 seconds to finish (I've deselected the Symmetry setting). The runtime of scuff-scatter is 6m5.516s.
Runtime of scuff-scatter:
real 6m5.516s
user 10m44.400s
sys 0m3.836s
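A quick way to read these `time` figures: dividing CPU time (user + sys) by wall-clock time (real) gives a rough estimate of how many cores were busy on average. The sketch below does that arithmetic for the numbers above; the helper name `to_seconds` is made up for this example.

```python
import re


def to_seconds(stamp):
    """Convert a shell `time` stamp such as '6m5.516s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    return 60 * int(match.group(1)) + float(match.group(2))


# Figures quoted in this post.
real = to_seconds("6m5.516s")
user = to_seconds("10m44.400s")
sys_time = to_seconds("0m3.836s")

# CPU time divided by wall-clock time: average number of busy cores.
avg_busy_cores = (user + sys_time) / real
print(f"average busy cores: {avg_busy_cores:.2f}")
```

A ratio noticeably above 1 (here a little under 1.8) suggests that more than one thread was doing work, so a result like this points away from a purely single-threaded build, although it does not say how efficiently the threads were used.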
Problem size (interior vertices - interior edges + panels = Euler characteristic): 426 - 1272 + 848 = 2
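As a sanity check on the mesh counts, the sketch below recomputes the Euler characteristic and, assuming the sphere is a closed surface made only of triangular panels, derives the edge count from the panel count (each triangle has three edges and each edge is shared by two triangles). The function names are invented for this illustration.

```python
def euler_characteristic(vertices, edges, panels):
    """V - E + F for a surface mesh."""
    return vertices - edges + panels


def edges_of_closed_triangle_mesh(panels):
    """Edge count of a closed, all-triangle mesh: 3 edges per panel, each shared by 2."""
    if (3 * panels) % 2:
        raise ValueError("a closed triangle mesh must have an even number of panels")
    return 3 * panels // 2


# Counts reported for the 848-panel sphere in this post.
chi = euler_characteristic(vertices=426, edges=1272, panels=848)
expected_edges = edges_of_closed_triangle_mesh(848)
print(chi, expected_edges)
```

Both checks agree with the reported figures: the characteristic is 2, as expected for a sphere-like closed surface, and 848 triangles imply 1272 edges.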