awslabs / palace

3D finite element solver for computational electromagnetics
https://awslabs.github.io/palace/dev
Apache License 2.0
224 stars 50 forks source link

Eigenmode Simulation JSON Config Issue #232

Closed acwhelan closed 2 months ago

acwhelan commented 2 months ago

Hi,

I am working with a colleague on eigenmode simulations using both HPC and local windows PC. However, we have encountered an issue with our results. We are both using the same mesh (q4_v2_1.msh and q4_v2.msh are the same mesh, she just named hers differently) and the same JSON configurations. The only difference is that my colleague is using a later version of the Palace software, which includes a "Postprocessing": {"Dielectric": {}} section under "Domains" in her JSON file. On the other hand, I believe that in the newer Palace build on HPC, the only available keywords are either "Postprocessing": {"Energy"/"Probe": {}} under "Domains" or "Postprocessing": {"Dielectric": {}} under "Boundaries".

I have tried both including and omitting these Postprocessing configurations in the newer Palace build, but I have not been able to achieve results similar to those of my colleague.

When I omitted all postprocessing configurations, the simulation timed out, whereas the local windows PC version finished in about 6 hours.

Do you have any suggestions on how I can replicate her results?

Thank you so much,

q4_v2HPC.json q4_v2local.json

Anne

### Tasks
sebastiangrimberg commented 2 months ago

Hi @acwhelan, thanks for opening this issue.

First, performance regressions between versions are unexpected and typically indicate something going wrong. A 6 hour simulation is very slow in the first place (how big is your model?) but if using a later version of Palace things go from 6 hours to not converging or timing out, that is a bad sign.

Second, you are correct that this PR changed the "Dielectric" keyword under "Domains"."Postprocessing" to "Energy". No other change was made and so this shouldn't affect the simulation results as long as you adjust the keyword in your configuration file.

To debug the differences, I have a few questions to start:

acwhelan commented 2 months ago

Hi Sebastian,

Thank you for the prompt response and clarification about the JSON keyword changes, the performance regression was actually due to an external InfiniBand connectivity issue! Everything seems to be working fine now!