INP-PM / FEDM

Finite Element Discharge Modelling code
https://inp-pm.github.io/FEDM/
GNU Lesser General Public License v3.0

Bus failure when running with more than 12 cores #18

Closed by KieranSQ 2 months ago

KieranSQ commented 3 months ago

I am trying to run the example files on more than 12 cores. Every time I do, I receive the following error:

    ===================================================================================
    = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
    = PID 565 RUNNING AT 15e763a7782c
    = EXIT CODE: 7
    = CLEANING UP REMAINING PROCESSES
    = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

    YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Bus error (signal 7)
    This typically refers to a problem with your application.
    Please see the FAQ page for debugging suggestions

I spoke to our comp sci technician and he has assured me that it is not on our side!

Do you have any suggestions/tips?

AleksandarJ1984 commented 3 months ago

Dear Kieran,

I did not encounter this problem when running the examples on 16 or 32 cores. However, the same issue was reported here, and it seems to be caused by the combination of MPICH and Docker. Could you try running the Docker container with the following option:

--shm-size=512m

as suggested here. Please let me know whether this resolves the problem.

Best regards, Aleksandar
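To check that the option took effect, one can inspect the size of `/dev/shm` from inside the container. By default Docker gives a container a 64 MiB `/dev/shm`, and MPICH uses shared memory for communication between processes on the same node, so a plausible reading of the bus error is that this space runs out as the process count grows. The option is passed on the `docker run` command line, e.g. `docker run --shm-size=512m ...` with the remaining arguments unchanged. The snippet below is a minimal sketch, not part of FEDM: the helper names `shm_size_mib` and `shm_is_sufficient` are hypothetical, and the 512 MiB threshold simply mirrors the value suggested above.

```python
import os
import shutil

MIB = 1024 * 1024


def shm_size_mib(path="/dev/shm"):
    """Return the total size, in MiB, of the filesystem that holds `path`."""
    return shutil.disk_usage(path).total / MIB


def shm_is_sufficient(required_mib=512, path="/dev/shm"):
    """Return True if the filesystem at `path` is at least `required_mib` MiB in total.

    The 512 MiB default is an assumption taken from the suggestion in this thread.
    """
    return shm_size_mib(path) >= required_mib


if __name__ == "__main__":
    if os.path.isdir("/dev/shm"):
        status = "OK" if shm_is_sufficient() else "too small"
        print(f"/dev/shm: {shm_size_mib():.0f} MiB ({status})")
    else:
        print("/dev/shm not found on this system")
```

Run inside the container before launching `mpirun`; if it reports a size of about 64 MiB, the `--shm-size` option was probably not applied.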

KieranSQ commented 2 months ago

Thank you! This seemed to work - we didn't spot this on the FEniCS page!