anstmichaels / emopt

A suite of tools for optimizing the shape and topology of electromagnetic structures.
BSD 3-Clause "New" or "Revised" License

MMI_splitter_3D #13

Open MRdaizr opened 4 years ago

MRdaizr commented 4 years ago

I have encountered the following errors and don't know how to solve them. I would like to ask the author to take some time out of his busy schedule to help. Best wishes.

```
(base) m3enjoy@m3enjoy-virtual-machine:~/emopt/examples/MMI_splitter_3D$ python mmi_1x2_splitter_3D_fdtd.py
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[0]PETSC ERROR: to get more information on the crash.

MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 59.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
```
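As the PETSc error text itself suggests, one way to get a usable backtrace is to enable PETSc's debugger hooks. A sketch of how that might look for this example (the script name is from the report above; passing options via the `PETSC_OPTIONS` environment variable avoids depending on how the script forwards `sys.argv` to PETSc):

```shell
# Attach a debugger automatically when PETSc catches the SEGV,
# so the crash site inside the native library is visible.
PETSC_OPTIONS="-on_error_attach_debugger" python mmi_1x2_splitter_3D_fdtd.py

# Alternatively, start under the debugger from the beginning:
PETSC_OPTIONS="-start_in_debugger" python mmi_1x2_splitter_3D_fdtd.py
```

Both options require a debugger such as `gdb` to be installed, and an X display (or `-on_error_attach_debugger noxterm`-style variants) for the spawned debugger window.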

CharlesDove commented 3 years ago

I'm also encountering this problem. It does not appear to be a memory-capacity issue, since I tried it on a system with a few hundred GB of RAM and still had no luck. It affects any example that uses the fdtd module. @anstmichaels Any thoughts on what might be causing this? Thanks!

Charles

EDIT: To update this, I ran Valgrind on it, and it turned up this:

```
==4146956== Invalid read of size 4
==4146956==    at 0x57C3D711: fdtd::FDTD::build_pml() (in /home/charles/.local/lib/python3.8/site-packages/emopt-2020.9.21-py3.8.egg/emopt/FDTD.so)
==4146956==    by 0x57FEF9DC: ffi_call_unix64 (in /home/charles/miniconda3/lib/libffi.so.7.1.0)
==4146956==    by 0x57FEF066: ffi_call_int (in /home/charles/miniconda3/lib/libffi.so.7.1.0)
==4146956==    by 0x57FD7979: _call_function_pointer (callproc.c:871)
==4146956==    by 0x57FD7979: _ctypes_callproc.cold.48 (callproc.c:1199)
==4146956==    by 0x57FD80DA: PyCFuncPtr_call.cold.49 (_ctypes.c:4201)
==4146956==    by 0x24550E: _PyObject_MakeTpCall (call.c:159)
==4146956==    by 0x2CDD08: _PyObject_Vectorcall (abstract.h:125)
==4146956==    by 0x2CDD08: call_function (ceval.c:4963)
==4146956==    by 0x2CDD08: _PyEval_EvalFrameDefault (ceval.c:3469)
==4146956==    by 0x292A28: _PyEval_EvalCodeWithName (ceval.c:4298)
==4146956==    by 0x293642: _PyFunction_Vectorcall (call.c:435)
==4146956==    by 0x2941CA: _PyObject_FastCallDict (call.c:104)
==4146956==    by 0x2944AD: _PyObject_Call_Prepend (call.c:887)
==4146956==    by 0x2945C9: slot_tp_init (typeobject.c:6755)
==4146956== Address 0xffffffff8bd5d6e4 is not stack'd, malloc'd or (recently) free'd
```

anstmichaels commented 3 years ago

Oddly, I have never run into this issue, and I have run the FDTD solver pretty extensively on CentOS 7, Ubuntu 18.04, and Ubuntu 20.04. If anyone else encounters this issue, please pull master, which includes @CharlesDove's fixes, and give it a try.
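For anyone following along, updating to master and reinstalling might look roughly like this (the clone path and install command are assumptions; adjust them to match how you originally installed emopt):

```shell
# Fetch the latest master of the emopt repository.
cd ~/emopt
git checkout master
git pull origin master

# Reinstall the package for the current user.
pip install --user .
```

After reinstalling, rerun the failing example to confirm whether the segfault in the fdtd module is resolved.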