ACEsuit / ACEinterfaces

test_ase_calc.py seg faults with new ACE1 #11

Open bernstei opened 2 years ago

bernstei commented 2 years ago

With the new ACE1 configured for Julia, running the ASE interface test (with JULIA_PROJECT set correctly) crashes with the following stack trace. The crash occurs inside the ace_c.c function energy(), specifically at the call jlE = jl_call2(_energyfcn, calc, at);

[ADDED LATER] Note that this stack trace is probably irrelevant. The problem is the reference potential file being incompatible with ACE1.

signal (11): Segmentation fault
in expression starting at none:0
typekeyvalue_hash at /buildworker/worker/package_linux64/build/src/jltypes.c:1152 [inlined]
lookup_typevalue at /buildworker/worker/package_linux64/build/src/jltypes.c:722
jl_inst_arg_tuple_type at /buildworker/worker/package_linux64/build/src/jltypes.c:1589
arg_type_tuple at /buildworker/worker/package_linux64/build/src/gf.c:1845 [inlined]
jl_lookup_generic_ at /buildworker/worker/package_linux64/build/src/gf.c:2373 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2425
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
jl_call2 at /buildworker/worker/package_linux64/build/src/jlapi.c:256
energy at /home/cluster2/bernstei/src/work/ACE/ACEinterfaces/ase/ace/ace_c.c:92
ffi_call_unix64 at /home/Software/python/anaconda3/lib/python3.8/lib-dynload/../../libffi.so.8 (unknown line)
ffi_call_int at /home/Software/python/anaconda3/lib/python3.8/lib-dynload/../../libffi.so.8 (unknown line)
_call_function_pointer at /usr/local/src/conda/python-3.8.12/Modules/_ctypes/callproc.c:921 [inlined]
_ctypes_callproc at /usr/local/src/conda/python-3.8.12/Modules/_ctypes/callproc.c:1264
PyCFuncPtr_call at /usr/local/src/conda/python-3.8.12/Modules/_ctypes/_ctypes.c:4201
_PyObject_MakeTpCall at python (unknown line)
_PyEval_EvalFrameDefault at python (unknown line)
_PyFunction_Vectorcall at python (unknown line)
_PyEval_EvalFrameDefault at python (unknown line)
_PyFunction_Vectorcall at python (unknown line)
_PyEval_EvalFrameDefault at python (unknown line)
_PyEval_EvalCodeWithName at python (unknown line)
_PyFunction_Vectorcall at python (unknown line)
_PyEval_EvalFrameDefault at python (unknown line)
_PyEval_EvalCodeWithName at python (unknown line)
_PyFunction_Vectorcall at python (unknown line)
_PyEval_EvalFrameDefault at python (unknown line)
_PyEval_EvalCodeWithName at python (unknown line)
_PyFunction_Vectorcall at python (unknown line)
_PyEval_EvalFrameDefault at python (unknown line)
_PyEval_EvalCodeWithName at python (unknown line)
PyEval_EvalCodeEx at python (unknown line)
PyEval_EvalCode at python (unknown line)
unknown function (ip: 0x555b45aa32e2)
unknown function (ip: 0x555b45abf542)
unknown function (ip: 0x555b45ac4561)
PyRun_SimpleFileExFlags at python (unknown line)
Py_RunMain at python (unknown line)
Py_BytesMain at python (unknown line)
__libc_start_main at /lib64/libc.so.6 (unknown line)
unknown function (ip: 0x555b45a37d68)
Allocations: 21638703 (Pool: 21634084; Big: 4619); GC: 16
Segmentation fault (core dumped)

bernstei commented 2 years ago

[ADDED LATER] Deleted a comment which was a red herring. Adding IPFitting to the using line let it get further (and in pure Julia would have been enough), but led to some sort of low-level Python conflict, perhaps because IPFitting uses PyCall, so the nesting of processes was python -> C -> julia -> python.

bernstei commented 2 years ago

@cortner says it's just an incompatibility, and the potential in the test script's assets is actually an ACE, not an ACE1. We should create a proper ACE1 potential, and use it in the test script when checking ACE1 functionality.

davkovacs commented 2 years ago

The potential in the test was fitted using ACE v0.8.x, which is what ACE1 is, as far as I understand.

So what is the problem currently? Sorry, I am a little confused.

bernstei commented 2 years ago

Ask @cortner how the incompatibility came to be, but with "using JuLIP, ACE1" in the C code, it fails with the error: ERROR: LoadError: JuLIP.FIO.read_dict no implemented for symbol ACE_PolyPairPot. Can you confirm that it works for you when Julia is installed following the current directions (i.e. JuLIP, ACE1, PyCall, ASE, IPFitting@0.4.3)?
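
One way to catch this kind of mismatch before the C layer hands the file to Julia would be to inspect the potential file from Python. The sketch below is only illustrative. It assumes the potential is stored as JSON, that each serialized object carries an "__id__" tag (the symbol named in the read_dict error above), and that tags starting with "ACE_" are the ones ACE1 cannot read. None of that is confirmed in this thread, and the helper names and the example structure are hypothetical.

```python
import json


def collect_type_tags(node, key="__id__"):
    """Recursively gather every string stored under `key` in nested dicts/lists."""
    tags = []
    if isinstance(node, dict):
        if isinstance(node.get(key), str):
            tags.append(node[key])
        for value in node.values():
            tags.extend(collect_type_tags(value, key))
    elif isinstance(node, list):
        for item in node:
            tags.extend(collect_type_tags(item, key))
    return tags


def unsupported_tags(potential, legacy_prefixes=("ACE_",)):
    """Return the sorted unique tags that start with a prefix assumed unreadable by ACE1.

    `legacy_prefixes` is an assumption for illustration, not taken from ACE1 itself.
    """
    tags = collect_type_tags(potential)
    return sorted({t for t in tags if t.startswith(legacy_prefixes)})


# Small in-memory stand-in for a potential file; the layout is illustrative only.
example = json.loads("""
{"IP": {"__id__": "JuLIP_SumIP",
        "components": [{"__id__": "ACE_PolyPairPot"},
                       {"__id__": "JuLIP_OneBody"}]}}
""")

bad = unsupported_tags(example)
if bad:
    print("potential file uses tags ACE1 may not read:", bad)
```

For a real file, the dictionary would come from json.load on the potential path instead of the inline string. A check like this could run in the test script before the calculator is constructed, so an incompatible file produces a readable message rather than a failure deep inside the Julia call.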

bernstei commented 2 years ago

@cortner @davkovacs do you want me to edit out the misleading and irrelevant stack traces in the earlier messages?

davkovacs commented 2 years ago

I am now trying to create a new ACE1 test potential that we can use. I am not sure how useful those stack traces are. I will probably rewrite the whole interface a little to give better error messages. For the OpenMM interface I have figured out roughly how to do it.

bernstei commented 2 years ago

I think they are not useful (at least for this issue), because they have more to do with how things behave if you try to import ACE1 when you only have ACE installed, or try to "fix" the ACE dict loading problem by adding "using IPFitting", which are both the wrong thing to do. In general we could use a lot more error checking in all of ACE.

[ADDED LATER] deleted irrelevant stack trace, but left first one in with more explanation as to what's actually going wrong.
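
As one example of the up-front error checking mentioned above, the Python side could verify that the Julia project lists the packages the C code expects before loading anything. This is a minimal sketch, assuming the project directory contains a Project.toml with a [deps] table. The function names are hypothetical, the required package list is taken from the "using JuLIP, ACE1" line discussed in this thread, and the UUIDs in the demo file are placeholders.

```python
import os
import tempfile


def declared_deps(project_dir):
    """Return the package names listed under [deps] in project_dir/Project.toml."""
    deps, in_deps = set(), False
    with open(os.path.join(project_dir, "Project.toml")) as fh:
        for raw in fh:
            line = raw.strip()
            if line.startswith("["):
                # A new table header starts; only [deps] is of interest here.
                in_deps = (line == "[deps]")
            elif in_deps and "=" in line and not line.startswith("#"):
                deps.add(line.split("=", 1)[0].strip())
    return deps


def missing_packages(project_dir, required=("JuLIP", "ACE1")):
    """Return the required package names that [deps] does not declare."""
    have = declared_deps(project_dir)
    return [name for name in required if name not in have]


# Demo with a throwaway project that has ACE but not ACE1 (UUIDs are placeholders).
with tempfile.TemporaryDirectory() as project_dir:
    with open(os.path.join(project_dir, "Project.toml"), "w") as fh:
        fh.write('[deps]\n')
        fh.write('ACE = "00000000-0000-0000-0000-000000000001"\n')
        fh.write('JuLIP = "00000000-0000-0000-0000-000000000002"\n')
    missing = missing_packages(project_dir)

if missing:
    print("Julia project does not declare:", ", ".join(missing))
```

In the test script the directory would come from the JULIA_PROJECT environment variable rather than a temporary directory. The check is only a heuristic: a package can also be reachable through Julia's stacked environments, so a missing [deps] entry does not prove that the using line will fail.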