alexrobomind / fusionsc

FusionSC
MIT License

abort in fieldline tracing example #4

Closed: dschwoerer closed this issue 3 months ago

dschwoerer commented 3 months ago

I tried the field line tracing example, but did not get far:

import fusionsc as fsc
import numpy as np
from fusionsc.devices import jtext

efitExample = jtext.exampleGeqdsk()
field = fsc.magnetics.MagneticConfig.fromEFit(efitExample).compute(jtext.defaultGrid())

startPoint = [1.2, 0, 0]
fieldLine, b = fsc.flt.followFieldlines(
    startPoint, field, recordEvery=10, stepSize=0.01, turnLimit=400
)

and got the following backtrace due to a malloc error:

(gdb) bt
#0  0x0000155554dd4184 in __pthread_kill_implementation () from /lib64/libc.so.6
#1  0x0000155554d7c69e in raise () from /lib64/libc.so.6
#2  0x0000155554d64942 in abort () from /lib64/libc.so.6
#3  0x0000155554d657a7 in __libc_message_impl.cold () from /lib64/libc.so.6
#4  0x0000155554dde1b5 in malloc_printerr () from /lib64/libc.so.6
#5  0x0000155554de15dc in _int_malloc () from /lib64/libc.so.6
#6  0x0000155554de37de in calloc () from /lib64/libc.so.6
#7  0x000015555383d83d in capnp::MallocMessageBuilder::allocateSegment(unsigned int) () from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#8  0x000015555388c19d in capnp::_::BuilderArena::allocate(unsigned int) () from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#9  0x000015555381aac2 in capnp::_::PointerBuilder::getStruct(capnp::_::StructSize, capnp::word const*) () from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#10 0x000015555368907e in capnp::_::PointerHelpers<fsc::Float64Tensor, (capnp::Kind)3>::get(capnp::_::PointerBuilder, capnp::word const*) () from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#11 0x000015555367f348 in ?? () from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#12 0x0000155553682c1b in ?? () from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#13 0x0000155553936707 in kj::_::TransformPromiseNodeBase::get(kj::_::ExceptionOrValue&) () from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#14 0x000015555393bb73 in kj::_::ForkHubBase::fire() () from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#15 0x000015555393542e in kj::EventLoop::turn() () from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#16 0x000015555393f309 in kj::_::waitImpl(kj::Own<kj::_::PromiseNode, kj::_::PromiseDisposer>&&, kj::_::ExceptionOrValue&, kj::WaitScope&, kj::SourceLocation) ()
   from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#17 0x000015555352c10d in ?? () from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#18 0x00001555539ba64b in kj::Function<void (kj::Function<void ()>)>::Impl<kj::ExceptionCallback::RootExceptionCallback::getThreadInitializer()::{lambda(kj::Function<void ()>)#1}>::operator()(kj::Function<void ()>) ()
   from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#19 0x00001555539cebc6 in kj::Thread::runThread(void*) () from /usr/local/lib64/python3.12/site-packages/fusionsc/native.cpython-312-x86_64-linux-gnu.so
#20 0x0000155554dd21f7 in start_thread () from /lib64/libc.so.6
#21 0x0000155554e5442c in clone3 () from /lib64/libc.so.6

Unfortunately, the fusionsc library was built without debugging symbols, so some frames are unresolved.

alexrobomind commented 3 months ago

Hi David. That's indeed not very far :(. It looks like it's trying to allocate an excessive amount of memory. Are you on the IPP Linux VMs? Is this version 2.3.0?

alexrobomind commented 3 months ago

Aaaand it turns out to be something completely different. When I recently changed the recording logic for field lines to always include the last point, I apparently made a mistake and introduced a buffer overflow (one that requires a very specific case to trigger). On Windows the overflow had no visible effect (which is why the example didn't crash there), but on Linux it corrupted the allocator's freelist link.

I think I have a fix, but I need to test it tomorrow before pushing a bugfix release. I will keep you updated.
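For readers wondering how "always include the last point" can overflow a buffer, here is a minimal Python sketch of that class of off-by-one. It is not the fusionsc code: the function names and the recording rule (store every `record_every`-th step, plus the final step) are assumptions made for illustration only. The point is that the extra final point needs its own slot unless the last step was already a regular recording step.

```python
def recorded_count(n_steps: int, record_every: int) -> int:
    """Slots needed when every `record_every`-th step is stored and the
    final step is always stored as well (hypothetical recording rule)."""
    regular = n_steps // record_every
    # The last point needs one extra slot, unless it coincides with a
    # regular recording step and is therefore already counted.
    extra = 0 if n_steps % record_every == 0 else 1
    return regular + extra


def record(points, record_every):
    """Copy the recorded subset of `points` into a preallocated buffer."""
    n = len(points)
    buf = [None] * recorded_count(n, record_every)
    k = 0
    for i, p in enumerate(points, start=1):
        if i % record_every == 0 or i == n:
            # With a buffer sized only n // record_every, this write would
            # land one past the end whenever n % record_every != 0.
            # Python raises IndexError; native code writes out of bounds.
            buf[k] = p
            k += 1
    return buf
```

With 25 steps and `record_every=10`, three points are stored (steps 10, 20 and 25), while a buffer sized `25 // 10` holds only two. In native code such a write can silently clobber allocator metadata, which would explain why the abort shows up later inside an unrelated `calloc` call.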

alexrobomind commented 3 months ago

Hi David,

the 2.3.2 release should contain the fix. Could you update (pip install --upgrade fusionsc) and try again?
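To confirm that the upgrade took effect in the environment you actually run, something like the following sketch could help. It uses only the standard library's `importlib.metadata`; the helper names are made up for this example, and the distribution name `fusionsc` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile


def release_tuple(text):
    """Turn '2.3.2' (or '2.3.2.post1', '2.4.0rc1') into a (major, minor, patch) tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_fix(installed, fixed="2.3.2"):
    """True if `installed` is at least the release that contains the fix."""
    return release_tuple(installed) >= release_tuple(fixed)


def installed_fusionsc_version():
    """Return the installed fusionsc version string, or None if not installed."""
    try:
        return version("fusionsc")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_fusionsc_version()
    if found is None:
        print("fusionsc is not installed in this environment")
    else:
        print(found, "contains the fix" if has_fix(found) else "predates the fix")
```

The comparison deliberately ignores pre-release and post-release suffixes; for stricter handling, the third-party `packaging` library offers a full version parser.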

dschwoerer commented 3 months ago

Hi Alex, I did update and that resolved the issue. Thanks for the fast fix :tada: