Open jthet opened 9 months ago
Thanks for reporting this issue, which I wasn't aware of previously.
The progress messages you mentioned are from the following Cython function: https://github.com/uafgeotools/mtuq/blob/master/mtuq/misfit/waveform/c_ext_L2.c
In this function, it appears that most or all of the memory allocation/deallocation occurs through the NumPy API.
To start, it is probably worth double checking the NumPy API is being used correctly.
Also, it may be worth double checking the module initialization by comparing it against the Cython docs.
I am hoping that a software developer at my workplace might be able to start looking at the issue in October, but anyone is welcome to try troubleshooting.
In the meantime, if you create the misfit function using WaveformMisfit(optimization_level=1, ...), then mtuq falls back to a slower pure Python implementation in which the Cython extensions are not called.
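The workaround above could be wrapped roughly as follows. This is only a sketch: the import path mtuq.misfit is an assumption, and the helper names with_python_fallback and make_fallback_misfit are hypothetical, not part of mtuq. Only WaveformMisfit and optimization_level=1 come from the comment above.

```python
def with_python_fallback(misfit_kwargs):
    """Return a copy of the keyword arguments with optimization_level forced to 1.

    Hypothetical helper, not part of mtuq.
    """
    kwargs = dict(misfit_kwargs)  # copy so the caller's dict is left untouched
    kwargs["optimization_level"] = 1  # value suggested in the comment above
    return kwargs


def make_fallback_misfit(misfit_kwargs):
    """Build a WaveformMisfit that uses the slower pure Python code path.

    Hypothetical helper, not part of mtuq.
    """
    # Assumption: WaveformMisfit is importable from mtuq.misfit.
    # The import is deferred so this module loads even where mtuq is absent.
    from mtuq.misfit import WaveformMisfit

    return WaveformMisfit(**with_python_fallback(misfit_kwargs))
```

Pass whatever keyword arguments you already give to WaveformMisfit; only optimization_level is overridden.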
As expected for such a generic error message, free(): invalid pointer brings up a very large number of Stack Overflow and other search results.
Interestingly though, many of the top results appear to be Cython related, including a still apparently unresolved PyTorch issue, for example.
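For anyone who wants to try troubleshooting, one low-cost first step might be Python's standard faulthandler module. glibc reports heap errors such as free(): invalid pointer by aborting the process, and faulthandler can print the Python traceback when that abort signal arrives. The sketch below is not part of mtuq, and enable_crash_tracebacks is a hypothetical helper name.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Ask Python to dump tracebacks for all threads on a fatal signal.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS
    and SIGILL. The stream must expose a real file descriptor.
    """
    if stream is None:
        stream = sys.stderr
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    # Call this before importing or running the code that crashes.
    enable_crash_tracebacks()
```

The same effect is available without code changes by running the script with python -X faulthandler or by setting the environment variable PYTHONFAULTHANDLER=1. Note that the traceback shows where the corruption was detected, which is not necessarily where it was introduced.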
I've been getting errors when running the MTUQ container on TACC's Frontera through Apptainer. The errors have been nondeterministic, but they have always occurred after the third "about 75 percent finished" message. See below for the stdout, but I have also gotten errors like malloc(): invalid size (unsorted), double free or corruption (out), and corrupted size vs. prev_size in fastbins.
The SIF image was freshly pulled and is the newest version.