kfrlib / kfr

Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)
https://www.kfrlib.com
GNU General Public License v2.0

undefined reference to `void kfr::sse2::dft_initialize #205

Closed: xkzl closed this issue 6 months ago

xkzl commented 6 months ago

Hello @dancazarin,

I am trying to set up a Docker environment (Ubuntu 22.04) that includes KFR 5.2. I managed to compile and install it. I have also written a library, OMU, that wraps some KFR methods.

When I try to link against my custom library from CMake, I get:

[ 92%] Built target GetEntries
/usr/bin/ld: ../lib/OMU/libOMU.so: undefined reference to `void kfr::sse2::dft_initialize<double>(kfr::dft_plan<double>&)'
/usr/bin/ld: ../lib/OMU/libOMU.so: undefined reference to `void kfr::sse2::dft_real_initialize<double>(kfr::dft_plan_real<double>&)'

Would you have any suggestion on how to fix this? It does not happen on macOS, only on Ubuntu 22.04 (using GCC 11 and C++17).

xkzl commented 6 months ago

It seems the problem is fixed when I enable the C API and use Clang instead of GCC. Do you have any idea why this does not happen on macOS? (I don't enable the C API on macOS, but I am using Clang by default.)

xkzl commented 6 months ago

Hi @dancazarin, I am not sure this is related, but I ran into some memory issues while preparing my Docker environment (installing KFR with the C API option enabled).

I could trace them back to KFR, but no further. I wonder whether the Docker container is interfering with the architecture detection. That might also explain the initial `undefined reference to void kfr::sse2::dft_initialize` issue, which did not happen on my ARM macOS machine.

Could you give me your opinion?
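One way to narrow down the architecture question is to check which instruction sets the compiler itself targets inside the container. The sketch below uses only the compiler's predefined macros. The helper name `compiler_simd_targets` is hypothetical and not part of KFR, and it reports what the compiler assumes at build time, which is not necessarily what KFR selects internally.

```cpp
#include <string>

// Hypothetical helper (not part of KFR): lists the architecture and SIMD
// macros that the compiler predefines for the current target and flags.
std::string compiler_simd_targets() {
    std::string s;
#if defined(__x86_64__) || defined(_M_X64)
    s += "x86-64 ";
#endif
#if defined(__aarch64__) || defined(_M_ARM64)
    s += "aarch64 ";
#endif
#if defined(__SSE2__)
    s += "sse2 ";
#endif
#if defined(__SSE4_1__)
    s += "sse4.1 ";
#endif
#if defined(__AVX__)
    s += "avx ";
#endif
#if defined(__AVX2__)
    s += "avx2 ";
#endif
#if defined(__AVX512F__)
    s += "avx512f ";
#endif
#if defined(__ARM_NEON)
    s += "neon ";
#endif
    // Fall back to a marker so the result is never empty.
    if (s.empty()) s = "none-detected";
    return s;
}
```

Printing the returned string from a small `main` built with the same compiler and flags as the library would show whether the container builds for x86-64 or for aarch64.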

Here is the basic code I am using (minimal reproducer, compiled with Clang 14):

#include <kfr/all.hpp>
#include <limits>
#include <vector>

int main() {

        std::vector<double> input = {2, 1.739320897989603321, 1.111585192241478115, 0.4651946789936177717, 0.08686620820818069522, 0.0002655974925263326642, -0.02775467909650402784, -0.2803625394325067188, -0.8450476981473055149, -1.520640638903657083, -1.950949887071400601, -1.884110337648723377, -1.359401810914456554, -0.6783948405585837893, -0.1824768515010283809, -0.007053706735352200004, -1.648065648398194479e-06, 0.1349731116370727346, 0.5828102020663453731, 1.248565342109722032, 1.807957217368342029, 1.940134795198463324, 1.565046483956072709, 0.9035783544276514423, 0.3142685529761814478, 0.03158868811086192052, 0.005847729973192756711, -0.0339825311613773523, -0.3451430443613818233, -0.9470403833054994447, -1.583067018304429308, -1.899767734308779099, 
-1.707105248653190577, -1.120553735280900964, -0.4759834468700876453, -0.08241995593086165472, -0.001843033411668155291, -0.02380995220001747709, 0.1476495484979366002, 0.6409879499056966301, 1.294975547260964532, 1.763524549667581987, 1.768537748147161359, 1.307360744592875212, 0.6557049980836907599, 0.1636171517239771722, 0.0006691861693925252883, 0.04504656884606737799, 4.009608726850157192e-05, -0.3538099960611205685, -0.9671239436514901255, -1.540120599675593338, -1.738616326262960143, -1.442617673474021478, -0.8368828077476384575, -0.2738093095562325607, -0.0135873653540896748, -0.04044781361531966934, -0.09483191091585801979, 0.105142820874979781, 0.625333835075665645, 1.245681739775967634, 1.614243214658051206, 1.507912083596089525, 
0.9999956661224340682, 0.4059336398723785155, 0.04860598279536287175, 0.02298352625783861569, 0.139944074195321233, 0.09089309181000310156, -0.2952469203305729817, -0.9021909541431394342, -1.400489565173038686, -1.489959898003710537, -1.124669105408877767, -0.5477393598695156074, -0.1090929340403639924, -0.005814388456486995968, -0.1439445460039573133, -0.2267643411384250851, -0.0001496804328339733782, 0.5353400686655153118, 1.110286691496064071, 1.382280383200549911, 1.192006447523911383, 0.6830002733063942344, 0.1930122233016666466, 0.0002655075168004892346, 0.1191855088698750842, 0.3018140241168703608, 0.2426430223038470169, -0.1720216412112436644, -0.7632948910358308137, -1.186186731172929498, -1.186863009297784544, -0.7933030053527594383, 
-0.2928886235632580659, -0.01408790495665835328, -0.07983941540372113677, -0.32175818914851817, -0.4203328116083723254, -0.1622646315032800768, 0.3840676702091681549, 0.9109677059972636215, 1.0997953174376629, 0.8602062280336781885, 0.3965152038749057417, 0.05022723960886350814, 0.0397863100425041652, 0.2974704968087940049, 0.5284315085803341638, 0.4458087572456688186, 0.0002915729722856205382, -0.573226470833971713, -0.9284508132084656751, -0.8675184009328303913, -0.4883275828033899479, -0.1062563444335613005, -0.01061700574003038126, -0.2432181864126915505, -0.5692551352464521042, -0.6626326063906402553, -0.362726488689640969, 0.1954378094992745007, 0.6782248283242355846, 0.8034232606099845908, 0.5512864429180411863, 0.1745466132404353277, 
3.750000254026371867e-11, 0.174574500241098135, 0.5514276210715076676, 0.8037730482777486474, 0.678796468659037755, 0.1961264841444000095, -0.3620935798432189934, -0.6621958297192190868, -0.5690459594299979162, -0.2431626230490116503, -0.01061378163628759475, -0.1062677193952378946, -0.4884117673409048566, -0.8677783455873464558, -0.9289393958950559194, -0.57388250175807487, -0.0003748782435482769619, 0.4452956641709209795, 0.5281479970371536492, 0.2973753826725218374, 0.03977595641660955361, 0.05023093234604798785, 0.3965561677698070109, 0.8603807791328722532, 1.10018652447662979, 0.9115611760747557302, 0.3847361702402604111, -0.1616934265338785814, -0.4199748001128623121, -0.3216128341036101346, -0.07981477945678633334, -0.01408925791060520495, 
-0.2929008666464609778, -0.7934026902671815762, -1.187150750346630534, -1.186692897764935806, -0.7639327698073674622, -0.1726263121436744929, 0.2422174343547515429, 0.3016108461399393259, 0.1191374217302022115, 0.0002658332251878870421, 0.1930090393869714482, 0.6830399302061813671, 1.192192783863664118, 1.382681599687336726, 1.110862851948544838, 0.5359489235208109159, 0.0003292931388187352948, -0.2265005691949465161, -0.1438634679802621164, -0.005811096900609521824, -0.1090848211520083211, -0.5477360547020913017, -1.124763378765599287, -1.490246819833502601, -1.400977109464180081, -0.9027724238930221468, -0.2957584341933354177, 0.09057201759866409518, 0.1398219393147618894, 0.0229708955918215274, 0.04859958245892709999, 0.4059045227439800896, 
1.000012956632289285, 1.508084048999398297, 1.614621677191439009, 1.246204626729205511, 0.6258517624661268375, 0.1055112241030421172, -0.09466392670415910149, -0.04041829528812360128, -0.01358504722098169068, -0.2737696651953872018, -0.8368419168514028561, -1.442682239930049182, -1.738873285944682712, -1.540556772698358445, -0.9676188893393139479, -0.3542091984003683525, -0.0001737452054680235336, 0.04499244910407947107, 0.0006693112623704244478, 0.1635788681411923517, 0.6556262001433073028, 1.307332454289188206, 1.768669659872286193, 1.763851396675544025, 1.295417304496725208, 0.6413957587874419275, 0.1479034609939437472, -0.0237251447271462669, -0.001840842993162925033, -0.08239057877232205951, -0.4758862725081852707, -1.12045226567931766, 
-1.707117419203634245, -1.899970116389768426, -1.583427219611244219, -0.9474305420160242797, -0.3454250992832720302, -0.03410080641030275445, 0.00583672780772816896, 0.03157115245176606716, 0.3141698555918571723, 0.903426274219513048, 1.56495221059336842, 1.940206299437091664, 1.80821190267722276, 1.248909684366706907, 0.5831027190560391649, 0.1351229980559817689, 2.471883173358129944e-05, -0.007046752307522965303, -0.1823893677741245667, -0.6782152295543568687, -1.359220791301259945, -1.884053701154217331, -1.951081717343044586, -1.52091158297334883, -0.8453283707555798721, -0.2805367572733507009, -0.02780127474116106043, 0.0002647878664472493948, 0.08679775869842103198, 0.4650088786726502832, 1.111341391684988578, 1.739147565261143447};

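       // Fetch (or create) a cached real-to-complex DFT plan for this input length.
       kfr::dft_plan_real_ptr<double> dft = kfr::dft_cache::instance().getreal(kfr::ctype_t<double>(), input.size());
       // Output is over-allocated (N complex values) and pre-filled with NaN so untouched bins stay visible.
       std::vector<kfr::complex<double>> output(input.size(), std::numeric_limits<double>::quiet_NaN());
       // Scratch buffer for the plan; note it is sized from input.size() here, not from the plan's temp_size member.
       std::vector<kfr::u8> temp(input.size());

       // Run the forward real DFT.
       dft->execute(&output[0], &input[0], &temp[0]);

       // Keep only the N/2 + 1 non-redundant bins.
       output.resize(input.size() / 2 + 1);
       kfr::dft_cache::instance().clear();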

       return 0;
}
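The final `output.resize(input.size() / 2 + 1)` reflects that a real-input DFT of N samples has only N/2 + 1 non-redundant bins. As a rough cross-check that does not depend on any SIMD code path, a direct O(N²) DFT in plain C++ can be sketched as follows. The helper name `naive_real_dft` is hypothetical and not a KFR function, and the unscaled forward transform with a negative exponent is an assumption about the convention being compared against.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not a KFR function): direct O(N^2) DFT of a
// real signal, returning only the N/2 + 1 non-redundant bins.
std::vector<std::complex<double>> naive_real_dft(const std::vector<double>& x) {
    const std::size_t n = x.size();
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<std::complex<double>> bins(n / 2 + 1);
    for (std::size_t k = 0; k < bins.size(); ++k) {
        std::complex<double> acc(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            // Reduce k*t modulo n to keep the angle small and accurate.
            const double angle =
                -two_pi * static_cast<double>((k * t) % n) / static_cast<double>(n);
            acc += x[t] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        bins[k] = acc;
    }
    return bins;
}
```

Comparing these bins against the library output within a small tolerance would indicate whether the transform result itself is affected, separately from the memory errors reported below.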

Here is the valgrind log output:

==15644== Memcheck, a memory error detector
==15644== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==15644== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==15644== Command: ./tests/EnvelopeTest
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC91B3: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2b20 is 176 bytes inside an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC91BC: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2b10 is 160 bytes inside an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC91C5: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2b00 is 144 bytes inside an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC91CE: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2af0 is 128 bytes inside an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC91D7: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2ae0 is 112 bytes inside an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC91E0: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2ad0 is 96 bytes inside an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC91E9: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2ac0 is 80 bytes inside an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC91F2: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2ab0 is 64 bytes inside an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC91FB: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2aa0 is 48 bytes inside an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC9200: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2a90 is 32 bytes inside an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC9205: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2a80 is 16 bytes inside an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC920A: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2a70 is 0 bytes inside an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC920F: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2a60 is 16 bytes before an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC9214: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2a50 is 32 bytes before an unallocated block of size 1,336,688 in arena "client"
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC9219: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2a40 is 16 bytes after a block of size 256 alloc'd
==15644==    at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==15644==    by 0x10BAB4: main (in /opt/omu/build/tests/EnvelopeTest)
==15644== 
==15644== Invalid write of size 8
==15644==    at 0x7FC921E: void kfr::sse2::intrinsics::write<false, 32ul, double, (cometa::details::unique_enum_impl<371>::type)371>(cometa::cval_t<bool, false>, double*, kfr::sse2::vec<double, 32ul> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC9120: kfr::sse2::vec<double, 32ul> const& kfr::sse2::vec<double, 32ul>::write<false>(double*, cometa::cval_t<bool, false>) const (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FC5E2A: void kfr::sse2::intrinsics::cwrite<16ul, false, double>(kfr::complex<double>*, kfr::sse2::vec<double, (16ul)*(2)> const&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD94DE: void kfr::sse2::intrinsics::radix4_autosort_pass_first<4ul, false, false, false, false, double>(unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD900E: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
==15644==  Address 0xd8e2a30 is 0 bytes after a block of size 256 alloc'd
==15644==    at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==15644==    by 0x10BAB4: main (in /opt/omu/build/tests/EnvelopeTest)
==15644== 

valgrind: m_mallocfree.c:303 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi' failed.
valgrind: Heap block lo/hi size mismatch: lo = 320, hi = 4600849206870085994.
This is probably caused by your program erroneously writing past the
end of a heap block and corrupting heap metadata.  If you fix any
invalid writes reported by Memcheck, this assertion failure will
probably go away.  Please try that before reporting this as a bug.

host stacktrace:
==15644==    at 0x5804284A: ??? (in /usr/libexec/valgrind/memcheck-amd64-linux)
==15644==    by 0x58042977: ??? (in /usr/libexec/valgrind/memcheck-amd64-linux)
==15644==    by 0x58042B1B: ??? (in /usr/libexec/valgrind/memcheck-amd64-linux)
==15644==    by 0x5804C8CF: ??? (in /usr/libexec/valgrind/memcheck-amd64-linux)
==15644==    by 0x5803AE9A: ??? (in /usr/libexec/valgrind/memcheck-amd64-linux)
==15644==    by 0x580395B7: ??? (in /usr/libexec/valgrind/memcheck-amd64-linux)
==15644==    by 0x5803DF3D: ??? (in /usr/libexec/valgrind/memcheck-amd64-linux)
==15644==    by 0x58038868: ??? (in /usr/libexec/valgrind/memcheck-amd64-linux)
==15644==    by 0x1008D5A766: ???

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 15644)
==15644==    at 0x7F3B94C: _ZN3kfr4sse210intrinsics9simd_readILm8EdEEDvT__NS_16internal_generic10unwrap_bitIT0_E4typeEPKS5_ (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7F3B8DD: kfr::sse2::vec<double, 8ul> kfr::sse2::intrinsics::read<8ul, false, double, (cometa::details::unique_enum_impl<356>::type)356>(cometa::cval_t<bool, false>, cometa::cval_t<unsigned long, 8ul>, double const*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7F5A57A: kfr::sse2::vec<double, 8ul>::vec<false>(double const*, cometa::cval_t<bool, false>) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FDA7B6: kfr::sse2::vec<double, (4ul)*(2)> kfr::sse2::intrinsics::cread_split<4ul, false, false, double>(kfr::complex<double> const*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD963B: void kfr::sse2::intrinsics::radix4_autosort_pass<4ul, false, false, false, false, double>(unsigned long, unsigned long, cometa::cval_t<unsigned long, 4ul>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, kfr::complex<double> const*&) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD9029: void kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute<false>(kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x7FD8D8C: kfr::sse2::intrinsics::fft_specialization<double, 7ul>::do_execute(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) (in /usr/local/lib/libkfr_capi.so)
==15644==    by 0x10D226: void kfr::dft_plan<double>::execute_dft<false>(cometa::cval_t<bool, false>, kfr::complex<double>*, kfr::complex<double> const*, unsigned char*) const (in /opt/omu/build/tests/EnvelopeTest)
==15644==    by 0x10BB24: main (in /opt/omu/build/tests/EnvelopeTest)
client stack range: [0x1FFEFFD000 0x1FFF000FFF] client SP: 0x1FFEFFF2B0
valgrind stack range: [0x1002CAE000 0x1002DADFFF] top usage: 18984 of 1048576

Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using.  Thanks.
dancazarin commented 6 months ago

Hello @xkzl, the temp array must be allocated with dft->temp_size bytes, not the size of the DFT. If the array is smaller than temp_size (as in your case), KFR writes past the end of the allocated memory block, which is exactly what valgrind detects. Example from the KFR repo: https://github.com/kfrlib/kfr/blob/d096043b2b1edcd0beffbc2460bc07f67762d92c/examples/dft.cpp#L31
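To make the sizing rule concrete, here is a minimal sketch of allocating the scratch buffer from the plan's own temp_size instead of from the transform length. The helpers make_temp_buffer and checked_temp and the fake_plan struct are hypothetical names invented for this illustration; fake_plan only stands in for a plan type so the snippet compiles without KFR. The only assumption about the real plan type is a public temp_size member holding a byte count, as used in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in so the sketch compiles without KFR installed.
// Assumption: the real plan exposes a public `temp_size` byte count.
struct fake_plan {
    std::size_t size;      // transform length in samples
    std::size_t temp_size; // scratch space the plan asks for, in bytes
};

// Allocate exactly the scratch space the plan reports, in bytes.
// A temp_size of 0 yields an empty vector, which is valid.
template <typename Plan>
std::vector<std::uint8_t> make_temp_buffer(const Plan& plan) {
    return std::vector<std::uint8_t>(plan.temp_size);
}

// Debug-time guard: refuse to hand out a buffer that is too small.
// Returns the raw pointer that would be passed as the temp argument.
template <typename Plan>
std::uint8_t* checked_temp(const Plan& plan, std::vector<std::uint8_t>& buffer) {
    assert(buffer.size() >= plan.temp_size &&
           "temp buffer is smaller than plan.temp_size");
    return buffer.data();
}
```

With a real plan, the same helpers would presumably be used as `auto temp = make_temp_buffer(plan);` followed by `plan.execute(out, in, checked_temp(plan, temp));`, assuming execute takes the temp pointer as its third argument, as the stack trace above suggests.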

Please take a look at the following doc section about building and linking KFR: https://kfr.dev/docs/latest/installation/#compile-for-multiple-architectures. Note that building the DFT with GCC (or MSVC) is not officially supported. However, using clang-compiled DFT libraries (static or shared) from any compiler is perfectly fine, since the ABI is compatible.

xkzl commented 6 months ago

Ah, indeed, it's working now! Thank you for your explanation, this was very helpful.

xkzl commented 6 months ago

Sorry for reopening, I just tried the following:

#include <iostream>
#include <kfr/all.hpp>

int main() {
    int n = 128;

    kfr::dft_plan<double> dft(n);
    std::cout << dft.temp_size << std::endl; // scratch bytes this plan requires

    return 0;
}

Is there any reason why this is printing 0?

dancazarin commented 6 months ago

Please replace the code above with this and rerun:

#include <iostream>
#include <kfr/all.hpp>

int main() {
    int n = 128;
    kfr::println(kfr::library_version()); // print KFR version
    kfr::dft_plan<double> dft(n);
    dft.dump(); // print selected DFT algorithm
    std::cout << dft.temp_size << std::endl;

    return 0;
}

What does it print now?

xkzl commented 6 months ago

Hi @dancazarin, thank you for replying so quickly. While running some tests, I noticed a conflict between two of my KFR installations.

I suspect the conflict comes from a bad configuration on my side. The program now returns a non-zero value:

KFR 5.2.0 neon64 64-bit (clang-15.0.0/macos) +in +ve
fft_specialization<double, 7>(neon64): 0, 128, 2048, 2048, 1, 0, 0, 0, 1
2048
dancazarin commented 6 months ago

A temp_size of 0 is fine, and so is allocating a zero-sized univector (or std::vector). Some DFT sizes need no temp buffer and report 0 in temp_size. But for a double-precision DFT of size 128, KFR 5.2 must return 2048, while the previous KFR version returned zero. That's why I suspected a wrong KFR version.

xkzl commented 6 months ago

Hi @dancazarin

I ran further tests yesterday. Your suggestions made me curious about the underlying KFR mechanism you mentioned. Thank you for the explanation! I compared many dft_plan declarations.

#include <iostream>
#include <kfr/all.hpp>

int main() {

    kfr::println(kfr::library_version()); // print KFR version

    int n = 16384;
    for(int i = 0; i <= n; i++) {

        kfr::dft_plan<double> dft(i);
        std::cout << ">>>> i = " << i << ": dft.temp_size = " << dft.temp_size << std::endl;
        if(!dft.temp_size) {
            std::cout << ">>>>";
            dft.dump(); // print selected DFT algorithm
        }

    }

    return 0;  
}

I ran the code above; the full output is in dft_plan.log. Here is a summary:

KFR 5.2.0 neon64 64-bit (clang-15.0.0/macos) +in +ve
>>>> i = 0: dft.temp_size = 0
>>>>>>>> i = 1: dft.temp_size = 0
>>>>fft_specialization<double, 0>(neon64): 0, 1, 0, 0, 1, 0, 0, 0, 1
>>>> i = 2: dft.temp_size = 0
>>>>fft_specialization<double, 1>(neon64): 0, 2, 0, 0, 1, 0, 0, 0, 1
>>>> i = 3: dft.temp_size = 64
>>>> i = 4: dft.temp_size = 0
>>>>fft_specialization<double, 2>(neon64): 0, 4, 0, 0, 1, 0, 0, 0, 1
>>>> i = 5: dft.temp_size = 128
>>>> i = 6: dft.temp_size = 128
>>>> i = 7: dft.temp_size = 128
>>>> i = 8: dft.temp_size = 0
>>>>fft_specialization<double, 3>(neon64): 0, 8, 0, 0, 1, 0, 0, 0, 1
>>>> i = 9: dft.temp_size = 192
>>>> i = 10: dft.temp_size = 192
[..]
>>>> i = 15: dft.temp_size = 256
>>>> i = 16: dft.temp_size = 0
>>>>fft_specialization<double, 4>(neon64): 0, 16, 0, 0, 1, 0, 0, 0, 1
>>>> i = 31: dft.temp_size = 1024
>>>> i = 32: dft.temp_size = 0
>>>>fft_specialization<double, 5>(neon64): 0, 32, 0, 0, 1, 0, 0, 0, 1
>>>> i = 64: dft.temp_size = 0
>>>>fft_specialization<double, 6>(neon64): 0, 64, 0, 0, 1, 0, 0, 0, 1
>>>> i = 1024: dft.temp_size = 16384
>>>> i = 1025: dft.temp_size = 17152
[..]
>>>> i = 2048: dft.temp_size = 32768
>>>> i = 4096: dft.temp_size = 0
>>>>fft_stage_impl<double, false, true>(neon64): 4, 4096, 49152, 0, 4, 0, 0, 1, 1
fft_final_stage_impl<double, true, 1024>(neon64): 1024, 1024, 24576, 0, 4, 1024, 0, 1, 1
fft_reorder_stage_impl<double, true>(neon64): 0, 4096, 0, 0, 1, 0, 0, 0, 1
[..]
>>>> i = 8191: dft.temp_size = 131072
>>>> i = 8192: dft.temp_size = 0
>>>>fft_stage_impl<double, false, false>(neon64): 4, 8192, 98304, 0, 4, 0, 0, 1, 1
fft_stage_impl<double, true, false>(neon64): 4, 2048, 24576, 0, 4, 0, 0, 1, 1
fft_final_stage_impl<double, true, 512>(neon64): 512, 512, 12288, 0, 4, 512, 0, 1, 1
fft_reorder_stage_impl<double, false>(neon64): 0, 8192, 0, 0, 1, 0, 0, 0, 1
[..]
>>>> i = 16384: dft.temp_size = 0
>>>>fft_stage_impl<double, false, true>(neon64): 4, 16384, 196608, 0, 4, 0, 0, 1, 1
fft_stage_impl<double, true, true>(neon64): 4, 4096, 49152, 0, 4, 0, 0, 1, 1
fft_final_stage_impl<double, true, 1024>(neon64): 1024, 1024, 24576, 0, 4, 1024, 0, 1, 1
fft_reorder_stage_impl<double, true>(neon64): 0, 16384, 0, 0, 1, 0, 0, 0, 1

I noticed that many plans indeed report a temp_size of 0. Is there a way to make the temp argument optional in the dft_plan->execute() method? Looking at FFTW, I don't think they use such a prototype. They even have an "inPlace" option that takes only an input vector, with no separate output vector.

dancazarin commented 6 months ago

Hi @xkzl, KFR also supports in-place mode. Just pass the same pointer as both the in and out parameters, as shown here: https://github.com/kfrlib/kfr/blob/d096043b2b1edcd0beffbc2460bc07f67762d92c/tests/dft_test.cpp#L71

The temp parameter is optional in KFR6. Passing nullptr makes KFR allocate the buffer during DFT execution.
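The two calling conventions described here (the same pointer for in and out, and an optional temp that falls back to an internal allocation) can be sketched with a toy transform. This is only an illustration of the pattern under stated assumptions: toy_dft, temp_count and the naive O(n^2) algorithm are hypothetical and are not how KFR implements its DFT.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical toy transform, NOT KFR's implementation: a naive O(n^2) DFT
// that needs `size` complex values of scratch space.
struct toy_dft {
    std::size_t size;

    // Number of scratch elements execute() needs.
    std::size_t temp_count() const { return size; }

    // `out` and `in` may point to the same array (in-place mode).
    // `temp` may be nullptr, in which case scratch is allocated per call.
    void execute(cplx* out, const cplx* in, cplx* temp = nullptr) const {
        assert(size == 0 || (out != nullptr && in != nullptr));
        std::vector<cplx> fallback;
        if (temp == nullptr) {
            fallback.resize(temp_count()); // allocation happens on every call
            temp = fallback.data();
        }
        const double pi = std::acos(-1.0);
        for (std::size_t k = 0; k < size; ++k) {
            cplx sum(0.0, 0.0);
            for (std::size_t j = 0; j < size; ++j) {
                const double angle =
                    -2.0 * pi * double(k) * double(j) / double(size);
                sum += in[j] * cplx(std::cos(angle), std::sin(angle));
            }
            temp[k] = sum; // results stay in scratch until all inputs are read
        }
        for (std::size_t k = 0; k < size; ++k) {
            out[k] = temp[k]; // safe even when out == in
        }
    }
};
```

The design trade-off is the usual one: passing a preallocated temp avoids a heap allocation on every call, while nullptr keeps the call site simple at the cost of allocating during execution.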

The undefined reference problem is resolved, so I'll close this as completed. If you have further questions about using KFR, please open another topic and start the subject with "Question".