NVlabs / timeloop

Timeloop performs modeling, mapping and code-generation for tensor algebra workloads on various accelerator architectures.
https://timeloop.csail.mit.edu/
BSD 3-Clause "New" or "Revised" License

Double free or corruption error while running timeloop #166

Closed: MustafaFayez closed this issue 2 years ago

MustafaFayez commented 2 years ago

I see this error while running timeloop:

Summary stats for best mapping found by mapper:
  Utilization = 1.00 | pJ/Compute =    5.811
*** Error in `timeloop-mapper': double free or corruption (out): 0x0000000001343870 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81329)[0x7f6ced5ef329]
timeloop-mapper(_ZSt8_DestroyINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEvPT_+0x18)[0x9431b3]
timeloop-mapper(_ZNSt12_Destroy_auxILb0EE9__destroyIPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvT_S9_+0x26)[0x93fd0a]
timeloop-mapper(_ZSt8_DestroyIPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEvT_S7_+0x23)[0x93b712]
timeloop-mapper(_ZSt8_DestroyIPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES5_EvT_S7_RSaIT0_E+0x27)[0x9369b7]
timeloop-mapper(_ZNSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS5_EED1Ev+0x35)[0x932a1f]
/lib64/libc.so.6(__cxa_finalize+0x9a)[0x7f6ced5a805a]
/home/nano01/a/sharm418/gp-cim/timeloop_mustafa/accelergy-timeloop-infrastructure/src/timeloop/build/libtimeloop-mapper.so(+0x729493)[0x7f6cee5c0493]
======= Memory map: ========
00400000-00928000 r--p 00000000 00:36 8790280                            /home/msee286pc10/a/sharm418/anaconda3/envs/py39/bin/timeloop-mapper
00928000-00bdb000 r-xp 00528000 00:36 8790280                            /home/msee286pc10/a/sharm418/anaconda3/envs/py39/bin/timeloop-mapper
00bdb000-00def000 r--p 007db000 00:36 8790280                            /home/msee286pc10/a/sharm418/anaconda3/envs/py39/bin/timeloop-mapper
00def000-00df0000 r--p 009ef000 00:36 8790280                            /home/msee286pc10/a/sharm418/anaconda3/envs/py39/bin/timeloop-mapper
00df0000-00df2000 rw-p 009f0000 00:36 8790280                            /home/msee286pc10/a/sharm418/anaconda3/envs/py39/bin/timeloop-mapper
00df2000-00df9000 rw-p 00000000 00:00 0 
012ab000-01688000 rw-p 00000000 00:00 0                                  [heap]
7f6cc0000000-7f6cc0082000 rw-p 00000000 00:00 0 
7f6cc0082000-7f6cc4000000 ---p 00000000 00:00 0 
7f6cc4000000-7f6cc4082000 rw-p 00000000 00:00 0 
7f6cc4082000-7f6cc8000000 ---p 00000000 00:00 0 
7f6cc8000000-7f6cc8081000 rw-p 00000000 00:00 0 
7f6cc8081000-7f6ccc000000 ---p 00000000 00:00 0 
7f6ccc000000-7f6ccc081000 rw-p 00000000 00:00 0 
7f6ccc081000-7f6cd0000000 ---p 00000000 00:00 0 
7f6cd4000000-7f6cd4082000 rw-p 00000000 00:00 0 
7f6cd4082000-7f6cd8000000 ---p 00000000 00:00 0 
7f6cd8000000-7f6cd8081000 rw-p 00000000 00:00 0 
7f6cd8081000-7f6cdc000000 ---p 00000000 00:00 0 
7f6cdc000000-7f6cdc081000 rw-p 00000000 00:00 0 
7f6cdc081000-7f6ce0000000 ---p 00000000 00:00 0 
7f6ce4000000-7f6ce4081000 rw-p 00000000 00:00 0 
7f6ce4081000-7f6ce8000000 ---p 00000000 00:00 0 
7f6ce94e4000-7f6ce94e5000 ---p 00000000 00:00 0 
7f6ce94e5000-7f6ce9ce5000 rw-p 00000000 00:00 0 
7f6ce9ce5000-7f6ce9ce6000 ---p 00000000 00:00 0 
7f6ce9ce6000-7f6cea4e6000 rw-p 00000000 00:00 0 
7f6cea4e6000-7f6cea4e7000 ---p 00000000 00:00 0 
7f6cea4e7000-7f6ceace7000 rw-p 00000000 00:00 0 
7f6ceace7000-7f6ceace8000 ---p 00000000 00:00 0 
7f6ceace8000-7f6ceb4e8000 rw-p 00000000 00:00 0 
7f6cecceb000-7f6cecd2e000 rw-p 00000000 00:00 0 
7f6cecd2e000-7f6cecd31000 r--p 00000000 00:36 11672155                   /home/msee286pc10/a/sharm418/anaconda3/envs/py39/lib/libz.so.1.2.11
7f6cecd31000-7f6cecd3f000 r-xp 00003000 00:36 11672155                   /home/msee286pc10/a/sharm418/anaconda3/envs/py39/lib/libz.so.1.2.11
7f6cecd3f000-7f6cecd45000 r--p 00011000 00:36 11672155                   /home/msee286pc10/a/sharm418/anaconda3/envs/py39/lib/libz.so.1.2.11
7f6cecd45000-7f6cecd46000 ---p 00017000 00:36 11672155                   /home/msee286pc10/a/sharm418/anaconda3/envs/py39/lib/libz.so.1.2.11
7f6cecd46000-7f6cecd47000 r--p 00017000 00:36 11672155                   /home/msee286pc10/a/sharm418/anaconda3/envs/py39/lib/libz.so.1.2.11
7f6cecd47000-7f6cecd48000 rw-p 00018000 00:36 11672155                   /home/msee286pc10/a/sharm418/anaconda3/envs/py39/lib/libz.so.1.2.11
7f6cecd48000-7f6cecd4f000 r-xp 00000000 fd:00 50366842                   /usr/lib64/librt-2.17.so
7f6cecd4f000-7f6cecf4e000 ---p 00007000 fd:00 50366842                   /usr/lib64/librt-2.17.so
7f6cecf4e000-7f6cecf4f000 r--p 00006000 fd:00 50366842                   /usr/lib64/librt-2.17.so
7f6cecf4f000-7f6cecf50000 rw-p 00007000 fd:00 50366842                   /usr/lib64/librt-2.17.so
7f6cecf50000-7f6cecf67000 r-xp 00000000 00:3a 274708182                  /package/gcc/11.2.0/lib64/libgcc_s.so.1
7f6cecf67000-7f6ced166000 ---p 00017000 00:3a 274708182                  /package/gcc/11.2.0/lib64/libgcc_s.so.1
7f6ced166000-7f6ced167000 r--p 00016000 00:3a 274708182                  /package/gcc/11.2.0/lib64/libgcc_s.so.1
7f6ced167000-7f6ced168000 rw-p 00017000 00:3a 274708182                  /package/gcc/11.2.0/lib64/libgcc_s.so.1
7f6ced168000-7f6ced35e000 r-xp 00000000 00:3a 274708189                  /package/gcc/11.2.0/lib64/libstdc++.so.6
7f6ced35e000-7f6ced55d000 ---p 001f6000 00:3a 274708189                  /package/gcc/11.2.0/lib64/libstdc++.so.6
7f6ced55d000-7f6ced568000 r--p 001f5000 00:3a 274708189                  /package/gcc/11.2.0/lib64/libstdc++.so.6
7f6ced568000-7f6ced56b000 rw-p 00200000 00:3a 274708189                  /package/gcc/11.2.0/lib64/libstdc++.so.6
7f6ced56b000-7f6ced56e000 rw-p 00000000 00:00 0 
7f6ced56e000-7f6ced732000 r-xp 00000000 fd:00 50366723                   /usr/lib64/libc-2.17.so
7f6ced732000-7f6ced931000 ---p 001c4000 fd:00 50366723                   /usr/lib64/libc-2.17.so
7f6ced931000-7f6ced935000 r--p 001c3000 fd:00 50366723                   /usr/lib64/libc-2.17.so
7f6ced935000-7f6ced937000 rw-p 001c7000 fd:00 50366723                   /usr/lib64/libc-2.17.so
7f6ced937000-7f6ced93c000 rw-p 00000000 00:00 0 
7f6ced93c000-7f6ced953000 r-xp 00000000 fd:00 50366838                   /usr/lib64/libpthread-2.17.so
7f6ced953000-7f6cedb52000 ---p 00017000 fd:00 50366838                   /usr/lib64/libpthread-2.17.so
7f6cedb52000-7f6cedb53000 r--p 00016000 fd:00 50366838                   /usr/lib64/libpthread-2.17.so
7f6cedb53000-7f6cedb54000 rw-p 00017000 fd:00 50366838                   /usr/lib64/libpthread-2.17.so
7f6cedb54000-7f6cedb58000 rw-p 00000000 00:00 0 
7f6cedb58000-7f6cedc59000 r-xp 00000000 fd:00 50366820                   /usr/lib64/libm-2.17.so
7f6cedc59000-7f6cede58000 ---p 00101000 fd:00 50366820                   /usr/lib64/libm-2.17.so
7f6cede58000-7f6cede59000 r--p 00100000 fd:00 50366820                   /usr/lib64/libm-2.17.so
7f6cede59000-7f6cede5a000 rw-p 00101000 fd:00 50366820                   /usr/lib64/libm-2.17.so
7f6cede95000-7f6cede97000 rw-p 00000000 00:00 0 
7f6cede97000-7f6cee550000 r--p 00000000 00:38 163508151940               /home/nano01/a/sharm418/gp-cim/timeloop_mustafa/accelergy-timeloop-infrastructure/src/timeloop/build/libtimeloop-mapper.so
7f6cee550000-7f6cee939000 r-xp 006b9000 00:38 163508151940               /home/nano01/a/sharm418/gp-cim/timeloop_mustafa/accelergy-timeloop-infrastructure/src/timeloop/build/libtimeloop-mapper.so
7f6cee939000-7f6ceeb65000 r--p 00aa2000 00:38 163508151940               /home/nano01/a/sharm418/gp-cim/timeloop_mustafa/accelergy-timeloop-infrastructure/src/timeloop/build/libtimeloop-mapper.so
7f6ceeb65000-7f6ceeb89000 r--p 00cce000 00:38 163508151940               /home/nano01/a/sharm418/gp-cim/timeloop_mustafa/accelergy-timeloop-infrastructure/src/timeloop/build/libtimeloop-mapper.so
7f6ceeb89000-7f6ceebc0000 rw-p 00cf2000 00:38 163508151940               /home/nano01/a/sharm418/gp-cim/timeloop_mustafa/accelergy-timeloop-infrastructure/src/timeloop/build/libtimeloop-mapper.so
7f6ceebc0000-7f6ceebcb000 rw-p 00000000 00:00 0 
7f6ceebcb000-7f6ceebe1000 r--p 00000000 00:2b 68961789643                /home/min/a/sharm418/saramago/boost_1_80_0/build/lib/libboost_serialization.so.1.80.0
7f6ceebe1000-7f6ceebfe000 r-xp 00016000 00:2b 68961789643                /home/min/a/sharm418/saramago/boost_1_80_0/build/lib/libboost_serialization.so.1.80.0
7f6ceebfe000-7f6ceec0c000 r--p 00033000 00:2b 68961789643                /home/min/a/sharm418/saramago/boost_1_80_0/build/lib/libboost_serialization.so.1.80.0
7f6ceec0c000-7f6ceec0f000 r--p 00041000 00:2b 68961789643                /home/min/a/sharm418/saramago/boost_1_80_0/build/lib/libboost_serialization.so.1.80.0
7f6ceec0f000-7f6ceec10000 rw-p 00044000 00:2b 68961789643                /home/min/a/sharm418/saramago/boost_1_80_0/build/lib/libboost_serialization.so.1.80.0
7f6ceec10000-7f6ceec32000 r-xp 00000000 fd:00 50332471                   /usr/lib64/ld-2.17.so
7f6ceec32000-7f6ceec33000 rw-p 00000000 00:00 0 
7f6ceec33000-7f6ceec35000 r--p 00000000 00:36 9052346                    /home/msee286pc10/a/sharm418/anaconda3/envs/py39/lib/libbz2.so.1.0.8
7f6ceec35000-7f6ceec43000 r-xp 00002000 00:36 9052346                    /home/msee286pc10/a/sharm418/anaconda3/envs/py39/lib/libbz2.so.1.0.8
7f6ceec43000-7f6ceec45000 r--p 00010000 00:36 9052346                    /home/msee286pc10/a/sharm418/anaconda3/envs/py39/lib/libbz2.so.1.0.8
7f6ceec45000-7f6ceec46000 r--p 00011000 00:36 9052346                    /home/msee286pc10/a/sharm418/anaconda3/envs/py39/lib/libbz2.so.1.0.8
7f6ceec46000-7f6ceec47000 rw-p 00012000 00:36 9052346                    /home/msee286pc10/a/sharm418/anaconda3/envs/py39/lib/libbz2.so.1.0.8
7f6ceec47000-7f6ceec4b000 rw-p 00000000 00:00 0 
7f6ceec4b000-7f6ceec53000 r--p 00000000 00:2b 68961627899                /home/min/a/sharm418/saramago/boost_1_80_0/build/lib/libboost_iostreams.so.1.80.0
7f6ceec53000-7f6ceec5b000 r-xp 00008000 00:2b 68961627899                /home/min/a/sharm418/saramago/boost_1_80_0/build/lib/libboost_iostreams.so.1.80.0
7f6ceec5b000-7f6ceec5f000 r--p 00010000 00:2b 68961627899                /home/min/a/sharm418/saramago/boost_1_80_0/build/lib/libboost_iostreams.so.1.80.0
7f6ceec5f000-7f6ceec60000 r--p 00014000 00:2b 68961627899                /home/min/a/sharm418/saramago/boost_1_80_0/build/lib/libboost_iostreams.so.1.80.0
Aborted (core dumped)

The library versions are Boost 1.80.0, libconfig 1.7.3, and GCC 11.2.
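Since the question here is which libraries the process actually loaded, a small script can reduce a memory map like the one above to its distinct file paths. This is a minimal sketch, assuming the input follows the usual Linux `/proc/<pid>/maps` column layout; the function name `mapped_files` and the shortened `sample` text are illustrative, not part of Timeloop.

```python
import re

# One line of a /proc/<pid>/maps-style listing:
# address range, permissions, offset, device, inode, optional file path.
MAP_LINE = re.compile(
    r"^[0-9a-f]+-[0-9a-f]+\s+\S{4}\s+[0-9a-f]+\s+\S+\s+\d+\s*(?P<path>/\S+)?"
)


def mapped_files(memory_map: str) -> list[str]:
    """Return the distinct file paths in a memory map, in first-seen order."""
    seen = {}  # dicts preserve insertion order, so this acts as an ordered set
    for line in memory_map.splitlines():
        match = MAP_LINE.match(line.strip())
        # Anonymous mappings and pseudo-entries such as [heap] have no path.
        if match and match.group("path"):
            seen[match.group("path")] = None
    return list(seen)


# A few lines copied from the crash report above (shortened for illustration).
sample = """\
00400000-00928000 r--p 00000000 00:36 8790280                            /home/msee286pc10/a/sharm418/anaconda3/envs/py39/bin/timeloop-mapper
012ab000-01688000 rw-p 00000000 00:00 0                                  [heap]
7f6ced168000-7f6ced35e000 r-xp 00000000 00:3a 274708189                  /package/gcc/11.2.0/lib64/libstdc++.so.6
7f6ced35e000-7f6ced55d000 ---p 001f6000 00:3a 274708189                  /package/gcc/11.2.0/lib64/libstdc++.so.6
7f6ced56e000-7f6ced732000 r-xp 00000000 fd:00 50366723                   /usr/lib64/libc-2.17.so
"""

for path in mapped_files(sample):
    print(path)
```

Run against the full map, this would list the `timeloop-mapper` binary from the anaconda environment, `libtimeloop-mapper.so` from the build directory, `libstdc++` from `/package/gcc/11.2.0`, glibc 2.17 from `/usr/lib64`, and the Boost 1.80.0 libraries from the user's own build.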

angshuman-parashar commented 2 years ago

Boost seems to be very sensitive to gcc version. Are you running the packages from a stock distro, or have you upgraded/downgraded?

MustafaFayez commented 2 years ago

I think that is the issue. I am running on a server, and I built Boost, yaml-cpp, and libconfig from source to upgrade them, because the stock distro versions were too old and caused compilation errors.

Edit: The OS is CentOS 7

angshuman-parashar commented 2 years ago

Did you build boost and timeloop with the same compiler and library environment?

MustafaFayez commented 2 years ago

Yes, I used gcc/11.2 for both.

Are there recommended versions of these libraries for running on CentOS?

angshuman-parashar commented 2 years ago

We are using Boost 1.76.0 with gcc 11.2.0 on our internal CentOS 6 systems.

Honestly boost has been painful to deal with. The only reason we use it today is to dump out the stats/specs in XML format, which is then used by the regression scripts (via the Python parsers in the scripts/ directory).

If we had the time we would instead dump out the stats in YAML since yaml-cpp has been relatively painless, and rewrite the Python parser to read YAML instead of XML. Since this is an open-source project, we welcome contributions :-).
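On the parser side, the migration described above largely comes down to turning the XML tree into plain nested data that a YAML library could then write and read. The following is a minimal sketch using only the Python standard library. The XML fragment and its tag names are hypothetical and do not reflect Timeloop's actual output format; only the two numeric values are taken from the summary line in the report above.

```python
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an element to nested dicts; leaf elements become their text."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated sibling tags are collected into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


# Hypothetical stats fragment; tag names are invented for illustration.
HYPOTHETICAL_XML = """
<stats>
  <utilization>1.00</utilization>
  <energy_per_compute_pj>5.811</energy_per_compute_pj>
</stats>
"""

root = ET.fromstring(HYPOTHETICAL_XML.strip())
stats = {root.tag: element_to_dict(root)}
print(stats)
```

A dictionary of this shape is what a YAML library such as PyYAML (`yaml.safe_dump`, not used here to keep the sketch dependency-free) would serialize directly, so the Python scripts could consume either format during a transition.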

MustafaFayez commented 2 years ago

Agreed that Boost is a pain :). I have also struggled with it recently. Migrating entirely to yaml-cpp seems like an attractive option.

I am currently talking to our server admins to make sure boost is packaged with a working version.

For anyone else facing something similar: I used JuNest (Jailed User NEST), a lightweight Arch Linux-based container tool that provides disposable, partially isolated GNU/Linux environments on any generic GNU/Linux host OS, without needing root privileges to install packages: https://github.com/fsquillace/junest. I used it because some system admins don't allow Docker due to security concerns. It works well for me and is pretty easy to use.