manticoresoftware / manticoresearch

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
https://manticoresearch.com
GNU General Public License v3.0

Infinite loop gradually using all available memory during indexing #1893

Closed: kroky closed this issue 6 months ago

kroky commented 7 months ago

Describe the bug: We have a specific use case in which Manticore hangs during indexing of a series of documents, gradually grows its memory usage until all available memory is used, and is then killed by the OS.

To Reproduce: I will upload the schema, the actual database files at the time of the crash, the output of "show threads", and thread info from gdb to your S3 storage. I will also upload an SQL file containing all the SQL commands we execute before the server hangs.

Expected behavior: Indexing should continue normally and finish.

Describe the environment:

 - Manticore 6.2.12 dc5144d35@230822 (columnar 2.2.4 5aec342@230822) (secondary 2.2.4 5aec342@230822)
 - Linux system76-pc 6.6.10-76060610-generic #202401051437~1704728131~22.04~24d69e2~dev-Ubuntu SMP PREEMPT_DY x86_64 x86_64 x86_64 GNU/Linux

Messages from log files: searchd.log is empty at the time the server hangs. query.log:

/* Wed Feb 28 09:18:04.676 2024 conn 81 */ DESC tiki_main_65dede2ca49e0 # error=no such table 'tiki_main_65dede2ca49e0'
/* Wed Feb 28 09:18:04.676 2024 conn 81 */ DESC tiki_main_65dede2ca49e0 # error=no such table 'tiki_main_65dede2ca49e0'
/* Wed Feb 28 09:18:04.845 2024 conn 81 real 0.029 wall 0.029 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2847' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:04.887 2024 conn 81 real 0.029 wall 0.029 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2856' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:04.919 2024 conn 81 */ SHOW INDEX tiki_main_65dede2ca49e0pq STATUS # error=SHOW TABLE STATUS requires an existing table
/* Wed Feb 28 09:18:04.920 2024 conn 81 */ SHOW INDEX tiki_main_65dede2ca49e0pq STATUS # error=SHOW TABLE STATUS requires an existing table
/* Wed Feb 28 09:18:04.994 2024 conn 81 real 0.030 wall 0.030 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='88' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:05.067 2024 conn 81 real 0.029 wall 0.029 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2792' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:05.269 2024 conn 81 real 0.031 wall 0.031 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='696' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:05.629 2024 conn 81 real 0.033 wall 0.033 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='784' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:05.670 2024 conn 81 real 0.030 wall 0.030 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='5' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:05.710 2024 conn 81 real 0.030 wall 0.030 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='4' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:05.762 2024 conn 81 real 0.038 wall 0.038 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='777' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:05.804 2024 conn 81 real 0.031 wall 0.031 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='778' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:05.844 2024 conn 81 real 0.030 wall 0.030 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='777' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:05.885 2024 conn 81 real 0.032 wall 0.032 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='778' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:06.600 2024 conn 81 real 0.033 wall 0.033 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2731' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:06.640 2024 conn 81 real 0.030 wall 0.030 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='36' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:06.836 2024 conn 81 real 0.030 wall 0.030 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='88' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:06.872 2024 conn 81 real 0.026 wall 0.026 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2854' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:06.909 2024 conn 81 real 0.027 wall 0.027 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2855' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:06.944 2024 conn 81 real 0.026 wall 0.026 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2857' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:06.992 2024 conn 81 real 0.040 wall 0.040 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2858' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:07.032 2024 conn 81 real 0.029 wall 0.029 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2859' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:07.071 2024 conn 81 real 0.029 wall 0.029 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2860' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:07.197 2024 conn 81 real 0.034 wall 0.034 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='1' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:07.240 2024 conn 81 real 0.031 wall 0.031 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='4' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:07.299 2024 conn 81 real 0.028 wall 0.028 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='7' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:07.337 2024 conn 81 real 0.029 wall 0.029 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='703' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:13.618 2024 conn 81 real 0.032 wall 0.032 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='88' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:22.033 2024 conn 81 real 0.035 wall 0.035 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2790' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:22.276 2024 conn 81 real 0.036 wall 0.036 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2786' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:22.322 2024 conn 81 real 0.032 wall 0.032 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2789' LIMIT 50 OPTION cutoff=0;
/* Wed Feb 28 09:18:22.364 2024 conn 81 real 0.029 wall 0.029 found 1 */ SELECT * FROM tiki_main_65de599384a6c WHERE object_type='trackeritem' AND object_id='2791' LIMIT 50 OPTION cutoff=0;

Additional context: gdb shows interesting output in one of the threads:

Thread 16 (Thread 0x7df7c3d7b640 (LWP 2984535) "work_14"):
#0  0x00007df7c41567ed in pgm::internal::OptimalPiecewiseLinearModel<float, unsigned long>::CanonicalSegment::get_floating_point_segment(float const&) const () from /usr/share/manticore/modules/lib_manticore_secondary.so
#1  0x00007df7c4156483 in pgm::PGMIndex<float, 8ul, 4ul, float>::Segment::Segment(pgm::internal::OptimalPiecewiseLinearModel<float, unsigned long>::CanonicalSegment const&) () from /usr/share/manticore/modules/lib_manticore_secondary.so
#2  0x00007df7c4156e1b in unsigned long pgm::internal::make_segmentation<pgm::PGMIndex<float, 8ul, 4ul, float>::build<float*>(float*, float*, unsigned long, unsigned long, std::vector<pgm::PGMIndex<float, 8ul, 4ul, float>::Segment, std::allocator<pgm::PGMIndex<float, 8ul, 4ul, float>::Segment> >&, std::vector<unsigned long, std::allocator<unsigned long> >&, std::vector<unsigned long, std::allocator<unsigned long> >&)::{lambda(auto:1)#3}, pgm::PGMIndex<float, 8ul, 4ul, float>::build<float*>(float*, float*, unsigned long, unsigned long, std::vector<pgm::PGMIndex<float, 8ul, 4ul, float>::Segment, std::allocator<pgm::PGMIndex<float, 8ul, 4ul, float>::Segment> >&, std::vector<unsigned long, std::allocator<unsigned long> >&, std::vector<unsigned long, std::allocator<unsigned long> >&)::{lambda(auto:1)#2}>(unsigned long, unsigned long, pgm::PGMIndex<float, 8ul, 4ul, float>::build<float*>(float*, float*, unsigned long, unsigned long, std::vector<pgm::PGMIndex<float, 8ul, 4ul, float>::Segment, std::allocator<pgm::PGMIndex<float, 8ul, 4ul, float>::Segment> >&, std::vector<unsigned long, std::allocator<unsigned long> >&, std::vector<unsigned long, std::allocator<unsigned long> >&)::{lambda(auto:1)#3}, pgm::PGMIndex<float, 8ul, 4ul, float>::build<float*>(float*, float*, unsigned long, unsigned long, std::vector<pgm::PGMIndex<float, 8ul, 4ul, float>::Segment, std::allocator<pgm::PGMIndex<float, 8ul, 4ul, float>::Segment> >&, std::vector<unsigned long, std::allocator<unsigned long> >&, std::vector<unsigned long, std::allocator<unsigned long> >&)::{lambda(auto:1)#2}) () from /usr/share/manticore/modules/lib_manticore_secondary.so
#3  0x00007df7c4154c10 in void pgm::PGMIndex<float, 8ul, 4ul, float>::build<float*>(float*, float*, unsigned long, unsigned long, std::vector<pgm::PGMIndex<float, 8ul, 4ul, float>::Segment, std::allocator<pgm::PGMIndex<float, 8ul, 4ul, float>::Segment> >&, std::vector<unsigned long, std::allocator<unsigned long> >&, std::vector<unsigned long, std::allocator<unsigned long> >&) () from /usr/share/manticore/modules/lib_manticore_secondary.so
#4  0x00007df7c414c2a5 in SI::PGM_T<float>::PGM_T<float*>(float*, float*) () from /usr/share/manticore/modules/lib_manticore_secondary.so
#5  0x00007df7c414b658 in SI::SIWriter_T<float, unsigned int>::Process(util::FileWriter_c&, util::FileWriter_c&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) () from /usr/share/manticore/modules/lib_manticore_secondary.so
#6  0x00007df7c4146b04 in SI::Builder_c::Done(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) () from /usr/share/manticore/modules/lib_manticore_secondary.so
#7  0x000055ab2c4f3a8c in RtIndex_c::WriteAttributes(SaveDiskDataContext_t&, CSphString&) const ()
#8  0x000055ab2c4f563e in RtIndex_c::SaveDiskData(char const*, VecTraits_T<CSphRefcountedPtr<RtSegment_t const> > const&, ChunkStats_t const&, CSphString&) const ()
#9  0x000055ab2c4f1a9b in RtIndex_c::SaveDiskChunk(bool, bool, bool) ()
#10 0x000055ab2ca21dac in Threads::CoRoutine_c::CreateContext(std::function<void ()>, std::pair<boost::context::stack_context, Threads::StackFlavour_E>)::{lambda(boost::context::detail::transfer_t)#1}::__invoke(boost::context::detail::transfer_t) ()
#11 0x000055ab2ca42167 in make_fcontext ()
#12 0x0000002000004a60 in ?? ()
#13 0x0000000000004015 in ?? ()
#14 0x00001a0200000000 in ?? ()
#15 0x00000001000007e6 in ?? ()
#16 0x000000036320923b in ?? ()
#17 0x000000076320923c in ?? ()
#18 0x000000016320925e in ?? ()
#19 0x0000002a6320925f in ?? ()
#20 0x00000001632092dd in ?? ()
#21 0x00000001632841e7 in ?? ()
#22 0x00000001634e8b1c in ?? ()
#23 0x0000000100000000 in ?? ()
#24 0x0000000100000000 in ?? ()
#25 0x0000000100000000 in ?? ()
#26 0x0000000100000000 in ?? ()
#27 0x0000000100000000 in ?? ()
#28 0x0000000100000000 in ?? ()
#29 0x0000000100000000 in ?? ()
#30 0x0000000100000000 in ?? ()
#31 0x0000000100000000 in ?? ()
#32 0x0000000100000000 in ?? ()

The last line is repeated more than 2000 times... It seems the secondary index has a hard time writing some data to disk chunks. I also noticed that we have more than a thousand .tmp files in the table's disk directory. It looks like an infinite loop or recursion.
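
For example, a quick count over the table's directory confirms it (sketch only; the path is illustrative and should be adjusted to the actual data_dir):

ls /path/to/data_dir/tiki_main_65de599384a6c | grep -c '\.tmp'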

sanikolaev commented 7 months ago

Chances are this has already been fixed in the dev version (https://mnt.cr/dev/nightly). Can you check whether you can still reproduce it there?

kroky commented 7 months ago

I tried the latest build from today: Manticore 6.2.13 e705636a4@24030111 dev (columnar 2.2.5 a5342a1@240217) (secondary 2.2.5 a5342a1@240217) (knn 2.2.5 a5342a1@240217), and the same issue happens - it still looks like infinite recursion in one of the threads.

sanikolaev commented 7 months ago

Thank you, @kroky. We'll try to reproduce the issue on our end to debug and fix it.

sanikolaev commented 7 months ago

I've taken a quick look, and it may well be an edge case caused by the schema having 1444 attributes. Although Manticore is capable of handling this many attributes, such a scenario is very uncommon and unfortunately has not been thoroughly tested.

I will also upload an sql file containing all the sql commands we execute prior to the server hanging

I can't find it in the storage:

# sudo ls /mnt/s3.manticoresearch.com/issue-1893
gdb.txt  index.tar.gz  schema.txt  show-threads.txt

Can you please provide it? A few sample data write queries should be very helpful to reproduce the issue.

kroky commented 6 months ago

Yes, our use case is uncommon, as we aggregate data from different sources into a huge index schema.

I provided a couple more files in the S3 storage.

Note that when I execute these through the mysql command-line client, searchd doesn't hang and all the queries complete successfully. The only difference between our indexing and this mysql command-line run is that we use the PHP PDO library to connect to port 9306 and, in most cases, execute those queries via prepared statements.
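
Roughly, the client side looks like this (a simplified sketch, not our actual Tiki code; host and credentials are illustrative, and the query/values are copied from the query.log above):

<?php
// Connect to Manticore over its MySQL-protocol port via PDO.
$pdo = new PDO('mysql:host=127.0.0.1;port=9306', '', '', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

// Run one of the logged queries as a prepared statement.
$stmt = $pdo->prepare(
    'SELECT * FROM tiki_main_65de599384a6c WHERE object_type=? AND object_id=? LIMIT 50 OPTION cutoff=0'
);
$stmt->execute(['trackeritem', '2847']);
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

// Depending on PDO::ATTR_EMULATE_PREPARES, PDO may emulate prepares client-side,
// which is one behavioral difference from running the same SQL via the mysql CLI.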

tomatolog commented 6 months ago

Could you restart the daemon with the --logdebug option, cause the hang, stop the daemon, and upload searchd.log to the S3 storage?

While trying to reproduce the issue with the data you provided, I see these warnings in the daemon log:

WARNING: rt: table tiki_main_65e705bcdee87 failed to save disk chunk l4g111/tiki_main_65e705bcdee87/tiki_main_65e705bcdee87.0: error creating 'l4g111/tiki_main_65e705bcdee87/tiki_main_65e705bcdee87.0.spidx.983.tmp': Too many open files
WARNING: rt: table tiki_main_65e705bcdee87 failed to load disk chunk after RAM save: error creating 'l4g111/tiki_main_65e705bcdee87/tiki_main_65e705bcdee87.0.spidx.983.tmp': Too many open files
WARNING: rt: table tiki_main_65e705bcdee87 failed to save disk chunk l4g111/tiki_main_65e705bcdee87/tiki_main_65e705bcdee87.1: error creating 'l4g111/tiki_main_65e705bcdee87/tiki_main_65e705bcdee87.1.spidx.983.tmp': Too many open files
WARNING: rt: table tiki_main_65e705bcdee87 failed to load disk chunk after RAM save: error creating 'l4g111/tiki_main_65e705bcdee87/tiki_main_65e705bcdee87.1.spidx.983.tmp': Too many open files

If you see the same behavior, you need to increase the open file limit for the user who starts the daemon.
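
For example (a sketch; it assumes searchd is managed by systemd under the manticore unit name, adjust to your setup):

# systemd-managed searchd: raise the service's file-descriptor limit
sudo systemctl edit manticore        # add:  [Service]  LimitNOFILE=1048576
sudo systemctl restart manticore

# manually started searchd: raise the shell limit before starting it
ulimit -n 1048576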

kroky commented 6 months ago

Yes, I saw the 'Too many open files' error. The limit was 1024, so I increased it to 1048576. Trying again results in the same hang with increasing memory usage. I uploaded the searchd.log with debug messages.

marclaporte commented 6 months ago

As per https://manual.manticoresearch.com/Server_settings/Searchd#max_open_files, the setting should be changed (1) for Manticore and (2) for the OS. This solved the issue I was having on one database (the same software as @kroky's but a different data set: a large database running Tiki master, soon to be v27).
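
For reference, the Manticore side is just the searchd setting (sketch only; the exact config path and layout depend on the install):

searchd {
    ...
    max_open_files = max    # let searchd use up to the OS hard limit
}

The OS side is the per-user or per-service limit (ulimit / LimitNOFILE), as discussed above.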

tomatolog commented 6 months ago

I am unable to reproduce the hang with the SQL file you provided.

The only difference between our indexing and this mysql command line client execution is that we use PHP PDO library to connect to port 9306 and execute those queries via prepared statements in most cases.

Could you upload the PHP script that uses the PDO library as you described and hangs the daemon, or perhaps a Dockerfile that reproduces the case in a container, so we can make sure the environment is the same?

kroky commented 6 months ago

I'll need to figure that out, as the software behind this is pretty big and depends on a complex DB to generate the SQL queries.

marclaporte commented 6 months ago

This solved the issue I was having on one database

But it doesn't solve it on a large database on Tiki 26.x

tomatolog commented 6 months ago

If you provide a reproducible case, I can investigate the issue and fix it. For now, I do not see the crash or any other cause of the issue you are describing.

I think this could happen if SI generation hits the open FD limit, as described in https://github.com/manticoresoftware/manticoresearch/issues/1914. In that case, setting the open file limit to a higher value, or disabling SI by exporting LIB_SECONDARY= (empty) before starting the daemon, should fix it. However, since the issue still persists, I still need a reproducible example to investigate it.
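
For example (sketch; the config path shown is the usual package default and may differ on your system):

# start the daemon with secondary indexes disabled
export LIB_SECONDARY=
searchd --config /etc/manticoresearch/manticore.conf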

marclaporte commented 6 months ago

Some related analysis has been added here: https://github.com/manticoresoftware/manticoresearch/issues/1914

tomatolog commented 6 months ago

@kroky said he increased the open file limit but the issue still exists - so the related issue does not seem to be the cause of the bug described here.

sanikolaev commented 6 months ago

@kroky, could you attempt another run, checking the open files limits via /proc/pid/limits for searchd? Please ensure that there is indeed no limit, or that it is set very high.
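
For example:

grep 'open files' /proc/$(pidof searchd)/limits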

kroky commented 6 months ago

I am trying to prepare a Docker image for you to test with.

kroky commented 6 months ago

Actually, having max_open_files = max in the searchd config, together with a sufficient system open files limit, solved the issue, and the index is rebuilt successfully. I think we can close this one in favor of #1914.