gigablast / open-source-search-engine

Nov 20 2017 -- A distributed open source search engine and spider/crawler written in C/C++ for Linux on Intel/AMD. From gigablast dot com, which has binaries for download. See the README.md file at the very bottom of this page for instructions.
Apache License 2.0

Master Crashing when using Spider #61

Open MikeLx opened 8 years ago

MikeLx commented 8 years ago

The new Master file, with the merges from ia and diffbot-testing, crashes when the spider is used to load pages into the index at relatively high speed. The configuration was a 4-node non-mirrored cluster on a CentOS 7 machine with 24 cores, 64 GB of memory, and the engine directory on a Samsung 850 Pro 2TB SSD. This does not happen with diffbot-testing on the same machine with the same configuration. Diffbot-testing is solid.

gigablast commented 8 years ago

do you know where the crash occurs?

can you execute the addr2line statements at the end of the crashed gb's log file and see what they say? at least the first 5 or so.
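A minimal sketch of one way to do this, assuming the log contains hint lines of the form `addr2line -e gb 0x...` (as gb prints after a seg fault) and that the `gb` binary being decoded is the same build that crashed. The helper names `extract_addresses`, `build_command`, and `decode` are hypothetical, not part of Gigablast.

```python
import re
import subprocess

# Matches the hint lines gb prints after a seg fault, e.g.
# "1444905391060 002 addr2line -e gb 0x574f4f".
ADDR_RE = re.compile(r"addr2line -e gb (0x[0-9a-fA-F]+)")


def extract_addresses(log_text, limit=None):
    """Return stack addresses in the order they appear in the log text."""
    addrs = ADDR_RE.findall(log_text)
    return addrs if limit is None else addrs[:limit]


def build_command(addrs, binary="gb"):
    """Build a single addr2line invocation that decodes every address."""
    return ["addr2line", "-e", binary] + list(addrs)


def decode(log_path, binary="gb", limit=5):
    """Read a log file and return addr2line's output for the first frames.

    Assumes the log holds a single crash trace; with several traces the
    first `limit` addresses come from the earliest one.
    """
    with open(log_path) as f:
        addrs = extract_addresses(f.read(), limit=limit)
    if not addrs:
        return ""
    result = subprocess.run(build_command(addrs, binary),
                            capture_output=True, text=True)
    return result.stdout
```

Calling `decode("path/to/log", binary="./gb")` from the engine directory would print one `file:line` per address. Addresses far outside the binary's own range (the `0x7fe5...` style ones) most likely belong to shared libraries and will probably come back as `??:0`.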

MikeLx commented 8 years ago

This is part of the log file generated after the shard went dead and auto-restarted. Let me know if you need more of the log.

site=www.maphappy.org siteroot=0 pathdepth=3 lastindexed=Oct-15-2015(10:22:17)(1444904537) contentinjected=0 urlinjected=0 isaddurl=0 errcnt=0 url=http://maphappy.org/2015/04/say-it-right-can-you-split-this-bill/ : Doc unchanged
1444905389103 002 build: coll=main collnum=0 ip=192.0.78.12 firstip=192.0.78.12 spidered=Oct-15-2015(10:36:28)(1444905388) scheduledtime=Oct-15-2015(06:31:48)(1444890708) discoverydate=Oct-15-2015(06:31:48)(1444890708) firsttime=1 goodinlinks=0 docid=200835723496 siteinlinks=0000 siterank=0 pageinlinks=0001 uh48=101630783377064 charset=UTF-8 ctype=text parentlang=01(en) lang=01(en) hopcount=01 contentlen=000042 robotstxtlen=0704 robotsallowed=1 ch32=1842720831 dh32=2146651843 sh32=3397914660 isrss=0 hasrssoutlink=0 addlistsize=00125 addspiderreqsize=00000 addspiderrepsize=00076 addstatusdocsize=00000 thumbnail=none urlfilternum=17 diffboterror=0 site=blog.ninds.nih.gov siteroot=0 pathdepth=0 contentinjected=0 urlinjected=0 isaddurl=0 spiderlinks=1 exactcontenthash=17646883285145208533 oldpriority=40 errcnt=0 httpstatus=405 url=http://blog.ninds.nih.gov/xmlrpc.php : Doc bad http status
1444905389366 002 build: coll=automobile collnum=6 ip=149.174.144.29 firstip=149.174.144.29 fakesreqfirstip=149.174.144.44 spidered=Oct-15-2015(10:36:28)(1444905388) scheduledtime=Oct-15-2015(10:31:35)(1444905095) discoverydate=Oct-15-2015(10:31:35)(1444905095) firsttime=1 docid=257715817993 siteinlinks=0482 siterank=11 pageinlinks=0001 uh48=131099599534409 charset=UTF-8 ctype=html parentlang=01(en) lang=01(en) hopcount=01 contentlen=036645 robotstxtlen=0894 robotsallowed=1 ch32=2056500715 dh32=3278583608 sh32=4289727567 ispermalink=0 isrss=0 addlistsize=00077 addspiderreqsize=00000 addspiderrepsize=00076 addstatusdocsize=00000 thumbnail=none diffboterror=0 site=www.autoblog.com siteroot=0 pathdepth=2 lastindexed=Oct-15-2015(07:16:15)(1444893375) contentinjected=0 urlinjected=0 isaddurl=0 oldurlfilternum=17 oldpriority=40 errcnt=0 url=http://www.autoblog.com/contact/feedback/ : Doc unchanged
1444905389373 002 msg22: could not find title rec for docid 9940255682 collnum=0
1444905389373 002 db: Had error getting title record from titledb: Record not found.
1444905389373 002 net: Multicast got error in reply from hostId 2 (msgType=0x22 transId=55644 nice=2 net=default): Record not found.
1444905389373 002 db: Had error getting title record for docId of 9940255682: Record not found.
1444905389373 002 query: Had error generating msg20 reply for d=9940255682: Record not found
1444905389501 002 msg22: could not find title rec for docid 75520345967 collnum=3
1444905389501 002 db: Had error getting title record from titledb: Record not found.
1444905389501 002 net: Multicast got error in reply from hostId 2 (msgType=0x22 transId=55660 nice=2 net=default): Record not found.
1444905389501 002 db: Had error getting title record for docId of 75520345967: Record not found.
1444905389501 002 query: Had error generating msg20 reply for d=75520345967: Record not found
1444905390279 002 table: grewtable posdb-indx from 65536 to 131072 slots in 3 ms (this=0x7fff7a256000) (used=16384)
1444905390292 002 xmldoc: reallocated big table! bad. old=65536 new=131072 nw=14125
1444905390308 002 build: coll=viral collnum=3 ip=137.116.112.78 firstip=137.116.112.78 spidered=Oct-15-2015(10:36:29)(1444905389) scheduledtime=Oct-15-2015(10:13:18)(1444903998) discoverydate=Oct-15-2015(10:13:18)(1444903998) firstindexed=Oct-15-2015(10:36:29)(1444905389) firsttime=1 goodinlinks=0 docid=46417695433 siteinlinks=0000 siterank=0 pageinlinks=0001 uh48=269426766497865 charset=UTF-8 ctype=xml parentlang=01(en) lang=01(en) country=00(zz) hopcount=01 contentlen=060210 robotstxtlen=0162 robotsallowed=1 ch32=3796421057 dh32=2889548619 sh32=3255929461 ispermalink=0 isrss=1 hasrssoutlink=0 outlinksadded=0040 outlinksaddedfromsamedomain=0040 addlistsize=547049 addspiderreqsize=00000 addspiderrepsize=00076 addstatusdocsize=00000 thumbnail=none priority=45 urlfilternum=13 diffboterror=0 site=www.playbuzz.com siteroot=0 pathdepth=1 contentinjected=0 urlinjected=0 isaddurl=0 spiderlinks=1 exactcontenthash=10526314807856479511 isadult=0 errcnt=0 url=http://www.playbuzz.com/rss/Funny : Success
1444905390441 002 msg22: could not find title rec for docid 28622385364 collnum=0
1444905390441 002 db: Had error getting title record from titledb: Record not found.
1444905390441 002 net: Multicast got error in reply from hostId 2 (msgType=0x22 transId=55752 nice=2 net=default): Record not found.
1444905390441 002 db: Had error getting title record for docId of 28622385364: Record not found.
1444905390441 002 query: Had error generating msg20 reply for d=28622385364: Record not found
1444905390862 002 tagdb: corrupt tag recsize 1344300153
1444905390862 002 mem: addMem(1344301308): SafeBuf. ptr=0x7fe530994014 used=1936586166
1444905391060 002 loop: sigbadhandler. disabling handler from recall.
1444905391060 002 gb: seg fault. printing stack trace. use 'addr2line -e gb' to decode the hex below.
1444905391060 002 addr2line -e gb 0x574f4f
1444905391060 002 addr2line -e gb 0x575c7f
1444905391060 002 addr2line -e gb 0x7fe5ff12c130
1444905391060 002 addr2line -e gb 0x7fe5fdf78564
1444905391060 002 addr2line -e gb 0x7fe5fdf721d7
1444905391060 002 addr2line -e gb 0x5f0767
1444905391060 002 addr2line -e gb 0x6715e5
1444905391060 002 addr2line -e gb 0x493952
1444905391060 002 addr2line -e gb 0x672de5
1444905391061 002 addr2line -e gb 0x56cb7d
1444905391061 002 addr2line -e gb 0x4f3ad0
1444905391061 002 addr2line -e gb 0x4f4471
1444905391061 002 addr2line -e gb 0x4f495a
1444905391061 002 addr2line -e gb 0x574640
1444905391061 002 addr2line -e gb 0x57540f
1444905391061 002 addr2line -e gb 0x57597d
1444905391061 002 addr2line -e gb 0x40bdc7
1444905391061 002 addr2line -e gb 0x40593c
1444905391061 002 addr2line -e gb 0x7fe5fdf0aaf5
1444905391061 002 addr2line -e gb 0x4057c9
1444905391061 002 loop: sigbadhandler. trying to save now. mode=0
1444905391061 002 mem: checking mem for breeches
1444905391133 002 gb: Shutting down urgently. Timed try #0.
1444905391133 002 gb: disabling threads
1444905391134 002 gb: trying to shutdown
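
For anyone triaging a paste like this, the entries just before the seg fault (for instance the `tagdb` and `mem` lines) are easy to miss among the long `build:` records. Below is a rough sketch that splits the log text into entries and lists the non-build entries in the second leading up to the crash. It assumes each entry begins with a 13-digit millisecond timestamp followed by a 3-digit field (taken here to be the host id, which is a guess from the log itself), and it works whether or not the line breaks survived the paste. The function names are hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumed entry prefix: 13-digit epoch milliseconds, a space, a 3-digit
# field (presumably the host id), a space, then the message.
ENTRY_RE = re.compile(r"\b(\d{13}) (\d{3}) ")


def split_entries(text):
    """Split log text into (timestamp_ms, host_field, message) tuples."""
    matches = list(ENTRY_RE.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entries.append((int(m.group(1)), m.group(2), text[m.end():end].strip()))
    return entries


def lead_up(entries, window_ms=1000, skip_prefixes=("build:",)):
    """Return (timestamp_ms, message) pairs logged before the first
    'seg fault' entry, within window_ms of it, skipping noisy prefixes."""
    for idx, (crash_ts, _, msg) in enumerate(entries):
        if "seg fault" in msg:
            return [(ts, m) for ts, _, m in entries[:idx]
                    if crash_ts - ts <= window_ms
                    and not m.startswith(skip_prefixes)]
    return []


def to_utc(ts_ms):
    """Convert a millisecond epoch timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
```

Running `lead_up(split_entries(log_text))` on the log above would surface the table growth, the repeated "Record not found" errors, and the corrupt tagdb record size immediately before the crash. Which of those, if any, is the actual cause is not something the sketch can tell you.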