gigablast / open-source-search-engine

Nov 20 2017 -- A distributed open source search engine and spider/crawler written in C/C++ for Linux on Intel/AMD. From gigablast dot com, which has binaries for download. See the README.md file at the very bottom of this page for instructions.
Apache License 2.0

mem: addMem: Init failed. Disabling checks. #187

Open tcreek opened 2 years ago

tcreek commented 2 years ago

You know, the requirements to run this are not included. I get the following error when trying to run it:

mem: addMem: Init failed. Disabling checks.
Segmentation fault (core dumped)

The system has 8GB of RAM. Is this error due to lack of memory, or something else?

tcreek commented 2 years ago

kernel: [  104.432073] gb[925]: segfault at 15abcc0 ip 0000559b49f4d6eb sp 00007ffc643b08d0 error 4 in gb[559b49dff000+2ee000]
kernel: [  104.432087] Code: f6 4b 89 45 dc 48 8b 45 a8 8b 48 38 8b 45 dc ba 00 00 00 00 f7 f1 89 55 fc 48 8b 05 07 b8 96 00 8b 55 fc 48 c1 e2 03 48 01 d0 <48> 8b 00 48 85 c0 74 33 48 8b 05 ee b7 96 00 8b 55 fc 48 c1 e2 03

I see I have a couple of kernel events from trying to run it.

tcreek commented 2 years ago

Was able to get it running. It needs higher privileges to run.

Why is that the case???
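The thread never pins down why gb needs elevated privileges. One common reason a server fails its startup allocations under an unprivileged account is a per-user resource limit, so a first step is simply to inspect the current limits. This is a diagnostic sketch under that assumption, not a confirmed explanation for gb's behavior:

```shell
# Inspect the per-process limits that most often break startup allocations.
# (Assumption: gb's "addMem: Init failed" is limit-related; unverified.)
ulimit -n   # max open file descriptors
ulimit -v   # max virtual memory in kB, or "unlimited"
ulimit -l   # max locked memory in kB, or "unlimited"
```

If one of these is low, raising it in `/etc/security/limits.conf` (or running as root) may be less drastic than running the whole daemon with sudo.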

lisawebcoder commented 2 years ago

i had a seg fault, then i fixed it: https://github.com/gigablast/open-source-search-engine/issues/171

but i still can't run multiple shards or nodes; i get "the shard or database node is not running or dead". i don't know if it's because i don't have enough RAM? i think i had 4GB of RAM. but it's a complex set of steps and i get lost if i don't do the steps every day. did you succeed with many shards? if so, can you list the steps, because the original instructions are a little confusing, at least for me

tcreek commented 2 years ago

Thanks

> i had a seg fault, then i fixed it: #171
>
> but i still can't run multiple shards or nodes; i get "the shard or database node is not running or dead". i don't know if it's because i don't have enough RAM? i think i had 4GB of RAM. but it's a complex set of steps and i get lost if i don't do the steps every day. did you succeed with many shards? if so, can you list the steps, because the original instructions are a little confusing, at least for me

Thanks for the response. I did get it running, but it soon crashed from lack of memory. Maybe we need at least 16GB? I guess I could test on another machine where I have that much memory.

I will try recompiling with the suggestion in issue #171. At this point, I am just trying to get it to stay running on one machine.

lisawebcoder commented 2 years ago

yes, it always needs lots of RAM. ok cool, if u have a machine w/ 16gb of memory then post an update here on how it goes

tcreek commented 2 years ago

On a computer with 16GB of RAM

1652267168321 000 mem: system malloc(75497480,dmdm) availShouldBe=6271995513: Cannot allocate memory (dmdm) (ooms suppressed since last log msg = 0)
1652267168321 000 spell: Could not load unified dict from unifiedDict-buf.txt and unifiedDict-map.dat
Failed to start gb. Exiting.
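The log line is odd: malloc is asking for only ~75 MB while `availShouldBe` reports ~6 GB, so the machine is not necessarily out of RAM. Kernel overcommit policy or a per-process cap can also make malloc fail. These are generic Linux diagnostics to tell those cases apart, not a confirmed fix for gb:

```shell
# Distinguish "no RAM left" from "allocation blocked by policy or limits".
free -h                             # actual used vs. available memory
cat /proc/sys/vm/overcommit_memory  # 0 heuristic, 1 always allow, 2 strict accounting
ulimit -v                           # per-process virtual memory cap, if any
```

With `overcommit_memory=2` in particular, large allocations can fail even on a mostly idle 16 GB box, because the kernel accounts committed (not used) memory.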

lisawebcoder commented 2 years ago

wow, did you try my fix in #171? but wait, this is a different error code. just on one node, right? which ubuntu or linux distro are you on?

lisawebcoder commented 2 years ago

i will try it myself, maybe sunday or next week, and update here

tcreek commented 2 years ago

This is just on one node. At this point, I am just trying to get it to run, and stay running, on a single node before moving on.

I wonder what is going on with the development of this.

lisawebcoder commented 2 years ago

I actually got it to run on Ubuntu 14, 16, 18 and 20, and I used the built-in admin CMS to crawl some web pages to add to the index and ran some search queries, all on 1 node, because I don't know yet how to set up many machines or nodes. So I put it aside and just created an API, which itself has many bugs and glitches but works. But the dev does not seem consistent on this project, I don't know. I will try running it next week and post my steps here if it still works

tcreek commented 2 years ago

Sorry, I forgot to mention Debian 11

lisawebcoder commented 2 years ago

Try using Ubuntu 14 or 16. But i think debian 11 is not compatible

onlyjob commented 2 years ago

> Try using Ubuntu 14 or 16

Never use Ubuntu for anything. It is a redundant derivative. Debian is always better.

lisawebcoder commented 2 years ago

Yes, ok, 14 and 16 are old versions, true. I got it to work on Ubuntu 20. But ok, i will try debian

tcreek commented 2 years ago

> Yes, ok, 14 and 16 are old versions, true. I got it to work on Ubuntu 20. But ok, i will try debian

Did you try Debian yet? How much memory do you have on your test machines?

lisawebcoder commented 2 years ago

Hello. No, I didn't get a chance to run it on Debian. I'm no expert on the distros, but I thought Debian was Ubuntu, so I'm a little confused. But if you got it to work on Ubuntu 20, I personally think that's awesome and a good OS version. I had only 4gb of RAM

tcreek commented 2 years ago

Strange that you are able to run it on 4GB of RAM, but I keep running out of memory on 8GB and 16GB systems.

Have you had a chance to run it yet on Debian?

lisawebcoder commented 2 years ago

Hello. No, I won't run it on Debian anytime soon. For now I left it on standby on Ubuntu 20, just on 1 node, yes with 4 gb RAM. I don't know why you get out-of-memory errors with more RAM. Are you on VirtualBox, or do you have Linux as your host OS? But it's a complex source code. The admin is good for crawling the web and putting pages in the database, but I never got it to work properly on the front end; I use the API, and even that has glitches

tcreek commented 2 years ago

I am on bare metal. One is a quad-core Celeron J1800 w/ 8GB, and the other is a dual-core (hyper-threaded) i7 w/ 16GB of RAM. The first one has Debian 10, and the latter has Debian 11

lisawebcoder commented 2 years ago

Hello

That all looks really good. Maybe the source code you pulled is corrupt; I recall some repos gave me different errors, but I don't really know

lisawebcoder commented 2 years ago

It seems i don't have my Ubuntu 20 OS anymore. I just ran it on Ubuntu 16 and it runs. I am on Windows 10 with VirtualBox Ubuntu 16, and my total system has 4 gb, which means the VirtualBox Ubuntu 16 OS is probably using 2gb RAM. I will try to send the source code i used here

lisawebcoder commented 2 years ago

i don't know how to add a zip/tar file here; it's complex, or there is no feature to allow this, sorry

tcreek commented 2 years ago

Actually i use the "git clone" function. If there was an issue with some code, it would fail to compile
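For anyone following along, the clone-and-build flow being discussed looks roughly like this. It is a sketch assuming the repo's stock Makefile and the `gb` binary name used elsewhere in this thread; neither is re-verified here:

```shell
# Fetch and build from source (assumed Makefile-based build, per the repo README).
git clone https://github.com/gigablast/open-source-search-engine.git
cd open-source-search-engine
make -j"$(nproc)"   # compile the gb binary
./gb                # start a single node from the build directory
```

If `git clone` misbehaves, downloading a tarball with `wget` and extracting it is an equivalent starting point, as mentioned below.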

lisawebcoder commented 2 years ago

Ya, should be good. I actually used the wget method cuz i did get errors with git clone, but ya, both should be good

tcreek commented 2 years ago

Did you ever do your testing?

Now even a touted search site that Gigablast was promoting on their own site is broken: Private.sh has been down for a month, with no means to even contact them.

I think I am going to stop with this Gigablast nonsense and move on to SearX

lisawebcoder commented 2 years ago

Hello, how are you. I'm sorry to hear it's not working out. No, I could never go further than 1 node, on anything from Ubuntu 16 to 20, and never on Debian either.

I only created an API version, but even there it's got endpoint bugs. It works on my WiFi at my place, but if someone else tries from another place, the URL endpoint has a different code key like &erfd=45638754. So if you try my search web app you will probably get an error in the console like "Unknown character <", because it's sending HTML, not JSON. But if you go to Gigablast.com, search the same query, then on the results page click the drop-down gear icon and select JSON instead of HTML, you will get the JSON output; check what the parameter key in the URL is, like it might be &rtgh=5698745. You can post that key parameter here and I will adjust my source code to add it, and then it should work for you: http://products.thefriendsnetwork.ca/HTML/Data/

But anyway, is SearX open source? If so I will try it, cuz in the meantime I'm making my own simple search machine with its own database

tcreek commented 2 years ago

https://github.com/searx/searx

lisawebcoder commented 2 years ago

Thank you for the source code link. I got to the part about updating pip boilerplate and I get errors. I am in VirtualBox Ubuntu 16

tcreek commented 2 years ago

Not sure why you would be using such an old and unsupported OS.

lisawebcoder commented 2 years ago

Ok, you're right. I will install the Ubuntu 20 OS

lisawebcoder commented 2 years ago

hello, i'm on ubuntu 20. i don't quite understand this:

> To install searx’s dependencies, exit the searx bash session you opened above and restart a new. Before install, first check if your virtualenv was sourced from the login (~/.profile):

so i close my terminal, re-open a new terminal, and enter the command (~/.profile)?

lisawebcoder commented 2 years ago

```
lisa@ubuntu:~/Desktop/searcx$ ~/.profile
bash: /home/lisa/.profile: Permission denied
lisa@ubuntu:~/Desktop/searcx$
```
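The "Permission denied" here comes from typing `~/.profile` on its own, which tries to *execute* the file (it is not marked executable). What the searx docs mean is to *source* it into the current shell. A minimal demonstration with a throwaway file so it runs anywhere:

```shell
# Sourcing a file runs its lines in the current shell, so its variables stick.
# /tmp/demo_profile is a stand-in for ~/.profile used only for this demo.
printf 'GREETING=hello\n' > /tmp/demo_profile
. /tmp/demo_profile     # POSIX form; bash also accepts: source /tmp/demo_profile
echo "$GREETING"        # prints "hello", proving the variable was loaded
# For the real case, from a fresh terminal:  . ~/.profile
```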

lisawebcoder commented 2 years ago

```
(searx)$ command -v python && python --version
/usr/local/searx/searx-pyenv/bin/python
Python 3.8.1
```

i don't understand this either; it takes me to python idle

update: it's ok, i'm continuing. i'm updating pip, ok, will update here

lisawebcoder commented 2 years ago

i am at the configuration section, but i don't understand it

tcreek commented 2 years ago

You are probably better off asking them. It seems they have a more active community than here at Gigablast, which is basically nothing. There is an IRC channel on libera.chat: #searx

You may also have to run it with elevated permissions, or as a special user belonging to a group.

lisawebcoder commented 2 years ago

hello, i am stuck: i created a password for searx and i still get errors: "searx is not in the sudoers file. This incident will be reported."
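That message means the `searx` account has no sudo rights at all; setting its password does not grant them. A sketch of the usual fix, assuming a Debian/Ubuntu-style setup where sudo rights come from membership in the `sudo` group (some distros use `wheel` instead):

```shell
# Show which groups the current account belongs to; "sudo" should be listed
# for an account that can use sudo on Debian/Ubuntu.
id -nG
# From an account that ALREADY has sudo rights, grant them to searx:
#   sudo usermod -aG sudo searx
# searx must log out and back in before the new group membership applies.
```

The workaround in the next comment (running `sudo su` from the original admin account) achieves the same thing without touching group membership.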

lisawebcoder commented 2 years ago

ok, at that step i opened a second terminal and kept the 1st open, i guess, and then i did sudo su to be root and continued. it runs in the browser

lisawebcoder commented 2 years ago

does this search app have auto complete? it doesn't so far, and i don't know where to add or change it in the source code; it's a must for sure. but anyway i will continue w/ my own search machine. but i know pythonanywhere won't run it, cuz u need a paid account to get root and sudo privileges. regards

lisawebcoder commented 2 years ago

it has auto complete in the admin preferences

tcreek commented 2 years ago

Turns out SearX is a meta search engine, which means it uses other search engines to return results. I think I would prefer to be in more control of my searches, like Gigablast would allow

lisawebcoder commented 2 years ago

Hello, how are you. Yes, a meta search; Gigablast fails there too if you select it. I'm still trying to finish my own search machine/engine, with a version good enough to use daily without too much dependence on the other search engines. I just have a web crawler, then a MySQL database, then a front end in PHP and HTML. Still missing: auto complete, and removing whitespace from user input