Open streety opened 6 years ago
I'm actually using MySQL, since I've wanted to get more experience with it. So I set it up to create a full-text search index, which might take a while but it's doing it. For future reference, this part should probably be done last, right before the board goes live.
The search seems to have the complete index now but it is slow. Actually, for a simple/common phrase it timeouts entirely. If you search for 'ethanol distillation' it will load for >30 secs and then return a blank page.
The MySQL fulltext search worked, but left a lot to be desired. Currently trying to get sphinxsearch
working:
http://sphinxsearch.com/docs/current.html
If we can get Sphinx powering the fulltext search, that'd blow away all the other search options that are available. I should have reopened this as soon as I started working on it.
It doesn't seem to get the attention of Lucene, Solr, etc but I was very happy with it when I used it back in 2010. I'm sure it will work well here.
Any chance you could take a look at the issue with Sphinx? It's set up to emulate MySQL on port 9306 and has its regular interface on port 9312. You can query it by running mysql -h0 -P9306
then entering something like:
SELECT * FROM index_phpbb_sm_main WHERE MATCH('propylene carbonate');
Config file is at /etc/sphinxsearch/sphinx.conf
. Relevant phpBB source files are at:
/phpbb/search/fulltext_sphinx.php
/phpbb/search/sphinx/*
/includes/sphinxapi.php
I haven't set up refreshing with crontab yet, but it's done the first initial scan of the database easily enough. Then I can start work on a new parsing system, which I'm sure I'd be able to make much faster progress on.
For future reference though, we probably should configure the search last. Reason being, in-memory databases like Redis or Sphinx are obviously memory hogs, and so is running a virtual machine. So it probably makes the most sense to not set up that type of search until after we're done pulling data from the VM.
Okay, I'll take a look at this. Might not have time today but definitely later in the week.
On Mon, Oct 1, 2018, 9:30 AM Tom Oldani notifications@github.com wrote:
Any chance you could take a look at the issue with Sphinx? It's set up to emulate MySQL on port 9306 and has its regular interface on port 9312. You can query it by running mysql -h0 -P9306 then entering something like:
SELECT * FROM index_phpbb_sm_main WHERE MATCH('propylene carbonate');
Config file is at /etc/sphinxsearch/sphinx.conf. Relevant phpBB source files are at: /phpbb/search/fulltext_sphinx.php /phpbb/search/sphinx/* /includes/sphinxapi.php
I haven't set up refreshing with crontab yet, but it's done the first initial scan of the database easily enough. Then I can start work on a new parsing system, which I'm sure I'd be able to make much faster progress on.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/toldani/sm-transition/issues/5#issuecomment-425908137, or mute the thread https://github.com/notifications/unsubscribe-auth/AAs75XRqt2DBBclH4rnJC4SrYpFDcvP0ks5ughj7gaJpZM4XAgQL .
I started the sphinx service and ran a few queries. Looking at the logs at /var/log/sphinxsearch/query.log the issue doesn't seem to be with sphinx. The queries are running quickly, in a few hundredths of a second.
I couldn't connect to sphinx via the mysql command. I haven't looked into that further. Doesn't seem to be vital.
searchd currently seems to be using 1.8% memory. The VM is using 22.1% and 17.17, mysql is using 19.1% for comparison.
I have not had any queries timeout but some remain slow.
I'll have to take a break for a while. Next step is probably profiling the PHP side. Will hopefully return late afternoon/early evening.
Currently search seems to be incomplete. Does the search index need to be rebuilt?
Should we switch to postgresql full text?