Closed judahnator closed 6 years ago
This seems legit at first look but has some downsides. Using the saveToIndex
function directly will not update the total documents number which will affect the BM25 matching algorithm. Also, there is no speed gain comparing to php artisan scout:import App\\Article
which includes everything from the toSearchableArray
function. The tntsearch:import
command is very fast because it leverages MYSQL_ATTR_USE_BUFFERED_QUERY and this is achieved by querying the db table directly
For the total documents number, that should be as simple as adding this before the script finishes.
$indexer->updateInfoTable('total_documents', $model::count());
I will have to look into the rest, though. I have done some testing between scout:import
and tntsearch:import
and I am generally more satisfied with the results of the later. I will have to dig around and see why that might be.
My other PR ( #130) left me with a bad taste in my mouth. I tossed and turned all night, not content with leaving data that the user has displayed an interest in making searchable out of the search index.
I spent some time reverse engineering how the data was being inserted into the indexes, and I believe I have come up with a solution that will allow the indexer to index all the data provided in the
toSearchableArray()
method and not just the available columns in the database.Lets say I have a class, "Article," and it looks like this:
Before, when the indexer ran it would have all of the authors normal attributes, plus the appended "published_at_atom" and the additional "author" item. Unfortunately the indexer would ignore these two attributes, because they did not exist as columns in the database. Now though, I can do something like
App\Article::search('author name')
and have it return relevant results, where before it would not.With this new change, instead of manually writing queries we can use native Laravel functions. By doing this Laravel will give us all the data we need in the
toSearchableArray()
method.We can map over the
toSearchableArray()
output to prepare the word stems, and save those to the index directly without the need for raw SQL.This has some huge advantages, but comes with one obvious disadvantage. In my testing I have seen a 2-3x increase in the time it takes to build search indexes. This might be solved down the road by having the user hint some relationships to eager load, but we can cross that bridge later.