Closed chrisvidal closed 2 years ago
Hi, of course. Json file used only for example. Moreover, it is even better to cache in Redis/etc. Index file data is an associative array. You may choose the storage option at your discretion.
The only thing to understand. Index data must be fully loaded into script memory before starting the search.
$searchIndex->setData($indexData);
Thanks for your reply.
Let me see if I understand properly:
$searchIndex->addRecord()
method or?select * from index
then loop through all the results to build an array and do the ->setData($indexData)
?name
and description
for instance?well, obviously, I have no clue how to use this lib, and would love some more doc for a real case like mine.
thanks
Working with faceted search consists of 2 stages.
Stage 1. Building the index. We need to collect the faceted index in the background (similar to how Sphinx does it).
Fetch all rows from the database that need to be indexed and pass them to a faceted index.
$index->add(...)
Then save the resulting index to the database.
$data = $index->getData();
$yourDbStorage->saveIndexData($data);
You need to re-index periodically, depending on how often your data changes. Sometimes it is enough to do this once a day, there are times that you need to re-index in the background every 10 minutes.
Stage 2. Search When a user comes to the site and makes a request, we fetch the index data from the DB/cache and do
$index->setData($yourDbStorage->loadIndexData());
Then we search
$search->find(...);
Faceted search itself is not full-text and cannot search for incomplete matches. It only searches for a complete match of the attribute value.
It is designed to filter by attributes values.
How to be with fultext-search: https://github.com/k-samuel/faceted-search/issues/3
Thanks a lot @k-samuel for this.
I totally get the concept of it all. thanks.
It is the concrete implementation of the index that I have difficulty to envion.
How does this look like in reality $yourDbStorage->saveIndexData($data);
?
$data
is an array. How can I save this array?
I was thinking to sync the index whenever a record is updated but it does not seem to be the principle at use here?
The easiest option in pseudocode
$sql = ' INSERT INTO my_facet_table (`category`,`index_data`) VALUES ("my_index_name", "'.json_encode($index->getData()).'")';
$query->execute($sql);
where index_data is LONGTEXT or JSON column
At this moment realtime index option is not implemented. Only full indexing.
Workaround. You can create update queue for records and launch indexer after update event. For example after uploading pricelist or batch update of product records.
Thanks for your reply.
Let me see if I understand properly:
* I need to maintain the DB index table: do I need to use the `$searchIndex->addRecord()` method or? * * if I need to handle my own Index table in the DB, what rules and structure does it need to follow? * when a search occurs on the frontend, not sure what is supposed to be done here on backend: * do myself a `select * from index` then loop through all the results to build an array and do the `->setData($indexData)`? * I can see the use of the filters in the code example; I am not clear if we can search for a string as well in some fields like `name` and `description` for instance?
well, obviously, I have no clue how to use this lib, and would love some more doc for a real case like mine.
thanks
The index data is a black box by design. Think of it as the binary data of the index itself. After collecting them, you need to save them somewhere in order to later reuse.
Try running the example https://github.com/k-samuel/faceted-search/tree/master/examples and see how the code works.
Attribute values are passed as filters to the search, which can be found from $search->findAcceptableFiltersCount / findAcceptableFilters
These are exact matches of values. If you pass the "green" color to the filter, then the attribute should contain exactly that value.
Full-text search by part of the title or phrase is not provided, this is a slightly different direction.
I hope this helps.
yes it helps Thanks @k-samuel
I can spot 3 use cases in my app:
For 1./ I am thinking to create the index in real time after executing the SQL query to retrieve the records that match the searched term, then create the index because I'd like to use the findAcceptableFiltersCount
for frontend
For 2./ similar to 1./ I guess
For 3./ I will use an existing index maintained and refresh every XX minutes.
Sounds good and Interesting. Before that, I did not think that the index can be built on the basis of the query results.
For the first option, there is an alternative way - to get a list of identifiers that you found when searching for a term and pass them to findAcceptableFiltersCount as the second parameter. In this case, you do not have to rebuild the index every time.
$filters = $search->findAcceptableFiltersCount([],$foundIds);
thanks, it sounds great so far.
I am trying your idea
$foundIds
is a flat array of Ids.
$filters
is an empty array.
$data = $search->findAcceptableFiltersCount($filters, $foundIds);
returns an empty array;
any idea why? probably the way I build the index?
...
$companies = $query->get();
foreach ($companies as $item) {
$recordId = $item->id;
$foundIds[] = $recordId;
$itemData = [
'capabilities' => $item['capabilities']->map(function ($value) use ($item) {
return $value->name;
})->toArray(),
'certificates' => $item['certificates']->map(function ($value) use ($item) {
return $value->name;
})->toArray(),
'tacs' => $item['tacs']->map(function ($value) use ($item) {
return $value->name;
})->toArray(),
'industries' => $item['industries']->map(function ($value) use ($item) {
return $value->name;
})->toArray(),
'enterprise_type' => $item->enterprise_type ? $item->enterprise_type->name : null,
'country' => $item->country ? $item->country->name : null,
];
$searchIndex->addRecord($recordId, $itemData);
}
$search = new Search($searchIndex);
$result = [
'filters' => $this->findFilters($search, $filters, $foundIds),
'results' => ['data' => $results, 'count' => count($results), 'limit' => $pageLimit],
];
It looks like a bug. Here is a fix for this case (rare use case). https://github.com/k-samuel/faceted-search/commit/f52b904d61a0f54c06af1b5b7bb5940859da6284
I am planing to release of the new version of the library with fix today at 20:00 Moscow time.
Thank you for finding the problem and helping to improve the project.
awesome! i updated to 1.3.4
and the acceptable filters are now available. Thanks @k-samuel
Is the v2.0
ready for use? What does it bring?
It changes index structure (incompatible change) https://github.com/k-samuel/faceted-search/blob/master/changelog.md Yes, ready to go. We can start a new thread for discussion. If the question according to the database is closed.
Hi, would that be possible to use a DB as an index instead of a json file? Or I am misunderstanding something. My tech stack is MySQL 5.7.x, OctoberCMS/Laravel with Eloquent