Open GiantCrocodile opened 1 year ago
I don't think there is a good way currently to resolve layout and blocks field content first before search, so that their json markup doesn't accidentally is matched. For that, we would need to load the blueprints and check which field is what type. Currently, this would be quite the performance drag.
We could revisit this once our blueprints are refactored and thus not as performance heavy.
Of if anyone has another idea, please add it.
I've an idea 💡
Actually, the right thing is to read the blueprint as @distantnative said, but this will cause serious performance problems. Instead, I think we can easily solve this issue by specifying the data type in the search as we use ->toStructure()
, ->toBlocks()
for field output. Here rough example:
$results = $site->search($query, 'title|adresses:yaml|text:json|tags');
What do you think? We can decide the syntax and data type names later.
I like the idea. It could even be useful in the long run (parsing the blueprints will still be slower than explicit configuration, even after the refactoring). The syntax is great IMO.
Sounds good and sounds more stable than auto-detecting something. So I think the "needs: delay" tag can be removed.
$results = $site->search($query, 'title|adresses:yaml|text:json|tags');
Following the example, I got the $field
for text
and now my search components sees the type specified as json
, what does it do now to retrieve a string it can use for searching?
I see what you mean. Yes, it can be difficult to extract the relevant field data from JSON. For that may need ->toString()
method for each data handler. When the search is called, I think the following code would work: Data::decode($field->value())->toString()
Or it may be better to specify the field type directly, not sure: 'title|adresses:structure|text:blocks|tags'
We could convert the values into a cleaned string, maybe like so:
Data::decode($value, $type)
).Rough code for 2:
function convert(array $array): string {
unset($array['id']);
unset($array['type']);
// ...
foreach ($array as $key => $value) {
if (is_array($value) === true) {
// recurse
$array[$key] = convert($value);
}
}
return implode(' ', array_values($array));
}
Or it may be better to specify the field type directly, not sure:
'title|adresses:structure|text:blocks|tags'
I like that. It would allow us to fine-tuned the ignored keys per field type.
@lukasbestle I think this is the right direction. However, I would suggest to rather have dedicated search methods in the respective classes (Block, StructureObject) which are called from the component when specified. That way those methods can much better decide what fields can be thrown out for their own datatype.
Makes sense 👍
Is this something that could be part of v4.2 or v4.3?
Description
I have a search function to search articles in my blog. This works in general. I just noticed that a specific 2 word search term results in 2 results where I would expect just 1 result.
Search function:
I checked the
article.de.txt
file that gets found and I see that the metadata fieldid
of the blocks field text contains the search string by chance. From a technical perspective it makes sense that the search function finds this. From an user perspective it doesn’t make sense because the article doesn’t contain the 2 chars long search phrase so this can’t be what the user actually searches for.Expected behavior
The metadata is being ignored and thus the search results are what the user will expect.
Your setup
v3.9 and v4.0 alpha
Additional context
see https://forum.getkirby.com/t/page-search-returns-unexpected-result/29317/2?u=warg