rollerworks / search

PHP search-systems made possible
https://rollerworkssearch.readthedocs.io/en/latest/index.html

[Input] SmartQuery #23

Closed sstok closed 4 years ago

sstok commented 10 years ago

StringQuery is great for complex search-conditions but sometimes you just want to search fast and let the system figure it out (be smart).

Quick notes

The SmartQuery 2014-06-25 ; male ; active is the same as this StringQuery:

date: "2014-06-25"; gender: male; status: active;

or (when '2014-06-25' is ambiguous)

gender: male; status: active; *(bday: "2014-06-25"; regdate: "2014-06-25"; )

With multiple ambiguous groups (* marks OR-fields, {} marks an AND-group):

gender: male; { *(bday: "2014-06-25"; regdate: "2014-06-25"); *(status: active; only_active_products: true) }

Note: the AND-group is not supported at the moment and requires multiple changes throughout the system.

Input processor

The SmartQuery works by processing each value in the list through a ValueProcessor. A ValueProcessor first checks whether it supports the value ( acceptsValue($value, $conditionType) ). When that's the case, the value is passed to processValue(ValuesBag $values, $value), which adds the value to the values-bag (as any value type, determined by the method).
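The accept-then-process flow described above could look roughly like the sketch below. It is written in Python only to keep it short and runnable; the method names mirror acceptsValue and processValue from this proposal, while ValuesBag, IntegerProcessor and process_input are hypothetical stand-ins and not the library's API.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from typing import Optional, Protocol


@dataclass
class ValuesBag:
    """Hypothetical stand-in for the values-bag; only single values here."""
    single_values: list = field(default_factory=list)


class ValueProcessor(Protocol):
    """Shape of a processor: first ask, then add."""

    def accepts_value(self, value: str, condition_type: Optional[str]) -> bool: ...

    def process_value(self, values: ValuesBag, value: str) -> None: ...


class IntegerProcessor:
    """Example processor: accepts plain integers, stores them as single values."""

    def accepts_value(self, value: str, condition_type: Optional[str]) -> bool:
        return re.fullmatch(r"[0-9]+", value) is not None

    def process_value(self, values: ValuesBag, value: str) -> None:
        values.single_values.append(int(value))


def process_input(raw: str, processors: list[ValueProcessor]) -> tuple[ValuesBag, list[str]]:
    """Split the input on ';' and pass each value to the first accepting processor."""
    bag = ValuesBag()
    unmatched: list[str] = []
    for value in (part.strip() for part in raw.split(";")):
        if not value:
            continue
        for processor in processors:
            if processor.accepts_value(value, None):
                processor.process_value(bag, value)
                break
        else:
            # No processor accepted the value.
            unmatched.append(value)
    return bag, unmatched
```

Stopping at the first accepting processor is a simplification; the ambiguity handling described further down would keep going and collect every match.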

ValueProcessor

It's the job of the ValueProcessor to determine how the value should be added: 2014 can be seen as a year (a date-range) or as an integer (single-value), but >2014 means 'everything higher than the year 2014' and is thus a comparison-value.

To help with this, the SmartQuery input-processor also supports a condition-type for each value which explicitly defines how the value must be processed.
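To make the 2014 versus >2014 distinction concrete, here is a minimal Python sketch. The condition-type names "year" and "integer", the ValuesBag layout and the add_value function are all assumptions made for illustration; the proposal does not define them.

```python
import re
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ValuesBag:
    """Hypothetical values-bag with one list per value type."""
    single_values: list = field(default_factory=list)
    ranges: list = field(default_factory=list)
    comparisons: list = field(default_factory=list)


# Optional comparison operator followed by a four-digit number.
_PATTERN = re.compile(r"\s*(>=|<=|>|<)?\s*([1-9][0-9]{3})\s*")


def add_value(values: ValuesBag, value: str, condition_type=None) -> None:
    """Add a value as a comparison, a single integer, or a year range."""
    match = _PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"unsupported value: {value!r}")
    operator, digits = match.groups()
    number = int(digits)
    if operator:
        # '>2014' is a comparison-value, whatever the condition type.
        values.comparisons.append((operator, number))
    elif condition_type == "integer":
        # An explicit condition type settles the ambiguity.
        values.single_values.append(number)
    else:
        # Assumption of this sketch: without a hint, read it as a year.
        values.ranges.append((date(number, 1, 1), date(number, 12, 31)))
```

Defaulting to the year reading when no condition type is given is a choice made for this sketch only; a real processor could just as well report the value as ambiguous.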

Ambiguous values

Some values can be ambiguous and may not necessarily match only one processor. In this case the input processor continues with the other processors and keeps track of which fields also support the value. If the number of fields is higher than 1, the value is added to each supported field, and the group holding the fields is marked as OR-fields, meaning that at least one field must match, but not all of them.

If there are multiple ambiguous groups, the groups are placed in a subgroup that is marked as an AND-group, so all the groups are ANDed with each other.

Values that are not ambiguous are kept in the head-group.
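The grouping rules above can be sketched in a few lines of Python. Everything here is hypothetical: the Condition structure, the field names (taken from the examples earlier in this issue) and the per-field matcher functions only illustrate the idea.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Condition:
    # Head-group: unambiguous values, keyed by field name.
    head: dict[str, list[str]] = field(default_factory=dict)
    # One dict per ambiguous value (OR-fields); the list as a whole
    # plays the role of the AND-group.
    or_groups: list[dict[str, str]] = field(default_factory=list)
    # Values that no field accepted.
    unmatched: list[str] = field(default_factory=list)


def is_date(value: str) -> bool:
    return re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value) is not None


# Hypothetical field set: each field decides whether it supports a value.
FIELD_MATCHERS = {
    "bday": is_date,
    "regdate": is_date,
    "gender": lambda v: v in {"male", "female"},
    "status": lambda v: v in {"active", "inactive"},
}


def build_condition(values: list[str], matchers=FIELD_MATCHERS) -> Condition:
    condition = Condition()
    for value in values:
        names = [name for name, accepts in matchers.items() if accepts(value)]
        if len(names) == 1:
            condition.head.setdefault(names[0], []).append(value)
        elif len(names) > 1:
            # Ambiguous: add the value to every supporting field.
            condition.or_groups.append({name: value for name in names})
        else:
            condition.unmatched.append(value)
    return condition
```

For the input 2014-06-25 ; male ; active this keeps gender and status in the head-group and puts bday and regdate together in a single OR-fields group.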

sstok commented 7 years ago

#147 introduced a new string lexer, which is the first step toward making this special input processing possible!

But I don't plan to limit it to strings only; it will be possible to use any input format, including XML and JSON. Once the major work for v2.0-alpha1 is done, the work can start.

dkarlovi commented 7 years ago

@sstok I think this is a proper place to put this: if a user is trying to search anything, how can that be done?

For example, let's say I have a bunch of articles. I'd like to find stuff about Paris so I can do

title: ~*paris; body: ~*paris; etc: ~*paris; date: > 2010-01-01

What I'd actually want is to say

paris; date: > 2010-01-01

What would be a good way to do this? It would basically be an unbound query which could/should be bound like above. Would that even be possible?

Edit: Elasticsearch supports that concept with the default_field property.
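As a rough illustration of the default-field idea, the Python sketch below binds every unbound term to a list of default fields before the query reaches the normal processor. The function name and the use of the *( ... ) OR-fields notation from earlier in this issue are assumptions, and the splitting is naive: quoted values containing ';' or ':' are not handled.

```python
from __future__ import annotations


def expand_unbound_terms(query: str, default_fields: list[str]) -> str:
    """Rewrite terms without a 'field:' prefix as an OR over the default fields."""
    parts: list[str] = []
    for term in (t.strip() for t in query.split(";")):
        if not term:
            continue
        if ":" in term:
            # Already bound to a field; keep as-is.
            parts.append(term)
        else:
            bound = " ".join(f"{name}: ~*{term};" for name in default_fields)
            parts.append(f"*({bound})")
    return "; ".join(parts)
```

With default fields title and body, the input paris; date: > 2010-01-01 becomes *(title: ~*paris; body: ~*paris;); date: > 2010-01-01.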

dkarlovi commented 6 years ago

@sstok what do you make of the previous comment? Could we have something like that?

sstok commented 6 years ago

I am down with the flu 😞 I will get back to this once I am feeling better.

dkarlovi commented 6 years ago

Hopefully you're better. Don't worry, the work will be here waiting for you. 😆

sstok commented 4 years ago

Closing this one in favor of #230. The original idea is still a good one, but it requires a lot of complexity to get working. And other than having a default field, I don't believe automatically mapping values to various fields will give you the expected result.