atk4 / ui

Robust and easy to use PHP Framework for Web Apps
https://atk4-ui.readthedocs.io
MIT License
443 stars 106 forks source link

GRID quick search support for multiple words #1911

Open mkrecek234 opened 2 years ago

mkrecek234 commented 2 years ago

You want to be able to search for things like john AcMe Corp. in a crud's quick search box.

if quicksearch is set-up to cover fields first name, last name and company name, then it should return the correct customer. So we need a word-based aggregation of AND conditions, so conceptually

WHERE (LASTNAME LIKE %john% OR FIRSTNAME LIKE %john% OR COMPANYNAME LIKE %JOHN%) AND (LASTNAME LIKE %AcMe% OR FIRSTNAME LIKE %AcMe% OR COMPANYNAME LIKE %AcMe%) AND ...

If it would be really nicely made, we could concatenate search string, so searching for: john "AcMe Corp."

would search for (LASTNAME LIKE %john% OR FIRSTNAME LIKE %john% OR COMPANYNAME LIKE %JOHN%) AND (LASTNAME LIKE % AcMe Corp.% OR FIRSTNAME LIKE %AcMe Corp.% OR COMPANYNAME LIKE %AcMe Corp.%) AND ...

Not complex, but powerful.

mvorisek commented 2 years ago

Currently regex (regexp in mysql) can be used, is there anything againt using it or is some UI filter missing an integration?

mkrecek234 commented 2 years ago

If you search for "John AcMe" today, it looks for (LASTNAME LIKE %John AcMe% OR FIRSTNAME LIKE %John AcMe% OR COMPANYNAME LIKE %John AcMe%)

and won't find any record becaus John is a first name and Acme is a company name. Each word should be in at least one of the search fields to have a match.

mvorisek commented 2 years ago

šŸ‘ - it is about "crud quick search" where there is no option to select the matching mode

mvorisek commented 2 years ago

in some components the search is done on title field only, in others - in atk4/ui - these two:

the search is done using the whole input and on every column

  1. the search in all components should be unified and not done only on a title field
  2. with a lot of data, there will much more results - we should also implement some sorting - match of two consequent words is stronger than match of two words far apart, in two separate columns, word matched only in a middle (Jack vs. Ack company when searching by Ack)
  3. performance must be ok even with millions of records
  4. maybe implement the logic in atk4/data
  5. matches in UI should be highlighted
  6. globalized data must be accepted (like 1,000 number stored as 1000)
mkrecek234 commented 2 years ago

I think the Grid/Crud logic is the important first one; it is configurable today already, so addQuickSearch can search in defined columns only.

For lookup, indeed this can also be handy to decide which columns are searched in - and also to have multiple word-based results.

At least in my use cases it is never slow. For lookup you can set the minimum letters before searching is started and the maximum limit of results easily. For grid it is started anyhow only on pressing enter, and the results are limited by paginator.

mkrecek234 commented 1 year ago

@mvorisek Fuzzy search with an index how relevant the search result is to have a better sort order, is very nice for sure. However, search engines typically weigh quoted words like "acme food" as higher and a correct match as more relevant than if you had not put it in quotes, so: "acme food" germany logically higher, acme food germany.

Up for proposals I'd say, how we can code it easily

mvorisek commented 1 year ago

some scoring theory relevant links:

https://fusejs.io/concepts/scoring-theory.html

https://github.com/Genivia/FuzzyMatcher

https://www.algolia.com/doc/guides/managing-results/optimize-search-results/typo-tolerance/

https://docs.meilisearch.com/learn/configuration/typo_tolerance.html#configuring-typo-tolerance

(and others using trigram, but we cannot use it on not-preindexed data)

mkrecek234 commented 1 year ago

Arenā€˜t the full text pattern search function of any database far more effective than anything we can build here? The full text MATCH function in Sql might be an approach. I would actually prefer to cover only multiple word search by quotation plus negation of words/quoted search term. That seems more than enough - another approach would IMHO need a lot of work and time for such a small feature.

Another proposal: We should move the multi-word logic that builds the result scope into model class: it is not only needed for grid.

mvorisek commented 1 year ago

Arenā€˜t the full text pattern search function of any database far more effective than anything we can build here? The full text MATCH function in Sql might be an approach. I would actually prefer to cover only multiple word search by quotation plus negation of words/quoted search term. That seems more than enough - another approach would IMHO need a lot of work and time for such a small feature.

in theory yes, in reality not, as not all DB vendors support it, nor support it on all column types, nor with all collations

Another proposal: We should move the multi-word logic that builds the result scope into model class: it is not only needed for grid.

mentioned already, yes, into atk4/data preferably