isoos / query

Search query parser library in Dart
BSD 3-Clause "New" or "Revised" License

Implement phrase search and nested grouping #3

Closed edyu closed 4 years ago

edyu commented 4 years ago

I mainly did 3 changes:

  1. Bump the petitparser version to 3. (No code changes were needed.)
  2. Implement phrase search that, instead of treating everything inside quotes as a single string, breaks it into a list of TextQuery.
  3. Implement nested grouping using parentheses.
isoos commented 4 years ago

> Implement phrase search that, instead of treating everything inside quotes as a single string, breaks it into a list of TextQuery.

What's the rationale behind that? "some word phrase" usually means that I want results that contain some word phrase verbatim in their body, exactly like it was specified in quotes. It makes more sense to me to keep the queried value in one String.

> Implement nested grouping using parentheses.

This was already there: https://github.com/isoos/query/blob/master/lib/src/grammar.dart#L68

What changed?

edyu commented 4 years ago

The current grouping doesn't work at the top level. Instead, the parser treats '(' or ')' as part of the string. In fact, if '(' is used at the top level, the parser breaks.
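To make the intended behaviour concrete, here is a minimal sketch of grouping that works at any depth, including the top level. It is written in Python for illustration only and is not the library's petitparser grammar; the names `TOKEN_RE`, `tokenize`, and `parse` are hypothetical, and the nested-list output is an assumption rather than the library's actual query objects.

```python
import re

# Hypothetical token pattern: each parenthesis is its own token; anything else
# is a run of characters that are neither whitespace nor parentheses.
TOKEN_RE = re.compile(r"[()]|[^\s()]+")


def tokenize(query):
    """Split a query string into parenthesis tokens and word tokens."""
    return TOKEN_RE.findall(query)


def parse(query):
    """Parse a query into nested lists; each parenthesized group becomes a sublist."""
    tokens = tokenize(query)
    pos = 0

    def parse_sequence(depth):
        nonlocal pos
        items = []
        while pos < len(tokens):
            tok = tokens[pos]
            pos += 1
            if tok == "(":
                # Recurse: the group may appear at the top level or nested.
                items.append(parse_sequence(depth + 1))
            elif tok == ")":
                if depth == 0:
                    raise ValueError("unbalanced ')'")
                return items
            else:
                items.append(tok)
        if depth > 0:
            raise ValueError("missing ')'")
        return items

    return parse_sequence(0)
```

In this sketch, `parse("(a b) c")` yields a list whose first element is the group `["a", "b"]`, so the parentheses are never folded into a term.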

edyu commented 4 years ago

I forgot to mention that I also added ' | ' as a synonym for ' OR '.
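A rough sketch of what treating ' | ' as a synonym for ' OR ' could look like, in Python for illustration only. The function `split_alternatives` is hypothetical and handles only flat queries; it assumes, as the comment above suggests, that the operator is surrounded by spaces.

```python
def split_alternatives(query):
    """Split a flat query into OR-alternatives, accepting ' | ' or ' OR '."""
    alternatives = [[]]
    for token in query.split():
        if token in ("|", "OR"):
            # Both spellings start a new alternative.
            alternatives.append([])
        else:
            alternatives[-1].append(token)
    # Assumption: empty alternatives (e.g. from "a | | b") are silently dropped.
    return [alt for alt in alternatives if alt]
```

With this sketch, `"a | b"` and `"a OR b"` produce the same result, which is the property the copied OR tests would need to demonstrate.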

edyu commented 4 years ago

For "some word phrase", this allows proximity queries such as "some * phrase". In addition, "some word phrase" usually means "some" followed by "word" and then by "phrase". This eases most search engine implementations. Spaces are treated specially in all query parsing. For example, ( hello world ) should be treated the same as (hello world), so I'd rather that also be true for "some phrase" and " some phrase ". For me, a phrase is not really an exact match but rather an exact order.
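The "exact order" reading can be sketched as follows, in Python for illustration only. Both function names are hypothetical, and the sketch assumes that `*` stands for exactly one arbitrary word; the thread does not specify the wildcard's semantics.

```python
def phrase_terms(phrase):
    """Break a quoted phrase into terms; leading, trailing, and repeated spaces are ignored."""
    return phrase.split()


def matches_in_order(terms, document_words):
    """Return True if the terms occur consecutively in document_words.

    Assumption: a '*' term matches any single word.
    """
    n = len(terms)
    if n == 0:
        return True
    for start in range(len(document_words) - n + 1):
        window = document_words[start:start + n]
        if all(t == "*" or t == w for t, w in zip(terms, window)):
            return True
    return False
```

Under these assumptions, `" some phrase "` and `"some phrase"` yield the same term list, and `"some * phrase"` matches a document containing "some word phrase" while the plain phrase `"some phrase"` does not.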

isoos commented 4 years ago

@edyu: Would it be possible to split this into two (or three) PRs? I'd like to see the changes required for the grouping separately from the phrase parsing, and you could couple the version upgrade as you wish. (Maybe remove the phrase from this PR and create a new one for only that?)

On the grouping: I think it is critical that, if a query is parsed incorrectly by the current codebase, we add a test that fails with the current code and passes with the fixed code. Could you please add that to the tests? Also, if | is added, would it be possible to copy-paste some of the tests with OR and demonstrate that it is working?

I'm not yet sure how exact phrases should be parsed. Do you have specific code that is handled better with this kind of parsing? Any examples of how other libraries do it?

edyu commented 4 years ago

I'm breaking everything into 4 commits to make it easier for you to see the changes.

edyu commented 4 years ago

I intentionally bumped the version number so you don't forget to do so. :) You can change it however you'd like.

isoos commented 4 years ago

Thank you! I'll publish a new version soon.