pisa-engine / pisa

PISA: Performant Indexes and Search for Academia
https://pisa-engine.github.io/pisa/book
Apache License 2.0

Query refactoring #561

Closed elshize closed 10 months ago

elshize commented 11 months ago

Weights are now stored together with term IDs and resolved at construction time according to one of several policies. Our tools use the default policy, which removes duplicate terms and sets each term's weight to the number of times it occurs in the query. Other policies are, for the time being, only available programmatically via the library API.
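To illustrate the default policy described above, here is a minimal sketch. It assumes a query arrives as a plain list of term IDs; `TermId`, `WeightedTerm`, and `resolve_weights` are hypothetical names chosen for this example, not PISA's actual API, and keeping terms in first-occurrence order is also an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical types for illustration only; not PISA's actual API.
using TermId = std::uint32_t;

struct WeightedTerm {
    TermId id;
    float weight;
};

// Collapses duplicate term IDs and sets each weight to the number of
// occurrences of that term. Terms keep the order of their first
// occurrence (an assumption made for this sketch).
std::vector<WeightedTerm> resolve_weights(std::vector<TermId> const& term_ids) {
    std::vector<WeightedTerm> resolved;
    // Maps a term ID to its index in `resolved`.
    std::unordered_map<TermId, std::size_t> position;
    for (TermId id : term_ids) {
        auto it = position.find(id);
        if (it == position.end()) {
            position.emplace(id, resolved.size());
            resolved.push_back({id, 1.0F});
        } else {
            resolved[it->second].weight += 1.0F;
        }
    }
    return resolved;
}
```

With this sketch, a query with term IDs `7, 3, 7, 7` would resolve to two entries: term `7` with weight 3 and term `3` with weight 1.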

Some legacy code used to parse and process queries has been removed in favor of the text analyzer and the new query parser.

Because weights are resolved when a query object is created, I also refactored cursor creation: the weight is now simply taken from the query.
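The idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `Query`, `WeightedTerm`, `Cursor`, and `make_cursors` are hypothetical stand-ins rather than PISA's real types or functions, and a real cursor would also wrap a posting list and a scorer.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; not PISA's real types.
struct WeightedTerm {
    std::uint32_t id;
    float weight;
};

// A query whose term weights were already resolved at construction.
struct Query {
    std::vector<WeightedTerm> terms;
};

// A drastically simplified cursor: it only records which term it
// traverses and the query weight it should apply when scoring.
struct Cursor {
    std::uint32_t term_id;
    float query_weight;
};

// Builds one cursor per query term. No weight computation happens
// here: each weight is copied straight from the query.
std::vector<Cursor> make_cursors(Query const& query) {
    std::vector<Cursor> cursors;
    cursors.reserve(query.terms.size());
    for (auto const& term : query.terms) {
        cursors.push_back({term.id, term.weight});
    }
    return cursors;
}
```

The point of the design is that cursor construction no longer needs to know which weighting policy was used; it only reads the result.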

Fixes #501

codecov[bot] commented 11 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (994d101) 93.21% compared to head (30ee2a7) 93.23%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #561      +/-   ##
==========================================
+ Coverage   93.21%   93.23%   +0.02%
==========================================
  Files          91       90       -1
  Lines        4483     4452      -31
==========================================
- Hits         4179     4151      -28
+ Misses        304      301       -3
```


elshize commented 10 months ago

Right, I don't think thresholds should be any different than queries, but I'll give it another look.

Good idea about regression tests. There's a test docker image that I created for that, but haven't finished. Maybe it's a good idea to continue with it to make it easier to repeat in the future (or even automate).

elshize commented 10 months ago

I need to fix the conflicts. After that, I'll run the Docker image that was just merged in the other PR to evaluate this branch and check for regressions.

elshize commented 10 months ago

@JMMackenzie The regression test was successful. Are you ok merging it?

JMMackenzie commented 10 months ago

Great, let's merge!