brightway-lca / brightway2-data

Tools for the management of inventory databases and impact assessment methods. Part of the Brightway LCA framework.
https://docs.brightway.dev/
BSD 3-Clause "New" or "Revised" License

feat: sqlite full text search #177

Closed will7200 closed 1 month ago

will7200 commented 1 month ago

Closes #103

This aims to be a drop-in replacement for Whoosh, with little to no API change for our customers.

  1. Uses FTS4, because FTS5 requires some escaping of the search terms.
  2. The underlying search implementation strips certain characters during tokenization, so characters that might be of value to the customer are lost. To compensate, the mask and filter operations now also accept a callable, which lets users control their search more precisely. The supported operations are the sqlite3 column operators, wrapped in Pythonic operations. Examples:
    db.search("lollipop", mask={"product": lambda col: col == 'ZEBRA-2'})
    db.search("lollipop", mask={"product": lambda col: col.like('%ZEBRA-2%')})
    db.search("lollipop", mask={"product": lambda col: col.in_(['ZEBRA-2', 'ZEBRA-1'])})
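For readers unfamiliar with the tokenization issue in point 2, here is a minimal sketch using only the standard-library `sqlite3` module, not the bw2data implementation. It assumes the local SQLite build has FTS4 enabled. The table `docs`, the helper `search`, and the `product_filter` argument are hypothetical names chosen for illustration. It shows a full-text `MATCH` combined with an exact column comparison, which keeps characters such as `-` that the tokenizer would otherwise discard.

```python
import sqlite3

# In-memory database with an FTS4 virtual table (assumes FTS4 is compiled in).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts4(name, product)")
conn.executemany(
    "INSERT INTO docs (name, product) VALUES (?, ?)",
    [
        ("lollipop", "ZEBRA-1"),
        ("lollipop", "ZEBRA-2"),
        ("lollipop", "ZEBRA 2 deluxe"),
    ],
)


def search(term, product_filter=None):
    """Full-text search, optionally narrowed by an exact SQL condition.

    `product_filter` is a hypothetical (sql_clause, params) pair, e.g.
    ("product = ?", ["ZEBRA-2"]). The clause is plain SQL applied to the
    stored column value, so punctuation is compared literally.
    """
    sql = "SELECT name, product FROM docs WHERE docs MATCH ?"
    params = [term]
    if product_filter is not None:
        clause, extra = product_filter
        sql += f" AND {clause}"
        params.extend(extra)
    return conn.execute(sql, params).fetchall()


# The default tokenizer splits "ZEBRA-2" into the tokens "zebra" and "2",
# so a full-text query alone cannot tell "ZEBRA-2" from "ZEBRA 2 deluxe".
fuzzy = search("zebra 2")

# An exact column comparison restores the distinction.
exact = search("lollipop", ("product = ?", ["ZEBRA-2"]))
```

In this sketch `fuzzy` contains both the `ZEBRA-2` and the `ZEBRA 2 deluxe` rows, while `exact` contains only the `ZEBRA-2` row. The callables in the examples above play the same role as `product_filter` here.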
codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 90.00000% with 16 lines in your changes missing coverage. Please review.

Project coverage is 83.24%. Comparing base (802dafe) to head (0c61d18). Report is 14 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| bw2data/updates.py | 22.22% | 7 Missing |
| bw2data/search/indices.py | 95.95% | 4 Missing |
| bw2data/search/schema.py | 87.87% | 4 Missing |
| bw2data/search/search.py | 94.44% | 1 Missing |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #177      +/-   ##
==========================================
+ Coverage   83.07%   83.24%   +0.17%
==========================================
  Files          39       39
  Lines        3609     3730     +121
==========================================
+ Hits         2998     3105     +107
- Misses        611      625      +14
```


cmutel commented 1 month ago

@will7200 Awesome! One small comment which is an easy fix, and then fix some test failures, and we should be GTG!

will7200 commented 1 month ago

@cmutel Fixed that, and found the underlying cause of those test failures. The last two remaining Windows failures are due to:

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpgwrprhc7\\dxbqkfxeszgknljeyz.46382e12\\lci\\databases.db'

This looks like it has also happened in past GitHub runs.