whoosh-community / whoosh

Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python.
Other
245 stars 37 forks source link

Fix 472: Include dynamic fields in optimization #537

Closed stevennic closed 5 years ago

stevennic commented 5 years ago

Fix #472. When optimizing, the OPTIMIZE function adds a SegmentReader to the writer (writing.py:115) that it uses to build the new, optimized segment. At writing.py:688, write_per_doc collects all matching fieldnames from the schema, but neglects to also consider matching stored dynamic fields in the documents it reads. The modification appends all matching stored dynamic fields to the set for this loop so they can be included in the reader.

I'm not completely sure there isn't a better place for this logic, especially given the warning at the beginning of write_per_doc indicating the whole thing might not be implemented in the best way, but debugging led me here and all the tests pass. I recommend additional system testing on existing code that utilizes dynamic fields too.

codecov[bot] commented 5 years ago

Codecov Report

Merging #537 into master will increase coverage by <.01%. The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #537      +/-   ##
==========================================
+ Coverage      83%   83.01%   +<.01%     
==========================================
  Files         132      132              
  Lines       29610    29621      +11     
==========================================
+ Hits        24578    24589      +11     
  Misses       5032     5032
Impacted Files Coverage Δ
src/whoosh/writing.py 86.71% <100%> (ø) :arrow_up:
tests/test_indexing.py 100% <100%> (ø) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 94d6da2...bef1531. Read the comment docs.

fortable1999 commented 5 years ago

Thanks! I'd like merge it.