Closed stevennic closed 5 years ago
Merging #543 into master will increase coverage by
0.01%
. The diff coverage is100%
.
@@ Coverage Diff @@
## master #543 +/- ##
==========================================
+ Coverage 83.01% 83.02% +0.01%
==========================================
Files 132 132
Lines 29621 29641 +20
==========================================
+ Hits 24589 24610 +21
+ Misses 5032 5031 -1
Impacted Files | Coverage Δ | |
---|---|---|
src/whoosh/codec/whoosh3.py | 95.09% <100%> (+0.01%) |
:arrow_up: |
tests/test_writing.py | 99.36% <100%> (+0.02%) |
:arrow_up: |
src/whoosh/writing.py | 86.8% <100%> (+0.09%) |
:arrow_up: |
src/whoosh/fields.py | 81.62% <0%> (+0.14%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update f846e4a...e31f17e. Read the comment docs.
thanks for your code!
Fix for #375 : "Called start_doc when already in a doc"
I encountered this problem when adding to Whoosh from inside a pandas apply() function. Pandas does a first pass that absorbs exceptions.
This has also been observed by others, such as: https://github.com/django-haystack/django-haystack/issues/952
Reproduction:
Expected Result: An appropriate exception for the problem with document 2:
Exception: 2 is not unicode or sequence
Actual Result:Exception: Called start_doc when already in a doc
When the first document fails, the exception is not allowed to bubble up by the repro code. However, this leaves the PerDocumentWriter in an unfinished state (
_indoc==True
). When the second document add is attempted, the writer finds itself in an invalid state and throws theCalled start_doc when already in a doc
exception, which is cryptic and meaningless to the user.The fix ensures that the PerDocumentWriter is properly cleaned up in the event of a failure by implementing its own try/except block. This allows the correct exception to bubble up for the second problem.