Closed ppodolsky closed 3 years ago
Failing at https://github.com/tantivy-search/tantivy/blob/main/src/store/reader.rs#L104 on checkpoint (doc=[14958..16689), bytes=[3471326..3478611)), doc_id 15086.
Thanks, I'll investigate on Monday.
On Sat, 9 Jan 2021 at 03:23, Pasha Podolsky notifications@github.com wrote:
Checkpoint with broken VInt (doc=[14958..16689), bytes=[3471326..3478611)), docID 15086
(doc=[14892..14909), bytes=[3453499..3456556))
(doc=[14909..14926), bytes=[3456556..3460848))
(doc=[14926..14942), bytes=[3460848..3464857))
(doc=[14942..14958), bytes=[3464857..3471326))
(doc=[14958..16689), bytes=[3471326..3478611)) <---
(doc=[16689..16724), bytes=[3478611..3486278))
(doc=[16724..16753), bytes=[3486278..3493484))
(doc=[16753..16787), bytes=[3493484..3500905))
(doc=[15087..15131), bytes=[3500905..3508456)) <---
(doc=[15131..15165), bytes=[3508456..3516084))
(doc=[15165..15196), bytes=[3516084..3523442))
(doc=[15196..15228), bytes=[3523442..3530761))
(doc=[15228..15256), bytes=[3530761..3538043))
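The anomaly in the listing above can be checked mechanically. What follows is a minimal Rust sketch, not tantivy's code: the `Checkpoint` struct and the helpers `thread_checkpoints`, `checkpoint_for` and `first_discontinuity` are hypothetical stand-ins modeled on the printed output. It looks up which checkpoint claims a given doc id, and finds the first checkpoint that does not start where the previous one ended.

```rust
use std::ops::Range;

// Hypothetical stand-in for a doc store checkpoint, modeled on the debug
// output in this thread; it is not tantivy's actual type.
#[derive(Debug, Clone, PartialEq)]
struct Checkpoint {
    docs: Range<u32>,
    bytes: Range<u64>,
}

// The checkpoints quoted in this thread, in the order they were printed.
fn thread_checkpoints() -> Vec<Checkpoint> {
    let rows: [(u32, u32, u64, u64); 13] = [
        (14892, 14909, 3453499, 3456556),
        (14909, 14926, 3456556, 3460848),
        (14926, 14942, 3460848, 3464857),
        (14942, 14958, 3464857, 3471326),
        (14958, 16689, 3471326, 3478611),
        (16689, 16724, 3478611, 3486278),
        (16724, 16753, 3486278, 3493484),
        (16753, 16787, 3493484, 3500905),
        (15087, 15131, 3500905, 3508456),
        (15131, 15165, 3508456, 3516084),
        (15165, 15196, 3516084, 3523442),
        (15196, 15228, 3523442, 3530761),
        (15228, 15256, 3530761, 3538043),
    ];
    rows.iter()
        .map(|&(d0, d1, b0, b1)| Checkpoint { docs: d0..d1, bytes: b0..b1 })
        .collect()
}

// Index of the first checkpoint whose doc range contains `doc`.
// A linear scan keeps the sketch simple; a real reader would likely seek.
fn checkpoint_for(checkpoints: &[Checkpoint], doc: u32) -> Option<usize> {
    checkpoints.iter().position(|c| c.docs.contains(&doc))
}

// Index of the first checkpoint that does not start exactly where its
// predecessor ended, in either doc ids or byte offsets.
fn first_discontinuity(checkpoints: &[Checkpoint]) -> Option<usize> {
    checkpoints
        .windows(2)
        .position(|w| {
            w[0].docs.end != w[1].docs.start || w[0].bytes.end != w[1].bytes.start
        })
        .map(|i| i + 1)
}
```

Called from a `main` or a test on these rows, `checkpoint_for(&cps, 15086)` resolves to index 4 (the first flagged row), and `first_discontinuity(&cps)` reports index 8 (the second flagged row), where the doc range restarts at 15087 although the previous checkpoint ended at 16787. The byte ranges stay contiguous throughout; only the doc ranges break.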
The bug looks very similar.
Did you enable logging (warn level should be sufficient), and did you see a lot of merge failures before that?
I'd like to know whether the assert at block.rs:l.47 triggered several times before you encountered your problem.
I will check it today (around 10-12 UTC) once I get to my laptop.
On 9 Jan 2021, at 03:14, Paul Masurel notifications@github.com wrote:
Did you enable logging and did you see a lot of merge fail before that?
What I can say now is that this segment definitely came from merging. It is too large to have come from a single write interval.
@ppodolsky just to be sure, this is a brand new index.. meaning it did not contain segment that would have been corrupted previously?
Yep. I rebuilt the whole index after applying the latest commits from your main branch. I will recheck everything today and start writing with logging enabled if required. It looks like I will be able to reproduce the issue quickly.
Can you run your program with the following rev? acfb057462422db52f7800e954a5df2fceaf735a
It checks the doc store skip index while it is being written. If there is a problem, it detects it and returns an error. tantivy then abruptly quits the process and logs the segments that were being merged.
The segment files are not removed, so if you send them to me I should be able to look into the issue (I think the .store files are sufficient).
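The write-time check described above could look roughly like the following sketch. It illustrates the idea only and is not the code at that rev: `CheckedSkipIndex`, its `push` method and the `Checkpoint` struct are hypothetical names, and the continuity rule is an assumption based on the checkpoint listing earlier in this thread.

```rust
use std::ops::Range;

// Hypothetical stand-in for a doc store checkpoint; not tantivy's type.
#[derive(Debug, Clone)]
struct Checkpoint {
    docs: Range<u32>,
    bytes: Range<u64>,
}

// Hypothetical skip index that validates every checkpoint as it is appended.
#[derive(Default)]
struct CheckedSkipIndex {
    checkpoints: Vec<Checkpoint>,
}

impl CheckedSkipIndex {
    // Appends `next`, or returns an error describing why it is inconsistent.
    // Assumed rule: ranges are not inverted, and each checkpoint starts
    // exactly where the previous one ended (doc ids and byte offsets).
    fn push(&mut self, next: Checkpoint) -> Result<(), String> {
        if next.docs.start > next.docs.end || next.bytes.start > next.bytes.end {
            return Err(format!("inverted range in {:?}", next));
        }
        if let Some(prev) = self.checkpoints.last() {
            if prev.docs.end != next.docs.start || prev.bytes.end != next.bytes.start {
                return Err(format!("{:?} does not follow {:?}", next, prev));
            }
        }
        self.checkpoints.push(next);
        Ok(())
    }
}
```

Failing at write time rather than at query time means the error surfaces while the segments being merged are still known, which is what makes them possible to collect and inspect.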
@ppodolsky
Sure, I will release this rev today. Nothing happened over the last weekend (but the write load was lighter than usual). I will keep observing and collecting logs, and will keep you informed.
Thank you!
Still no luck catching it. I've begun to doubt the sanity of what was there before: probably I or k8s managed to launch a previous version of Tantivy for a moment and it corrupted the segment.
By way of excuse, I'd like to say that during 3 days under rather heavy load there has not been any corruption. I'll keep watching with logging until the end of the week and will then close the issue if I don't find anything. Most likely everything is OK and I raised a false alarm, sorry.
No worries! You have accumulated enough good karma by finding the bug and spending time reporting it not to worry about that :)
I didn't get the corruption again, so it was definitely my mistake. Over two weeks of various load profiles there have been no signs of broken segments. Thank you for being patient :)
Thanks for the update!
Describe the bug
The same load profile as in #969: deletions, additions and merges. Now it happens on querying after several hours of serving. I think the cause is basically the same. At startup and for several hours afterwards all queries were OK, but after generations of merges searcher.doc started to throw a VInt decoding error.

Which version of tantivy are you using?
https://github.com/tantivy-search/tantivy/commit/bf6e6e8a7cc1826212ba2500b08ecb53dfcdeda1

To Reproduce
I sent the broken segment to you on Gitter.
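For context on the error itself, here is a minimal sketch of a variable-length integer reader that fails when the buffer ends mid-value. That is one way a VInt decoding error can arise when a reader looks for a document in a block that does not actually contain it. The function name `read_vint` and the encoding convention (7 payload bits per byte, least-significant group first, high bit set on the final byte) are assumptions made for illustration; tantivy's on-disk format may differ.

```rust
// Decodes one variable-length integer from the front of `data` and advances
// `data` past it. On error, `data` is left unchanged.
// Assumed convention: the high bit marks the final byte of the value.
fn read_vint(data: &mut &[u8]) -> Result<u64, String> {
    let bytes: &[u8] = *data;
    let mut result: u64 = 0;
    let mut shift: u32 = 0;
    for (i, &byte) in bytes.iter().enumerate() {
        if shift >= 64 {
            return Err("VInt is too long for a u64".to_string());
        }
        result |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 != 0 {
            // Final byte reached: consume it and return the value.
            *data = &bytes[i + 1..];
            return Ok(result);
        }
        shift += 7;
    }
    Err("reached end of buffer while reading VInt".to_string())
}
```

Under this convention the bytes `[0x2c, 0x82]` decode to 300, while the truncated input `[0x2c]` returns the end-of-buffer error instead of a value.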