Closed neilisaac closed 4 years ago
Return a specific error (ErrCorruptedSegment) when a corrupted segment file is discovered.
Thank you to Neil Isaac for this. Sorry Neil, I should have put your name into the commit.
@neilisaac did you see that a test failed? I've been too busy to look into it. Would you be interested in having commit access to the project?
Addressed on a separate branch (now merged and working). I'm happy to review other changes on the repo, or do release tagging if you'd like.
To answer your first question, I'm planning to handle recovery in the application by deleting corrupted files. The data in this queue isn't critical to our application after a power loss, which we assume would be the main cause of corruption. I don't have any other error-handling changes in mind, assuming this approach works as expected.
We may want to implement https://github.com/joncrlsn/dque/issues/13 soon to address consumer throughput.
Letting the application handle the corruption is probably the best solution. You wouldn't be happy if a whole segment just disappeared without any notification. When ActiveMQ would get corrupted, we would just delete the entire queue directory and start over. It wasn't all that important to us.
The application would have to delete the file, recreate the in-memory queue, and make sure the old in-memory queue is discarded for good. I think that will work as long as you delete only the head or the tail segment. Deleting a segment in the middle would cause an error on reload, because dque expects the segment numbers to be sequential.
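A rough sketch of the head-or-tail check described above, in case it helps. It assumes segment files are named with a zero-padded number and a `.dque` suffix (for example `0000000000007.dque`); that naming, and the helper names `segmentNumber` and `safeToDelete`, are assumptions for illustration and not part of dque's API.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
	"strings"
)

// segmentNumber parses the numeric part of a segment file name such as
// "0000000000007.dque". The naming scheme is an assumption of this sketch.
func segmentNumber(path string) (int, error) {
	base := filepath.Base(path)
	name := strings.TrimSuffix(base, ".dque")
	if name == base {
		return 0, fmt.Errorf("%q is not a segment file", path)
	}
	return strconv.Atoi(name)
}

// safeToDelete reports whether the corrupted segment is the lowest- or
// highest-numbered segment in the list, so that removing it leaves the
// remaining segment numbers sequential.
func safeToDelete(corrupted string, all []string) (bool, error) {
	target, err := segmentNumber(corrupted)
	if err != nil {
		return false, err
	}
	if len(all) == 0 {
		return false, fmt.Errorf("no segment files listed")
	}
	lowest, highest := 0, 0
	for i, p := range all {
		n, err := segmentNumber(p)
		if err != nil {
			return false, err
		}
		if i == 0 || n < lowest {
			lowest = n
		}
		if i == 0 || n > highest {
			highest = n
		}
	}
	return target == lowest || target == highest, nil
}
```

If the check returns false, the fallback mentioned earlier in the thread (deleting the entire queue directory and starting over) is probably the simpler option.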
In order to handle automated recovery from corrupted segment files, this PR adds an explicit error type including the file path for relevant errors.