index() corruption - GitHub Issues

Hi, I have been trying to index a set of 80 PDF documents (~150 pages each) by submitting batch jobs. Indexing took longer than expected (8 hours), so my session ended abruptly partway through, and I now get a "ValueError: Expected object or value" when I try to load the index with from_index().

I don't see any method to discard the partially indexed document and continue from the last valid index. As it stands, I would need to start over from scratch and spend another 8+ hours. Would it be possible to add some functionality to deal with this situation?
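
To make the request concrete, here is a rough sketch of the kind of resume behaviour I have in mind. It is not byaldi code: `index_one` is a hypothetical callback standing in for whatever per-document indexing call is used, and `done.json` is a made-up progress file. The idea is to record each document only after it has finished, and to write that record atomically (temp file plus rename), so an abrupt session end never leaves a half-written JSON file behind. It only protects the progress file; the index files themselves would need the same treatment inside the library.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def load_done(manifest: Path) -> List[str]:
    """Return the documents already recorded as indexed ([] if none)."""
    try:
        return json.loads(manifest.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        # Missing or unreadable manifest: treat as "nothing done yet".
        return []


def save_done(manifest: Path, done: List[str]) -> None:
    """Write the manifest atomically: temp file in the same dir, then rename."""
    fd, tmp = tempfile.mkstemp(dir=manifest.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(done, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the file in one step, so readers see old or new, never half.
    os.replace(tmp, manifest)


def index_resumably(
    files: Iterable[str],
    index_one: Callable[[str], None],  # hypothetical per-document indexing call
    manifest: Path,
) -> List[str]:
    """Index each file once, skipping those recorded by a previous run."""
    done = load_done(manifest)
    seen = set(done)
    for path in files:
        if path in seen:
            continue
        index_one(path)           # may be interrupted; nothing is recorded yet
        done.append(path)
        seen.add(path)
        save_done(manifest, done)  # checkpoint only after the document finished
    return done
```

With something like this, a rerun after a crash would skip every document that completed and pick up at the one that was interrupted, instead of repeating all 8 hours.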

AnswerDotAI / byaldi

index() corruption #65