Confused about doc_idx: when I split passage_ds and ds, doc_idx is very large

retrieved_docs is returned by the ColBERT engine. doc[0] is the index of the document that the engine retrieved, and it refers to a position in passage_contents. For example, if the index is 0, the retrieved document is the first document in passage_contents. You are probably getting out-of-range indices because you changed the corpus size, len(passage_contents), but did not regenerate the index. Delete the previously generated index files and rerun the script with --run_indexing.
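To catch a stale index early, a small bounds check along these lines might help. This is only a sketch: resolve_passages is a hypothetical helper, not part of FLMR or ColBERT, and the (doc_idx, rank, score) tuple shape in the usage example is an assumption. The only thing taken from the explanation above is that doc[0] indexes into passage_contents.

```python
from typing import List, Sequence, Tuple


def resolve_passages(
    retrieved_docs: Sequence[Sequence],
    passage_contents: Sequence[str],
) -> List[Tuple[int, str]]:
    """Map each retrieved doc to its passage text (hypothetical helper).

    Raises IndexError if any doc[0] falls outside passage_contents, which
    usually means the index was built from a different corpus.
    """
    corpus_size = len(passage_contents)
    resolved = []
    for doc in retrieved_docs:
        doc_idx = int(doc[0])  # doc[0] is the position in passage_contents
        if not 0 <= doc_idx < corpus_size:
            raise IndexError(
                f"doc_idx {doc_idx} is outside the corpus (size {corpus_size}); "
                "the index was probably built from a different passage_contents. "
                "Delete the old index files and rerun with --run_indexing."
            )
        resolved.append((doc_idx, passage_contents[doc_idx]))
    return resolved


# Usage with made-up data; the tuple layout beyond doc[0] is assumed.
passages = ["first passage", "second passage", "third passage"]
hits = [(0, 1, 0.91), (2, 2, 0.47)]
print(resolve_passages(hits, passages))
```

If this raises right after you split passage_ds, the index on disk still describes the old, larger corpus.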

LinWeizheDragon / FLMR

Confused about doc_idx: when I split passage_ds and ds, doc_idx is very large #13