zzk2021 closed this issue 1 month ago
`retrieved_docs` is returned by the ColBERT engine. `doc[0]` is the index of the document that the engine retrieved. The index refers to a position in `passage_contents`. For example, if the doc_id is 0, then the retrieved document is the first document in `passage_contents`.
You are probably getting an out-of-range index because you changed the corpus size, `len(passage_contents)`, but haven't regenerated the index. You should delete the previously generated index files and rerun the script with `--run_indexing`.
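As a rough sketch of how you could check for this before it crashes: the helper name `lookup_passages` and the `(doc_id, rank, score)` tuple layout below are assumptions for illustration, not the repository's actual code. The only thing taken from the explanation above is that `doc[0]` indexes into `passage_contents`.

```python
def lookup_passages(retrieved_docs, passage_contents):
    """Map retrieved entries to passage text and collect out-of-range ids.

    Hypothetical helper: assumes each entry in retrieved_docs is a sequence
    whose first element (doc[0]) is an index into passage_contents.
    """
    corpus_size = len(passage_contents)
    passages, stale_ids = [], []
    for doc in retrieved_docs:
        doc_id = int(doc[0])
        if 0 <= doc_id < corpus_size:
            passages.append(passage_contents[doc_id])
        else:
            # An id outside the corpus suggests the index was built
            # against a different (larger) passage_contents.
            stale_ids.append(doc_id)
    return passages, stale_ids


# Toy data: the corpus has 3 passages, but one retrieved id is 7.
passage_contents = ["first passage", "second passage", "third passage"]
retrieved_docs = [(0, 1, 20.5), (2, 2, 18.1), (7, 3, 11.0)]

passages, stale_ids = lookup_passages(retrieved_docs, passage_contents)
print(passages)
print(stale_ids)
```

If `stale_ids` is non-empty, the index on disk most likely does not match the current corpus and should be rebuilt.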
I looked into the code, but I do not understand doc_idx. Does it come from the query tokenizer?
When I run it, it raises an error.