Closed FDrAe86 closed 3 years ago
PATH_TO_INDRI_INDEX is the path to an Indri index for the target collection. Since the document content is stored when indexing, you can use this to get a copy of the text for use by the models. You shouldn't need to modify extract_docs_from_index.py -- you just need an index built.
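For anyone hitting the same error: the message "the following arguments are required: index_type, index_path" is the standard wording Python's argparse uses when positional arguments are missing, so it most likely means the script was started without the two values after its name. The sketch below is not the real extract_docs_from_index.py; it is a hypothetical parser that only assumes the two argument names taken from that error message.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the script's argument handling.
    # The two names are taken from the error message in this thread;
    # everything else here is an assumption.
    parser = argparse.ArgumentParser(prog="extract_docs_from_index.py")
    parser.add_argument("index_type")  # e.g. "indri"
    parser.add_argument("index_path")  # path to an already-built index
    return parser


def parse(argv):
    # Parse an explicit argument list instead of sys.argv so it is easy to try out.
    return build_parser().parse_args(argv)


# Both positional arguments supplied: parsing succeeds.
args = parse(["indri", "PATH_TO_INDRI_INDEX"])
print(args.index_type, args.index_path)
```

Calling parse([]) instead makes argparse print the "arguments are required" message and exit, which matches what happens when the .py file is run on its own without the index type and index path.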
OK, I'll try it. I appreciate your response. Thank you!!
Has your problem been solved? I have the same problem. Could you please reply to me at your convenience?
About running this instruction:
Indri index
awk '{print $3}' data/robust/*.run | python extract_docs_from_index.py indri PATH_TO_INDRI_INDEX > data/robust/documents.tsv
I ran into a problem. What is PATH_TO_INDRI_INDEX? Should I modify any code in extract_docs_from_index.py? When I run this .py file, the error is:
the following arguments are required: index_type, index_path
Thanks for your great work!!
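As a side note on the first half of the pipeline: awk '{print $3}' prints the third whitespace-separated field of every line in the .run files, and that stream is what gets piped into the script. A rough Python equivalent is sketched below; the sample lines and document IDs are made up for illustration and are not taken from the repository's data.

```python
import io


def third_column(lines):
    """Yield the third whitespace-separated field of each line, like awk '{print $3}'."""
    for line in lines:
        fields = line.split()
        # awk prints an empty string when a line has fewer than three fields.
        yield fields[2] if len(fields) >= 3 else ""


# Hypothetical run-file content, used only to show the behaviour.
sample = io.StringIO("301 Q0 DOC-A 1 12.5 run\n301 Q0 DOC-B 2 11.0 run\n")
doc_ids = list(third_column(sample))
print("\n".join(doc_ids))
```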