Closed huiwudiyi closed 5 years ago
Thanks for pointing this out. And sorry for the late reply.
Yes, you should switch document_token and title_token. We'll revise our code. We did some further experiments after the change. The results are the same.
First, thanks for sharing your code. While reading it, I found a place that I do not understand. In the file 'utile.py', the function load_documents() sets: passage['tokens'] = document_token + ['|'] + title_token
But in the function index_document_entities(), you said that word_ids are off by (title_len + 1) because the document is concatenated after the title, with an extra '|'.
Should I switch the positions of document_token and title_token?
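To make the mismatch concrete, here is a minimal sketch of the two orderings. It is not the repository's code: the helper names (build_passage_tokens, shift_word_id) and the sample tokens are made up for illustration, and it assumes word_ids are 0-based positions within the document tokens.

```python
def build_passage_tokens(title_tokens, document_tokens):
    # Hypothetical helper: title first, then the '|' separator, then the document.
    return title_tokens + ['|'] + document_tokens


def shift_word_id(word_id, title_len):
    # Hypothetical helper: a document-relative position moves right by the
    # title length plus one slot for the '|' separator.
    return word_id + title_len + 1


# Made-up sample data, only for illustration.
title_tokens = ['Barack', 'Obama']
document_tokens = ['He', 'was', 'born', 'in', 'Hawaii']

# Position of an entity token inside the document alone.
word_id = document_tokens.index('Hawaii')
shifted = shift_word_id(word_id, len(title_tokens))

# Title-first ordering: the shifted index lands on the intended token.
title_first = build_passage_tokens(title_tokens, document_tokens)
assert title_first[shifted] == 'Hawaii'

# Document-first ordering (as in the line quoted from load_documents()):
# the same shifted index now points at a title token instead.
document_first = document_tokens + ['|'] + title_tokens
assert document_first[shifted] != 'Hawaii'
```

With the document placed first, the (title_len + 1) shift would only be correct if it were dropped entirely, which is why the two functions seem to disagree.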