BERTserini
https://github.com/castorini/bertserini
Apache License 2.0

Use pyserini's new prebuilt index features #10

Closed by qguo96 3 years ago

qguo96 commented 3 years ago

I tested this with the development installation of pyserini, since pyserini's current PyPI version doesn't support prebuilt indexes. This change takes advantage of pyserini's new prebuilt index feature, but we need to merge this PR (https://github.com/castorini/pyserini/pull/235) first. After that, I can add information about lucene-index.enwiki-20180701-paragraphs.tar.gz and https://www.dropbox.com/s/6zn16mombt0wirs/lucene-index.zhwiki-20181201-paragraphs.tar.gz?dl=0 to pyserini.

MXueguang commented 3 years ago

@qguo96 Can you update the corresponding documentation too, i.e. the Simple QA example and the Chinese example in README.md, so that they use the prebuilt index feature?

lintool commented 3 years ago

@MXueguang Can you confirm that we are able to replicate the EM scores? Let's change the documentation after the Pyserini PyPI update is published.

@qguo96 Can you please send a PR in Pyserini that adds these two indexes?

qguo96 commented 3 years ago

@MXueguang Sure, it's better to change the docs after a stable PyPI release. @lintool Got it. I will add the information in this PR (castorini/pyserini#235).

qguo96 commented 3 years ago

I added information about enwiki and zhwiki in this PR (castorini/pyserini#235).