BERTserini
https://github.com/castorini/bertserini
Apache License 2.0

Wikipedia index not available for download (wget command failed) #17

Closed: MorenoLaQuatra closed this issue 2 years ago

MorenoLaQuatra commented 2 years ago

Hi,

First of all, thank you for your research efforts and for this repo. Could you fix the index download? The wget command provided in the README is not working; the connection to the server times out. Here is the output of the command:

--2021-12-27 18:54:22--  http://72.143.107.253/BERTserini/english_wiki_2018_index.zip
Connecting to 72.143.107.253:80... failed: Connection timed out.
Retrying.

--2021-12-27 18:56:32--  (try: 2)  http://72.143.107.253/BERTserini/english_wiki_2018_index.zip
Connecting to 72.143.107.253:80... failed: Connection timed out.
Retrying.

--2021-12-27 18:58:44--  (try: 3)  http://72.143.107.253/BERTserini/english_wiki_2018_index.zip
Connecting to 72.143.107.253:80... failed: Connection timed out.
Retrying.

amyxie361 commented 2 years ago

The old index is deprecated. Please use the new address: https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/pyserini-indexes/lucene-index.enwiki-20180701-paragraphs.tar.gz
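
For anyone who would rather script the download than run wget by hand, here is a minimal Python sketch using only the standard library. The URL is the one given above; the `indexes` target directory and the helper names (`archive_name`, `download`, `extract`) are my own assumptions, not part of BERTserini. The thread does not say what directory the archive unpacks to, so check the output folder after extraction.

```python
import tarfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# URL copied from the reply above.
INDEX_URL = (
    "https://rgw.cs.uwaterloo.ca/JIMMYLIN-bucket0/pyserini-indexes/"
    "lucene-index.enwiki-20180701-paragraphs.tar.gz"
)


def archive_name(url: str) -> str:
    """Return the file-name component of a download URL."""
    return Path(urlparse(url).path).name


def download(url: str, dest_dir: str = "indexes") -> Path:
    """Download `url` into `dest_dir` (hypothetical directory name).

    The file is first written to a ".part" file and renamed once complete,
    so an interrupted download is not mistaken for a finished one.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / archive_name(url)
    if not target.exists():
        partial = target.with_name(target.name + ".part")
        urllib.request.urlretrieve(url, partial)
        partial.replace(target)
    return target


def extract(archive: Path, dest_dir: str = "indexes") -> None:
    """Unpack a .tar.gz archive into `dest_dir`."""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest_dir)


# Usage (performs a large network download, so it is left commented out):
# archive = download(INDEX_URL)
# extract(archive)
```

The archive is large, so make sure there is enough free disk space for both the `.tar.gz` file and its extracted contents before running the two usage lines.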