Open kuhnen opened 5 years ago
Thanks for noticing. The Wikimedia dumps are constantly updated to reflect the newest version of Wikipedia. If you could make a PR updating the links, that would be great. It could point here, for example: https://dumps.wikimedia.org/enwiki/20190901/ (although this will eventually go out of date as well).
I just noticed another error. When downloading the files, both the file name and the origin passed to get_file are wrong. It should be:
data_paths.append(get_file(path, dump_url + file))
instead of:
data_paths.append(get_file(file, dump_url))
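To make the difference concrete, here is a minimal sketch of how the two arguments could be built for each file. It assumes that `get_file` is `keras.utils.get_file`, whose first two parameters are `fname` and `origin`. The helper `download_args`, the cache directory, and the file name are hypothetical and only for illustration. The sketch builds the arguments without downloading anything.

```python
import os


def download_args(dump_url, files, cache_dir):
    """Return one (fname, origin) pair per dump file.

    fname is the local path the file is saved under; origin is the
    full URL of that file, not just the URL of the dump directory.
    """
    # Make sure the base URL ends with a slash before appending file names.
    base = dump_url if dump_url.endswith("/") else dump_url + "/"
    return [(os.path.join(cache_dir, name), base + name) for name in files]


if __name__ == "__main__":
    # Placeholder values; substitute the real dump listing and cache directory.
    dump_url = "https://dumps.wikimedia.org/enwiki/20190901/"
    files = ["example-part1.xml.bz2"]

    for path, origin in download_args(dump_url, files, "wiki_cache"):
        print(path, "<-", origin)
        # With Keras installed, the download itself would then be:
        # from keras.utils import get_file
        # data_paths.append(get_file(path, origin))
```

Passing only `dump_url` as the origin would request the directory page rather than the individual file, which seems to be the problem described above.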
The Wiki also has the wrong link to the file https://github.com/WillKoehrsen/wikipedia-data-science/blob/master/notebooks/Downloading%20and%20Parsing%20Wikipedia%20Articles.ipynb. The file is on master; only the link on the Wiki that redirects to it is not working.