WillKoehrsen / wikipedia-data-science

Working with and analyzing Wikipedia Data

Wrong link on Wiki to the notebook to download the data #5

Open kuhnen opened 5 years ago

kuhnen commented 5 years ago

The Wiki has the wrong link to the file https://github.com/WillKoehrsen/wikipedia-data-science/blob/master/notebooks/Downloading%20and%20Parsing%20Wikipedia%20Articles.ipynb. The file itself is on master; only the Wiki link that should redirect to it is not working.

WillKoehrsen commented 5 years ago

Thanks for noticing. The Wikimedia dumps are constantly updated to reflect the newest version of Wikipedia. If you could make a PR updating the links, that would be great. It could point to here: https://dumps.wikimedia.org/enwiki/20190901/ for example (although this will eventually go out-of-date as well).

moinmir commented 4 years ago

I just noticed another error. In the download step, both the file name and the origin passed to get_file are wrong. It should be:

data_paths.append(get_file(path, dump_url + file))

instead of:

data_paths.append(get_file(file, dump_url))
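
For anyone applying this fix, here is a minimal sketch of the corrected download loop. It assumes get_file is the Keras utility (keras.utils.get_file), whose first argument is the local file name or path and whose second is the full URL of the file. The helper names build_download_args and download_all, the cache directory, and the example file name are hypothetical and not taken from the notebook.

```python
import os
from urllib.parse import urljoin


def build_download_args(files, dump_url, cache_dir):
    """Return one (local_path, origin_url) pair per dump file.

    Hypothetical helper: it only builds the arguments, so it can be
    checked without touching the network.
    """
    # Make sure the base ends with "/" so urljoin appends the file name
    # instead of replacing the last path segment.
    base = dump_url if dump_url.endswith("/") else dump_url + "/"
    return [(os.path.join(cache_dir, f), urljoin(base, f)) for f in files]


def download_all(files, dump_url, cache_dir):
    """Download every file and return the local paths reported by get_file."""
    # Imported here so the helper above works without Keras installed.
    from keras.utils import get_file

    data_paths = []
    for path, origin in build_download_args(files, dump_url, cache_dir):
        # origin must be the full file URL, not just the dump directory.
        data_paths.append(get_file(path, origin))
    return data_paths


if __name__ == "__main__":
    # Placeholder values for illustration only.
    dump_url = "https://dumps.wikimedia.org/enwiki/20190901/"
    files_to_download = ["example-part1.xml.bz2"]  # hypothetical file name
    for path, origin in build_download_args(files_to_download, dump_url, "/tmp/wiki"):
        print(path, "<-", origin)
```

Passing only dump_url as the origin would make every call request the directory index rather than the individual dump file, which is why the URL is built per file.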