SheffieldML / GPy

Gaussian processes framework in python
BSD 3-Clause "New" or "Revised" License
2.04k stars, 562 forks

Olympic Sprints data download link broken #671

Open wilocw opened 6 years ago

wilocw commented 6 years ago

I'm trying to prepare a short notebook on multi-output GPs for the summer school, based on last year's, but I am unable to download the Olympic sprints data from either GPy.util.datasets or pods.datasets because the download URL points to a non-existent Dropbox location (it may have expired or moved).

Downloading  https://www.dropbox.com/sh/7p6tu1t29idgliq/_XqlH_3nt9/firstcoursemldata.tar.gz -> /home/wil/ods_data_cache/rogers_girolami_data/firstcoursemldata.tar.gz
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/pods/util.py in download_url(url, dir_name, save_name, store_directory, messages, suffix)
     29     try:
---> 30         response = urlopen(url+suffix)
     31     except URLError(e):

These links could do with updating, preferably urgently, so I can add them to the notebook. If that isn't possible, can you direct me to the original file so I can host it locally for the purpose of the summer school?

[Also highlighted https://github.com/SheffieldML/notebook/issues/15]

Cheers, Wil

lawrennd commented 6 years ago

Hi Wil,

I was just looking at Simon’s website for the book, which still lists Dropbox as the key resource, but the data is also available here:

http://www.dcs.gla.ac.uk/~srogers/firstcourseml/firstcoursemldata.tar.gz

I’ll see if I can update the links.

Neil
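
For anyone who needs the file before the links are fixed, here is a minimal sketch of fetching the archive from the mirror above and unpacking it with the Python standard library. The function name `fetch_and_extract` and the target directory are made up for illustration. The sketch bypasses GPy.util.datasets and pods entirely, and it assumes nothing about which files the archive contains.

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path

# Mirror URL quoted in the comment above.
DATA_URL = "http://www.dcs.gla.ac.uk/~srogers/firstcourseml/firstcoursemldata.tar.gz"


def fetch_and_extract(url, dest_dir):
    """Download a .tar.gz archive from `url` and unpack it into `dest_dir`.

    Hypothetical helper, not part of GPy or pods.
    Returns the sorted list of member names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Save the archive under its own file name (last component of the URL).
    archive_path = dest / url.rsplit("/", 1)[-1]

    # Stream the response to disk instead of reading it all into memory.
    with urllib.request.urlopen(url) as response, open(archive_path, "wb") as out:
        shutil.copyfileobj(response, out)

    # Only unpack archives from a source you trust.
    with tarfile.open(archive_path, "r:gz") as archive:
        names = sorted(archive.getnames())
        archive.extractall(dest)
    return names


# Example call (needs network access; the directory name is arbitrary):
# members = fetch_and_extract(DATA_URL, "rogers_girolami_data")
```

Calling it prints nothing; inspecting the returned list of member names is the quickest way to find the Olympic sprints file without guessing its path.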

[In reply to Wil O. C. Ward's message of Fri, 31 Aug 2018 at 15:50, quoted in full above.]

lawrennd commented 6 years ago

@wilocw I've flagged the broken link with Simon, so with luck he'll be able to address that.

I'll leave it up to you to decide whether changing it in data_resources.json (in GPy.util) is worthwhile or not ...

If you do change that, don't forget to remove the suffix in the suffices field as well!
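
As a rough sketch of what that edit amounts to, the snippet below repoints one resource entry and blanks its suffix. The entry layout is an assumption: a "urls" list, a nested "files" list and a matching nested "suffices" list, with each download formed as url + file + suffix. Check it against the real data_resources.json before relying on it. The "?dl=1" suffix value and both function names are illustrative only.

```python
import copy


def repoint_resource(entry, new_base_url):
    """Return a copy of `entry` that downloads from `new_base_url` with no suffix.

    Hypothetical helper; the key names "urls" and "suffices" are assumed.
    """
    updated = copy.deepcopy(entry)
    updated["urls"] = [new_base_url]
    # Keep the nested-list shape but blank every suffix, so nothing is
    # appended after the file name when the download URL is built.
    updated["suffices"] = [["" for _ in group] for group in updated.get("suffices", [])]
    return updated


def download_targets(entry):
    """List the full URLs an entry resolves to under the assumed layout."""
    targets = []
    for url, files, suffixes in zip(entry["urls"], entry["files"], entry["suffices"]):
        for name, suffix in zip(files, suffixes):
            targets.append(url + name + suffix)
    return targets


# Illustrative entry: the URL and file name come from the error message in
# this issue, the suffix value is an assumed placeholder.
old_entry = {
    "urls": ["https://www.dropbox.com/sh/7p6tu1t29idgliq/_XqlH_3nt9/"],
    "files": [["firstcoursemldata.tar.gz"]],
    "suffices": [["?dl=1"]],
}

new_entry = repoint_resource(old_entry, "http://www.dcs.gla.ac.uk/~srogers/firstcourseml/")
print(download_targets(new_entry))
```

Leaving the old suffix in place would append a Dropbox-style query string to the new URL, which is why the two fields need changing together.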

wilocw commented 6 years ago

Oh, thanks for sorting that out. I'd meant to comment that I'd found the file yesterday but never clicked Comment. I've just extracted the relevant data and will upload it as a CSV with the notebook and relevant citation for now. I think it's perhaps too late to rely on updating GPy, since we sent out download instructions earlier this week. I'll look into it after the summer school, because I'm not so familiar with the internals of GPy.util. The MOGP notebook is going to be an "Extra work" anyway so this is low priority.

Cheers!