Open finmod opened 8 years ago
@jameshensman do you still have the files?
I tried to move most of these types of things across as I found them. Certainly spellman is in pods, but I'm not sure about drosophila.
It's a good example of why we developed pods!
If we can recover the datasets let's try and get them integrated.
On Sun, Mar 6, 2016 at 10:27 AM, Max Zwiessele notifications@github.com wrote:
@jameshensman https://github.com/jameshensman do you still have the files?
Is there any news on the drosophila data?
No, but I established that using pods is better than using GPy.utils to access the dataset files. This is with GPy-devel. All in all, I managed to assemble a complete "datasets" folder from various sources and packages in SheffieldML. From that, I was able to build the drosophila.knirps file required by Hierarchical.ipynb and eliminate the direct access to Lab3 in that notebook.
That's great. Yes, pods is the right place to do this.
Did you do a pull request for an updated version of the notebook?
On Tue, May 3, 2016 at 3:09 PM, finmod notifications@github.com wrote:
No, I established that using pods is better than using GPy.utils to access the dataset files. This is with GPy-devel. All in all, I managed to put a complete folder "datasets" from various sources and packages in SheffieldML. Hence, I managed to form the drosophila.knirps file required by Hierarchical.ipynb and eliminate direct access to Lab3 in that notebook.
Here's the drosophila data if someone wants to add it. dros.zip
Thank you James for this data file. With the kalinka09_mel.csv and kalinka09_mel_pdata.csv files extracted into the compbio folder, Hierarchical.ipynb is now running fine.
Note that kalinka09_mel is a lighter version than the one I downloaded from the original source using pods.
To recap the fix:
1) Extract the two kalinka09 files to the compbio folder;
2) Comment out the urllib download in Hierarchical.ipynb and load the local files instead, as follows:
expression = np.loadtxt('kalinka09_mel.csv', delimiter=',', usecols=range(1, 57))
gene_names = np.loadtxt('kalinka09_mel.csv', delimiter=',', usecols=[0], dtype=str)
replicates, times = np.loadtxt('kalinka09_mel_pdata.csv', delimiter=',').T
expression -= expression.mean(1)[:,np.newaxis]
expression /= expression.std(1)[:,np.newaxis]
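For anyone who wants to check the loading and normalisation steps without the real files, here is a minimal sketch. It assumes a tiny made-up CSV in place of kalinka09_mel.csv (the gene names and values are hypothetical), and `standardize_rows` is an illustrative helper, not something from the notebook. It only shows that each gene's row ends up with zero mean and unit standard deviation.

```python
import io

import numpy as np


def standardize_rows(expression):
    """Center each row to zero mean and scale it to unit standard deviation."""
    expression = np.asarray(expression, dtype=float)
    centered = expression - expression.mean(axis=1, keepdims=True)
    return centered / centered.std(axis=1, keepdims=True)


# Hypothetical stand-in for kalinka09_mel.csv: a gene name, then expression values.
csv_text = "geneA,1.0,2.0,3.0,4.0\ngeneB,10.0,10.5,9.5,10.0\n"

# Same loadtxt pattern as in the notebook, but with 4 value columns instead of 56.
expression = np.loadtxt(io.StringIO(csv_text), delimiter=",", usecols=range(1, 5))
gene_names = np.loadtxt(io.StringIO(csv_text), delimiter=",", usecols=[0], dtype=str)

normalized = standardize_rows(expression)
```

With the real file, `usecols=range(1, 57)` selects the 56 expression columns, as in the lines above.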
Running all 8 notebooks in the compbio folder requires a similar data file for this line in TFA_with_Coregion-1.ipynb:
Y=np.load("/users/suraalrashid/expression.npy")
I could not locate the suraalrashid data anywhere.
From: James Hensman [mailto:notifications@github.com] Sent: Thursday, May 5, 2016 9:14 AM To: SheffieldML/notebook notebook@noreply.github.com Cc: finmod denis.richard@dr.com; Author author@noreply.github.com Subject: Re: [SheffieldML/notebook] Accessing "staffwww.dcs.sheffield.ac.uk/people/J.Hensman" data (#8)
Here's the drosophila data if someone wants to add it. dros.zip https://github.com/SheffieldML/notebook/files/250051/dros.zip
Hello James,
As a logical step after running hierarchical.ipynb, in deepGPy (configuration: Linux (Ubuntu) on VM VirtualBox, python 2.7 and Anaconda 2.5), two questions arise about plotting:
1) Nested Deep GP.ipynb stops abruptly when producing Fig. 4 of the paper, on the robot wireless data. This has been raised as issue #5 in deepGPy;
2) The same problem occurs with Figure 3, the two-dimensional toy demo in the Gaussian Processes with Big Data paper.
It would be nice if you could make available the code for these two plots because they convey a telling message for otherwise complex processes.
Thank you.
There is a common problem accessing compbio and other datasets: drosophila, spellman yeast, Lab3.zip and others. This is in addition to migrating matplotlib and pods to Python 3. Shouldn't these datasets be integrated properly into pods, to provide a homogeneous set of test notebooks (gprs, gpss) and a "datasets" folder?
The error is:
C:\Users\Denis\Anaconda3\lib\urllib\request.py in http_error_default(self, req, fp, code, msg, hdrs)
    587 class HTTPDefaultErrorHandler(BaseHandler):
    588     def http_error_default(self, req, fp, code, msg, hdrs):
--> 589         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    590
    591 class HTTPRedirectHandler(BaseHandler):
HTTPError: HTTP Error 403: Forbidden
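Until the files are hosted somewhere reachable, one way to make a notebook cell more forgiving is to prefer a local copy and only attempt the download when it is missing. The following is a rough sketch under assumptions, not code from the notebooks: `fetch_or_local` and `forbidden` are hypothetical names, the `example.invalid` URLs are placeholders, and the download function is passed in as a parameter so that the 403 case can be simulated without network access.

```python
import os
import tempfile
import urllib.error
import urllib.request


def fetch_or_local(url, local_path, fetch=urllib.request.urlretrieve):
    """Return local_path, downloading from url only if the file is missing.

    `fetch` is injectable so the failure path can be exercised offline.
    """
    if os.path.exists(local_path):
        return local_path
    try:
        fetch(url, local_path)
    except urllib.error.HTTPError as err:
        raise FileNotFoundError(
            "Could not download {} (HTTP {}); place the file at {} manually.".format(
                url, err.code, local_path
            )
        ) from err
    return local_path


def forbidden(url, filename):
    """Hypothetical stand-in for a server that answers 403 Forbidden."""
    raise urllib.error.HTTPError(url, 403, "Forbidden", None, None)


with tempfile.TemporaryDirectory() as tmp:
    # Case 1: the file was extracted by hand, so no download is attempted.
    present = os.path.join(tmp, "kalinka09_mel.csv")
    with open(present, "w") as handle:
        handle.write("geneA,1.0,2.0\n")
    used_local = (
        fetch_or_local("http://example.invalid/kalinka09_mel.csv", present, fetch=forbidden)
        == present
    )

    # Case 2: the file is missing and the server refuses the request.
    missing = os.path.join(tmp, "missing.csv")
    try:
        fetch_or_local("http://example.invalid/missing.csv", missing, fetch=forbidden)
        error_message = None
    except FileNotFoundError as err:
        error_message = str(err)
```

In a notebook the default `fetch` would be left in place; the point is only that a missing or forbidden download produces a message naming the file to supply by hand, rather than a bare urllib traceback.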