openml / openml-r

R package to interface with OpenML
http://openml.github.io/openml-r/

be able to upload data sets / tasks from R #78

Closed berndbischl closed 8 years ago

berndbischl commented 9 years ago

Joaquin opened a PR here. It contains 3 new files, which need to be reviewed.

jakobbossek commented 9 years ago

Reviewed, changed to new API and merged. We need a test for that.

giuseppec commented 9 years ago

Uploading datasets is now possible. However, the server automatically assigns the status "in_preparation", so we can't download the uploaded dataset (see https://github.com/openml/OpenML/issues/215). Uploading tasks from R is not possible because there does not seem to be an API for this (https://github.com/openml/OpenML/wiki/API-v1). Do we really need that?

joaquinvanschoren commented 9 years ago

The 'in preparation' status means that the dataset is not 'active' yet. I need to add a button on the website where you can change this status. Still, you should be perfectly able to download datasets that are not active?

A 'create task' API call will be added soon, hopefully this week.


giuseppec commented 9 years ago

Something strange happens for all datasets with the status in_preparation. Opening the direct link to the ARFF file of such a dataset in the browser, e.g. http://www.openml.org/data/download/1673502/file1c6c255c496d.arff, gives me the correct file (if I am logged in). Downloading the same file with R (which works for active datasets) gives me an HTML file that says "You don't have the right access rights to view this page.":

filedir = tempfile(fileext = ".arff")
download.file("http://www.openml.org/data/download/1673502/file1c6c255c496d.arff", destfile = filedir)
readLines(filedir)
# [1] "<!doctype html>"                                                      
# [2] "<html lang=\"en\">"                                                   
# [3] "<head>"                                                               
# [4] "  <meta charset=\"utf-8\">"                                           
# [5] "  <title>Forbidden</title>"                                           
# [6] "</head>"                                                              
# [7] "  <body>"                                                             
# [8] "    <h3>403 Forbidden</h3>"                                           
# [9] "    <p>You don't have the right access rights to view this page. </p>"
#[10] "  </body>"                                                            
#[11] "</html>"  

I will take a closer look at this. But does this make sense from the server side? The same code works with an active dataset:

filedir = tempfile(fileext = ".arff")
download.file("http://www.openml.org/data/download/61/dataset_61_iris.arff", destfile = filedir)
head(readLines(filedir))
#[1] "% 1. Title: Iris Plants Database"                                 
#[2] "% "                                                               
#[3] "% 2. Sources:"                                                    
#[4] "%      (a) Creator: R.A. Fisher"                                  
#[5] "%      (b) Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)"
#[6] "%      (c) Date: July, 1988"    

giuseppec commented 9 years ago

I think I know what the problem is:

1) If you use your internet browser and are not logged in to OpenML, you will get the same 403 Forbidden for the URL http://www.openml.org/data/download/1673502/file1c6c255c496d.arff (you have to sign out of openml.org first). This is the data set from http://www.openml.org/d/1775, which is in_preparation.
2) Datasets that are active can be downloaded even if you are not logged in. Try, for example, this link: http://www.openml.org/data/download/61/dataset_61_iris.arff
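
Since an unauthenticated request for an in_preparation dataset can leave an HTML error page on disk instead of ARFF data, a client-side check might catch this early. The following is only a minimal sketch: `is_html_error_page` and `download_arff` are hypothetical helper names, not functions of the OpenML R package, and the sketch assumes that `download.file` writes the server's response body to disk as in the example above (depending on the download method, it may instead raise an error or a warning for a 403).

```r
# Hypothetical helper (not part of the OpenML R package): decide whether the
# lines read from a downloaded file look like an HTML page rather than ARFF.
is_html_error_page = function(lines) {
  # Strip surrounding whitespace and drop empty lines, so that leading
  # blank lines do not hide the doctype or the <html> tag.
  lines = trimws(lines)
  lines = lines[nzchar(lines)]
  if (length(lines) == 0L) return(FALSE)
  grepl("^<!doctype html|^<html", lines[1L], ignore.case = TRUE)
}

# Hypothetical wrapper: download a file and stop with a clear message if the
# server answered with an HTML page (such as the 403 page shown earlier).
download_arff = function(url, destfile = tempfile(fileext = ".arff")) {
  download.file(url, destfile = destfile, quiet = TRUE)
  lines = readLines(destfile, warn = FALSE)
  if (is_html_error_page(lines)) {
    stop("Server returned an HTML page instead of an ARFF file: ", url)
  }
  invisible(destfile)
}
```

With this check, calling `download_arff` on the URL of the in_preparation dataset would be expected to stop with an explicit error instead of silently handing an HTML file to the ARFF parser, while the iris URL should pass through unchanged.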

berndbischl commented 8 years ago

Not now, maybe in 1.2