Kaggle / kaggle-api

Official Kaggle API
Apache License 2.0
6.21k stars 1.09k forks

Improve API design: proper python library not just CLI #23

Open morenoh149 opened 6 years ago

morenoh149 commented 6 years ago

Since many teams collaborate in Python notebooks, it would be useful to expose a Python module interface that lets users specify their config with variables and download competition datasets. Good idea?

ghost commented 6 years ago

@morenoh149 do you mean being able to use the API within kernels? If so, that is on my long-term to-do list, but I don't know when it will be possible.

morenoh149 commented 6 years ago

No, I'd like to use it within notebooks (Colab included) to load Kaggle datasets into memory. Mainly, the library should accept the secret keys as variables instead of relying on a file that holds them (the way it currently works).
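
A minimal sketch of what "passing in variables for the secret keys" could look like from a notebook. It assumes the CLI picks up credentials from the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, as the kaggle-api README describes; `kaggle_env` and `run_kaggle` are hypothetical helper names for illustration, not part of the library.

```python
import os
import subprocess


def kaggle_env(username, key, base_env=None):
    """Return a copy of the environment with Kaggle credentials added.

    Assumption: the CLI reads KAGGLE_USERNAME and KAGGLE_KEY.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["KAGGLE_USERNAME"] = username
    env["KAGGLE_KEY"] = key
    return env


def run_kaggle(args, username, key):
    """Run the `kaggle` CLI with credentials supplied as variables.

    `args` is a list of CLI arguments, e.g. ["competitions", "list"].
    Raises subprocess.CalledProcessError on a non-zero exit code.
    """
    return subprocess.run(
        ["kaggle", *args],
        env=kaggle_env(username, key),
        capture_output=True,
        text=True,
        check=True,
    )
```

Something like `run_kaggle(["competitions", "list"], my_username, my_key).stdout` would then return the CLI output as a string. The credentials are set only for the child process, so nothing is written to disk and the notebook's own environment stays unchanged.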

tonypang1228 commented 5 years ago

I have the same request. Currently I use Python to capture and process data before uploading it and running the Kaggle kernels. It would be nice to use the API from Python directly, rather than through the CLI, to upload and run the kernels. Thanks.

mirekphd commented 5 years ago

@morenoh149: maybe you could rename the issue to something like "Improve API design: proper python library not just CLI"? The API is very limited at the moment, since it forces you to switch from Python to bash for any automation work...

mirekphd commented 5 years ago

The best I could do to incorporate Kaggle API calls (here: submitting predictions and getting the list of submissions) into my Python code was to use os.system(), like this:

### submit model predictions ###

import os
import shlex

# competition_name, local_scores_path and cur_preds_file are assumed to be
# defined earlier in the notebook

# submit model predictions file (-f) using its name as annotation (-m);
# shlex.quote() keeps names with spaces or shell metacharacters intact
cmd = 'kaggle competitions submit ' + shlex.quote(competition_name) + \
    ' -f ' + shlex.quote(local_scores_path + cur_preds_file) + \
    ' -m ' + shlex.quote(cur_preds_file)
print(cmd)

# invoke the Kaggle API CLI using a system call
returned_value = os.system(cmd)

print('Returned value:', returned_value)
assert returned_value == 0

### retrieve submissions list ###

import pandas as pd

# obtain recent submissions list and save it to a temp CSV file
cmd = 'kaggle competitions submissions ' + shlex.quote(competition_name) + \
    ' --csv > /tmp/submissions.csv'
print(cmd)

# invoke the Kaggle API CLI using a system call
returned_value = os.system(cmd)

print('Returned value:', returned_value)
assert returned_value == 0

# re-import submissions from the temp CSV file to a pandas DataFrame
recent_submissions_list_df = pd.read_csv('/tmp/submissions.csv')
recent_submissions_list_df
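
A possible variant of the same workaround, sketched with `subprocess.run` instead of `os.system()`: it passes the arguments as a list (so no shell quoting is needed) and reads the CSV from the captured output rather than a temp file. The helper names `parse_csv_text` and `list_submissions` are hypothetical, and the sketch assumes the `kaggle` CLI is installed and configured.

```python
import csv
import io
import subprocess


def parse_csv_text(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def list_submissions(competition_name):
    """Return recent submissions for a competition as a list of dicts.

    Calls `kaggle competitions submissions <name> --csv` and parses
    the CSV printed to stdout. Raises subprocess.CalledProcessError
    if the CLI exits with a non-zero code.
    """
    result = subprocess.run(
        ["kaggle", "competitions", "submissions", competition_name, "--csv"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_csv_text(result.stdout)
```

If a DataFrame is preferred, the captured text can presumably be fed to pandas in the same way, with `pd.read_csv(io.StringIO(result.stdout))`.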