sagemathinc / cocalc

CoCalc: Collaborative Calculation in the Cloud
https://CoCalc.com

support very easily importing any kaggle dataset #8038

Open williamstein opened 8 hours ago

williamstein commented 8 hours ago

Kaggle has datasets like this -- https://www.kaggle.com/datasets/pranavchandane/scut-fbp5500-v2-facial-beauty-scores

However, Kaggle does NOT make it easy to import them in a generic way, probably to prevent abuse, track users, throttle downloads, etc. There's no direct wget-able download link, as far as I can tell.

There is, however, a standard API client for Kaggle, and a user with a Kaggle account can use that command line tool to import a dataset via the Linux terminal. We should make that much easier, tested, and supported.

In the meantime, a possible workaround: use +New --> Linux Terminal, then use the terminal as explained in the top answer here to import your data:

https://stackoverflow.com/questions/45261190/how-to-get-kaggle-competition-data-via-command-line-on-virtual-machine

The Kaggle API they mention is documented here:

https://github.com/Kaggle/kaggle-api#download-competition-files
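As a rough sketch of what "easy" could look like, something like the following could turn a dataset page URL into the matching `kaggle` command line call. The helper names `dataset_ref_from_url` and `build_download_command` and the `data` target directory are hypothetical, made up for illustration. It assumes the `kaggle` client is installed (`pip install kaggle`), and that `kaggle datasets download` accepts `-d`, `-p` and `--unzip` in the installed version, which is worth checking against `kaggle datasets download -h`.

```python
import shlex
from urllib.parse import urlparse


def dataset_ref_from_url(url: str) -> str:
    """Extract the "owner/dataset-slug" reference from a Kaggle dataset URL.

    Hypothetical helper, not part of the kaggle package.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected path shape: /datasets/<owner>/<dataset-slug>
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Kaggle dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def build_download_command(url: str, dest: str = "data") -> list[str]:
    """Build the argument list for the kaggle CLI (assumed flags: -d, -p, --unzip)."""
    return [
        "kaggle", "datasets", "download",
        "-d", dataset_ref_from_url(url),
        "-p", dest,      # directory to download into
        "--unzip",       # unpack the archive after downloading
    ]


if __name__ == "__main__":
    url = "https://www.kaggle.com/datasets/pranavchandane/scut-fbp5500-v2-facial-beauty-scores"
    # Print the command so it can be pasted into a Linux terminal;
    # subprocess.run(build_download_command(url), check=True) would run it directly.
    print(shlex.join(build_download_command(url)))
```

The command only works once the user's Kaggle API token is in place, normally in `~/.kaggle/kaggle.json` or via the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, so any built-in support would still need a step where the user provides their own credentials.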