Kaggle / kaggle-api

Official Kaggle API
Apache License 2.0
6.01k stars 1.06k forks

Incorrect assumption that downloaded dataset is encoded as UTF-8 #505

Open aprknight opened 9 months ago

aprknight commented 9 months ago

In rest.py we have this line:

But what if the data are not encoded as UTF-8?
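To illustrate the concern: bytes written in another encoding (Latin-1 here) can fail outright when decoded as UTF-8. The snippet below is only a sketch of one possible fallback, not the code in rest.py; `decode_response_body` and its `declared_encoding` parameter are hypothetical names made up for this example.

```python
from typing import Optional


def decode_response_body(raw: bytes, declared_encoding: Optional[str] = None) -> str:
    """Decode bytes using a declared encoding if given, then UTF-8, then Latin-1.

    Hypothetical helper for illustration only; not part of kaggle-api.
    """
    candidates = [enc for enc in (declared_encoding, "utf-8") if enc]
    for enc in candidates:
        try:
            return raw.decode(enc)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown encoding: try the next candidate.
            continue
    # Latin-1 maps every byte to a code point, so this never raises,
    # but the text may be wrong if the real encoding was something else.
    return raw.decode("latin-1")


latin1_bytes = "café".encode("latin-1")

# Assuming UTF-8 unconditionally fails on these bytes.
try:
    latin1_bytes.decode("utf-8")
    strict_utf8_failed = False
except UnicodeDecodeError:
    strict_utf8_failed = True

recovered = decode_response_body(latin1_bytes)
```

A fallback like this avoids the exception, but it cannot know the real encoding. Returning the raw bytes and letting the caller decide would be another option.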

yashu1wwww commented 1 month ago

To download Kaggle datasets into Google Drive:

First, generate an API key in your Kaggle settings. Then, open Google Colab and run:

!pip install opendatasets

When the download step runs, it will prompt you to enter your Kaggle username and API key. Before running the script, replace dataset_url with the URL of the Kaggle dataset you want, and set new_folder_name to the name of the folder to create.

import os
import shutil

import opendatasets as od
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')

# Define the output directory in Google Drive where the dataset will be downloaded
output_dir = '/content/drive/MyDrive/Kaggle_Datasets'

# Define the name of the new folder to be created inside the output directory
new_folder_name = 'Embryo_Classification'

# Create the new folder if it doesn't exist
new_folder_path = os.path.join(output_dir, new_folder_name)
os.makedirs(new_folder_path, exist_ok=True)

# Define the Kaggle dataset URL
dataset_url = 'https://www.kaggle.com/datasets/gauravduttakiit/embryo-classification-efficientnet/data'

# Download the dataset to the specified directory in Google Drive
od.download(dataset_url, data_dir=new_folder_path)

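# Move the downloaded zip file to the new folder, if one was left in output_dir
# (the download step may already have placed the files under new_folder_path)
zip_file_name = 'embryo-classification-efficientnet.zip'
zip_file_path = os.path.join(output_dir, zip_file_name)
if os.path.exists(zip_file_path):
    shutil.move(zip_file_path, new_folder_path)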
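To check where the files actually ended up after the download, a small helper like the one below can walk the target folder. This is only a sketch and not part of opendatasets; `list_downloaded_files` is a hypothetical name, and it uses nothing beyond the standard library.

```python
import os
from typing import List


def list_downloaded_files(root: str) -> List[str]:
    """Return the sorted paths of all files under root, relative to root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            found.append(os.path.relpath(full_path, root))
    return sorted(found)


# In the Colab script above this would be called as:
#     list_downloaded_files(new_folder_path)
```

An empty list would suggest the dataset landed somewhere other than the folder you expected.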