Kaggle / kaggle-api

Error listing dataset files #370

Open rgharris opened 2 years ago

rgharris commented 2 years ago

This dataset has only directories in its root (the files are inside those directories). Attempting to list its files returns an error:

> kaggle datasets files jonathanbesomi/rsna-miccai-png
max() arg is an empty sequence

The CSV display option handles the empty file list without an error, although it reports no files even though the dataset does contain files.

> kaggle datasets files jonathanbesomi/rsna-miccai-png --csv
name,size,creationDate
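
The error message is what Python's built-in max() raises when it is given an empty sequence, which would fit a table printer that sizes its columns from the returned rows. The sketch below is an assumption about the failure mode, not the actual kaggle-api code; the function names column_widths and naive_column_widths are hypothetical.

```python
def column_widths(rows, fields):
    """Return the display width of each column, tolerating an empty row list."""
    widths = []
    for field in fields:
        # default=0 keeps max() from raising ValueError when rows is empty
        longest_value = max((len(str(row[field])) for row in rows), default=0)
        widths.append(max(len(field), longest_value))
    return widths


def naive_column_widths(rows, fields):
    """Same calculation without the guard; raises ValueError when rows is empty."""
    return [max(len(f), max([len(str(r[f])) for r in rows])) for f in fields]


fields = ["name", "size", "creationDate"]

# With no rows, the guarded version falls back to the header widths.
print(column_widths([], fields))

# The unguarded version fails the same way the table output does.
try:
    naive_column_widths([], fields)
except ValueError as err:
    print("ValueError:", err)
```

A guard like this would only stop the crash. The listing would still come back empty, which is a separate problem.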

Reproducing in Python

from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()

resp = api.process_response(api.datasets_list_files(
    owner_slug="jonathanbesomi", dataset_slug="rsna-miccai-png"))
print(resp)

> python ./test.py
{'datasetFiles': [], 'errorMessage': None}

System info

> kaggle -v
Kaggle API 1.5.12

> python --version
Python 3.9.0

jcsagar commented 2 years ago

So is there a workaround for this issue, or some other way to list files?

par1hsharma commented 2 years ago

I am facing the same issue. I want to download only the train dataset from https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia, but I am unable to do so and get exactly the same error.

shirley-wu commented 1 year ago

Same issue here; it is still not addressed. It seems the file listing does not include files under sub-directories, so the API lists fewer files than exist, or none at all. Similar issue: https://github.com/Kaggle/kaggle-api/issues/386
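
One possible workaround, not confirmed anywhere in this thread, is to download the whole dataset archive (for example with kaggle datasets download -d jonathanbesomi/rsna-miccai-png) and list its contents locally. The sketch below assumes the download is a zip file and uses only the Python standard library. The helper name list_archive_files and the demo paths are hypothetical, and the demo builds a small in-memory archive so that it runs without a download.

```python
import io
import zipfile


def list_archive_files(archive):
    """Return (name, size_in_bytes) for every file in a zip archive.

    Entries nested inside sub-directories are included; directory
    entries themselves are skipped.
    """
    with zipfile.ZipFile(archive) as zf:
        return [(info.filename, info.file_size)
                for info in zf.infolist()
                if not info.is_dir()]


# Demo: an in-memory archive with files only inside sub-directories.
# These paths are made up for illustration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("train/00000/FLAIR/image-1.png", b"abc")
    zf.writestr("test/00001/T1w/image-2.png", b"abcde")
buffer.seek(0)

print("name,size")
for name, size in list_archive_files(buffer):
    print(f"{name},{size}")
```

For a real download, pass the path of the zip file to list_archive_files instead of the in-memory buffer. This requires downloading the full archive first, so it does not help when only part of a dataset is wanted.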