huggingface / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
https://huggingface.co/docs/datasets
Apache License 2.0

`from_parquet` return type annotation #7202

Open saiden89 opened 1 month ago

saiden89 commented 1 month ago

Describe the bug

As already reported in https://github.com/microsoft/pylance-release/issues/6534, type inference fails for a dataset built with the `from_parquet` constructor. The suggestion there is to fully annotate the method's return type so that it matches what the docstring describes.

Steps to reproduce the bug

from datasets import Dataset

dataset = Dataset.from_parquet(path_or_paths="file")
dataset.map(lambda x: {"new": x["old"]}, batched=True)

Expected behavior

`map` is a valid method on the returned dataset, so the type checker should not report an error.
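To illustrate the kind of fix being requested, here is a minimal, self-contained sketch. It does not use the real `datasets` code: `FakeDataset`, `FakeDatasetDict`, `_read`, `from_rows_unannotated` and `from_rows_annotated` are all hypothetical names, and the assumption that the inferred return type is a union wider than `Dataset` is mine, not something confirmed here. The point is only that a constructor without a return annotation leaves the checker to infer the type from the body, while an explicit annotation pins it down.

```python
from typing import Callable, Dict, List, Union

Row = Dict[str, int]


class FakeDatasetDict(dict):
    """Hypothetical stand-in for a mapping of splits; it has no row-wise `map`."""


class FakeDataset:
    """Hypothetical stand-in for `datasets.Dataset`."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows

    def map(self, function: Callable[[Row], Row]) -> "FakeDataset":
        # Apply `function` to every row and wrap the results in a new dataset.
        return FakeDataset([function(row) for row in self.rows])

    @staticmethod
    def _read(rows: List[Row], as_dict: bool = False) -> Union["FakeDataset", FakeDatasetDict]:
        # Assumed reader helper that may return more than one type.
        if as_dict:
            return FakeDatasetDict(train=FakeDataset(rows))
        return FakeDataset(rows)

    @staticmethod
    def from_rows_unannotated(rows):
        # No return annotation: a checker must infer the type from the body,
        # which here is the union declared on `_read`.
        return FakeDataset._read(rows)

    @staticmethod
    def from_rows_annotated(rows: List[Row]) -> "FakeDataset":
        # Explicit return annotation: the checker uses the declared type.
        result = FakeDataset._read(rows)
        assert isinstance(result, FakeDataset)  # narrows the union
        return result


dataset = FakeDataset.from_rows_annotated([{"old": 1}, {"old": 2}])
mapped = dataset.map(lambda x: {"new": x["old"]})
print(mapped.rows)
```

With the annotated constructor, a static checker should accept the `.map(...)` call without complaint. With the unannotated one, under the stated assumption, it would likely flag `map` as possibly missing on one member of the inferred union.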

Environment info