huggingface / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
https://huggingface.co/docs/datasets
Apache License 2.0

Feature request : add leave=True to dataset.map to enable tqdm nested bars (and whilst we're at it couldn't we get a way to access directly tqdm underneath?) #3061

Open BenoitDalFerro opened 3 years ago

BenoitDalFerro commented 3 years ago

A clear and concise description of what you want to happen.

It would be really nice to be able to nest HuggingFace Datasets.map() progress bars inside outer progress bars, and whilst we're at it, why not those of other functions too.
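For context, here is a minimal sketch of the nesting behaviour being asked for, written against plain tqdm rather than the datasets API. The helper `process_shard` and the sample data are made up for illustration; `desc`, `position` and `leave` are regular tqdm arguments.

```python
from tqdm import tqdm


def process_shard(shard, leave=False):
    """Sum a shard while showing an inner progress bar (hypothetical helper)."""
    total = 0
    # position=1 draws this bar on the line below the outer bar;
    # leave controls whether the bar stays on screen once it finishes.
    for value in tqdm(shard, desc="inner", position=1, leave=leave):
        total += value
    return total


shards = [list(range(5)), list(range(5, 10))]

# The outer bar sits at position=0 and advances once per shard.
results = [process_shard(shard) for shard in tqdm(shards, desc="outer", position=0)]
```

With `leave=False` the inner bar is cleared after each shard, so only the outer bar remains; passing `leave=True` keeps each finished inner bar visible, which is the option the request would like to control from map().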

Describe alternatives you've considered

By the way, is there no way to interact directly with the underlying tqdm module? Something **kwargs-ish?

Additional context

This would further the tqdm integration from #2374 and huggingface/transformers#11797, which was resolved by huggingface/transformers#12226 and provided the tqdm description as desc=

@sgugger @bhavitvyamalik

bhavitvyamalik commented 3 years ago

@lhoestq, @albertvillanova can we have **tqdm_kwargs in map? If there are any fields that are important to our tqdm (like iterable or unit), we can pop them before initialising the tqdm object so as to avoid duplicates.
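A rough sketch of the "pop before initialising" idea could look like the following. It is not the library's code: `build_tqdm_kwargs`, the default values and the reserved key list are all hypothetical, chosen only to illustrate the merge.

```python
def build_tqdm_kwargs(defaults, user_kwargs, reserved=("iterable", "unit")):
    """Merge user-supplied tqdm options over library defaults.

    Keys listed in `reserved` are dropped from the user's options so they
    cannot clash with values the library sets itself (hypothetical helper).
    """
    cleaned = dict(user_kwargs)  # copy, so the caller's dict is not mutated
    for key in reserved:
        cleaned.pop(key, None)
    # User options win over defaults, except for the reserved keys.
    return {**defaults, **cleaned}


library_defaults = {"unit": "ex", "desc": "Map"}
user_options = {"unit": "rows", "leave": True, "desc": "Tokenizing"}
final_kwargs = build_tqdm_kwargs(library_defaults, user_options)
```

The resulting dict could then be unpacked into the progress bar call, for example `tqdm(iterable, **final_kwargs)`, without passing `unit` twice.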

lhoestq commented 3 years ago

Hi! Sounds like a good idea :)

Also, I think it would be better to have these as actual parameters instead of kwargs, to make it clearer.
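To illustrate the difference, here is a small sketch of what explicit parameters might look like. `map_with_bar` is a hypothetical stand-in, not the real map() signature, and the `bar` argument is an assumption added so the sketch runs without tqdm installed (in practice one would pass `tqdm` itself).

```python
def map_with_bar(function, items, *, desc=None, leave=True, position=None, bar=None):
    """Apply `function` to each item, optionally wrapping the loop in a progress bar.

    `desc`, `leave` and `position` are explicit keyword parameters, so they show
    up in the signature and in the docs (hypothetical stand-in for map()).
    """
    options = {"desc": desc, "leave": leave, "position": position}
    # `bar` is any tqdm-like callable taking (iterable, **options).
    wrapped = bar(items, **options) if bar is not None else items
    return [function(item) for item in wrapped]


# A recording stand-in for tqdm, used here only to show what gets forwarded.
seen_options = {}


def fake_bar(iterable, **options):
    seen_options.update(options)
    return iterable


doubled = map_with_bar(lambda x: x * 2, [1, 2, 3], desc="double", leave=False, bar=fake_bar)
```

One practical benefit of this design is that a misspelled option such as `leav=False` raises a `TypeError` straight away, whereas a free-form kwargs dict would only fail (or be silently ignored) further down.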