🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Concurrent loading in `load_from_disk` - `num_proc` as a param #7286
Closed
unography closed 1 week ago
Feature request
https://github.com/huggingface/datasets/pull/6464 mentions a `num_proc` param for loading a dataset from disk, but I can't find it anywhere in the documentation or the code.
Motivation
Make loading large datasets from disk faster
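To illustrate the kind of speed-up I have in mind, here is a rough, standard-library-only sketch of reading several shard files concurrently. It is not the `datasets` API: `read_shard`, `load_shards_concurrently` and `demo` are hypothetical names, the shards are small JSON files rather than Arrow files, and `num_proc` here only sets the number of worker threads (threads, not processes), which is an assumption on my side about how such a param could behave.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_shard(path):
    # Hypothetical stand-in for reading one on-disk shard.
    # Real dataset shards are Arrow files; JSON keeps the sketch self-contained.
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def load_shards_concurrently(shard_paths, num_proc=None):
    # Hypothetical helper, not part of any library.
    # num_proc of None or 1 falls back to a plain sequential loop.
    if not num_proc or num_proc <= 1:
        return [read_shard(p) for p in shard_paths]
    # Assumption: num_proc is used as the number of worker threads.
    with ThreadPoolExecutor(max_workers=num_proc) as pool:
        # Executor.map returns results in input order, so shard order is kept.
        return list(pool.map(read_shard, shard_paths))


def demo():
    # Write four tiny shards to a temporary directory, then load them both ways.
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(4):
            path = Path(tmp) / f"shard-{i:05d}.json"
            path.write_text(
                json.dumps({"shard": i, "rows": [i * 10, i * 10 + 1]}),
                encoding="utf-8",
            )
            paths.append(path)
        sequential = load_shards_concurrently(paths)
        concurrent = load_shards_concurrently(paths, num_proc=4)
        return sequential, concurrent
```

Both code paths should return the same shards in the same order; whether the threaded version is actually faster would depend on shard count, shard size and the storage backend.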
Your contribution
Happy to contribute if given pointers