huggingface / accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
https://huggingface.co/docs/accelerate
Apache License 2.0

Can I load model once or dataset once and copy to subprocess? #3172

Open Hans-digit opened 2 weeks ago

Hans-digit commented 2 weeks ago

One of my colleagues ran into this problem.

He has some messy preprocessing in the __init__ method of a custom dataset class.

In this case, is there a way to run the preprocessing only once, in the main process, and copy the resulting dataset object to the other subprocesses?

Similarly, is there a way to create the model once and copy it to the other subprocesses, so that the model's __init__ method is called only once?

Thanks for your help.
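
To make the dataset half of the question concrete, here is a minimal sketch of one common pattern: the main process runs the expensive preprocessing once and caches the result on disk, and every process then loads the cached copy after a barrier. The helper name `load_or_build`, the cache path, and the use of pickle are illustrative assumptions, not part of Accelerate. The sketch also assumes that all processes can read the same file, for example on a single node or a shared filesystem.

```python
import os
import pickle


def load_or_build(cache_path, build_fn, is_main_process, wait_for_everyone):
    """Build once on the main process, cache to disk, then load everywhere.

    Hypothetical helper (not an Accelerate API). `is_main_process` is a bool
    and `wait_for_everyone` is a zero-argument callable acting as a barrier.
    """
    if is_main_process and not os.path.exists(cache_path):
        data = build_fn()  # the expensive preprocessing runs only here
        tmp_path = cache_path + ".tmp"
        with open(tmp_path, "wb") as f:
            pickle.dump(data, f)
        # Rename after writing so the final path never holds a partial file.
        os.replace(tmp_path, cache_path)

    # All processes block here until the main process has finished writing.
    wait_for_everyone()

    with open(cache_path, "rb") as f:
        return pickle.load(f)


# Possible wiring with Accelerate (assumed usage, shown as comments only):
#   accelerator = Accelerator()
#   data = load_or_build(
#       "dataset.pkl",                  # hypothetical cache location
#       run_messy_preprocessing,        # hypothetical user function
#       accelerator.is_main_process,
#       accelerator.wait_for_everyone,
#   )
```

The dataset's __init__ can then accept the already preprocessed data instead of recomputing it. For the model, each process still needs to construct its own model object, so a disk cache of this kind would at most save recomputing the weights, not the __init__ call itself.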

Hans-digit commented 2 weeks ago

I found an answer for the dataset class here:

https://github.com/huggingface/accelerate/issues/3001

I have not found an answer for the model yet.