practical-tutorials / project-based-learning

Curated list of project-based tutorials
MIT License
202.93k stars 26.48k forks

Add a tutorial for distributed training #359

Closed Sze-qq closed 2 years ago

Sze-qq commented 2 years ago

Description

I'd love to share a tutorial that shows, step by step, how to train an AI model in a distributed manner.

Motivation and Context

As deep learning models grow larger, training them in a non-distributed manner becomes difficult. This tutorial clearly illustrates to me that

It is quite interesting and helpful for a deep learning researcher.

How Has This Been Tested?

There exist many popular parallelization techniques for distributed training, but I lacked a general picture of them. This tutorial gave me a clear view of these advanced techniques, and I could apply all of them to my code after reading it. I feel I can write distributed deep learning models just as I write models on my laptop. Most importantly, it can greatly reduce my training time. Hence, I recommend it to anyone who needs to train deep learning models.

Types of changes

Checklist:

enkeyz commented 2 years ago

I can only accept complete projects. If, for example, you make a tutorial about a complete project with this AI, I'll gladly merge it.