bclarkson-code / Tricycle

Autograd to GPT-2 completely from scratch

Multi-GPU support #70

Open bclarkson-code opened 2 months ago

bclarkson-code commented 2 months ago

Currently, only single-GPU computation is supported. Multi-GPU support with NCCL should be added.

bclarkson-code commented 1 month ago

We can start with simple data parallelism, but I would also like to add ZeRO so we can scale to more GPUs in the future.
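
For reference, a rough sketch of what simple data parallelism means here: each worker holds a full copy of the weights, computes gradients on its own shard of the batch, and an all-reduce averages those gradients so every replica applies the same update. This is a pure-Python simulation under stated assumptions, not Tricycle code and not a real NCCL call. The names `grad_mse`, `all_reduce_mean`, and `data_parallel_step` are hypothetical, and the "workers" are just list entries.

```python
# Simulation only: no GPUs, no NCCL, no Tricycle APIs. All names are hypothetical.

def grad_mse(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to the scalar weight w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def all_reduce_mean(values):
    """Stand-in for an all-reduce: every worker ends up with the mean of all values."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def data_parallel_step(w, xs, ys, n_workers, lr):
    """One SGD step with the batch split evenly across n_workers replicas."""
    assert len(xs) % n_workers == 0, "this sketch assumes equal-sized shards"
    shard = len(xs) // n_workers

    # Each worker computes a gradient on its own slice of the batch.
    local_grads = [
        grad_mse(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
        for i in range(n_workers)
    ]

    # Synchronise gradients, then apply the same update on every replica
    # so the weight copies stay identical.
    synced = all_reduce_mean(local_grads)
    return [w - lr * g for g in synced]


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

replicas = data_parallel_step(0.0, xs, ys, n_workers=2, lr=0.01)
single_gpu = 0.0 - 0.01 * grad_mse(0.0, xs, ys)
```

With equal-sized shards, the averaged gradient matches the full-batch gradient, so `replicas` all hold the same value as `single_gpu`. ZeRO goes further by partitioning optimizer state (and, in later stages, gradients and parameters) across workers instead of replicating them, which this sketch does not attempt.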