karpathy / ng-video-lecture


M1/M2 performance fix: use Apple MPS (metal performance shaders) if available #26

Open sghael opened 1 year ago

sghael commented 1 year ago

Use Apple MPS (Metal Performance Shaders) to move work to M1/M2 GPUs when available. MPS backend support is part of the official PyTorch 1.12 release.


karpathy commented 1 year ago

Will this code fail for older PyTorch versions?

sghael commented 1 year ago

Good call-out. The first version would trigger an exception on pre-1.12 versions of PyTorch.

I've modified the code to degrade gracefully on older versions of PyTorch (prior to 1.12). Unfortunately, this adds 2 lines to the code 😄.

PyTorch 2.x on M1 silicon uses mps:

❯ python
Python 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:41:52) [Clang 15.0.7 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'2.0.0'
>>> device = ('cuda' if torch.cuda.is_available()
...           else 'mps' if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available()
...           else 'cpu')
>>> print(device)
mps

PyTorch 1.11 on M1 silicon now falls back to cpu:

❯ python
Python 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:41:52) [Clang 15.0.7 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.11.0.post2'
>>> device = ('cuda' if torch.cuda.is_available()
...           else 'mps' if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available()
...           else 'cpu')
>>> print(device)
cpu
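
For readers who want the fallback order spelled out, here is a minimal sketch of the same selection logic as a function. The name `pick_device` and the `SimpleNamespace` stand-ins are hypothetical and only illustrate the two cases shown above; they are not part of the PR. In real use you would pass the imported `torch` module.

```python
from types import SimpleNamespace


def pick_device(torch_module):
    """Return 'cuda', 'mps', or 'cpu' for the given torch-like module."""
    if torch_module.cuda.is_available():
        return "cuda"
    # Per the thread, torch.backends has no `mps` attribute before PyTorch 1.12,
    # so look it up defensively instead of accessing it directly.
    mps = getattr(torch_module.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


# Stand-ins for illustration only (hypothetical, not real torch modules).
no_cuda = SimpleNamespace(is_available=lambda: False)
old_torch = SimpleNamespace(cuda=no_cuda, backends=SimpleNamespace())
new_torch = SimpleNamespace(
    cuda=no_cuda,
    backends=SimpleNamespace(mps=SimpleNamespace(is_available=lambda: True)),
)

print(pick_device(old_torch))  # cpu
print(pick_device(new_torch))  # mps
```

With PyTorch installed, `device = pick_device(torch)` after `import torch` should behave like the inline expression in the sessions above.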

Running gpt.py with mps on Apple M1 Pro:

❯ /Users/sghael/mambaforge/envs/ng-video-lecture/bin/python /Users/sghael/Developer/ng-video-lecture/gpt.py
mps
10.788929 M parameters
step 0: train loss 4.2221, val loss 4.2306

step 500: train loss 1.7446, val loss 1.9065
step 1000: train loss 1.3895, val loss 1.5960
...
step 4999: train loss 0.8609, val loss 1.5705

ESCALUS:
Enough of his very proper time;
Death of the poor little.

FRODH:
I cannot weither to staught the benefit of you.

ESCALUS:
Desperate your belly: when he hath infiss your joints
Throw his most mountsey: therefore, to say I hear,
And take your habby friends. Whence there I live?
An obstance of your brother have so heard that
You hope that slaughteer'd his sister than marry?

DUKE VINCENTIO:
'Come on; what's that's the princton; but that
I may know in require out absent;
That will run and

Running bigram.py with mps on Apple M1 Pro:

❯ /Users/sghael/mambaforge/envs/ng-video-lecture/bin/python /Users/sghael/Developer/ng-video-lecture/bigram.py
mps
step 0: train loss 4.7305, val loss 4.7241
step 300: train loss 2.8110, val loss 2.8249
...
step 2700: train loss 2.4738, val loss 2.4911

Foasthaprse tize herst el
O u fZEie hy:

Hak, CORineg aggell thrr Masearor charnge?
Tyoucre thy, chouspo in mppry way avend oubur'er sickes bokecard dhiceny

He tw el fe oupise he, lbustselownthous;
I m w
T:
The at;
I m hofaruk mondrn itheland's oe, oghithet f, badogienthofBRI'sey &CleDWeer'dsureisold array n
ICoyockind m murs, in mamybalorenyongmyooe, d Vofetthindy st
Hefqu brveseay alsteanerm to, oupomp rede d pre h, gavitYOfrrerean apsts lathind my d erouerse IOLUED d ngKE hicerire.
II IS:
I

yihaoye commented 1 year ago

I tried this method, but it seems to cause this issue: https://github.com/karpathy/ng-video-lecture/issues/32. Thanks for sharing anyway.