freestylerick opened this issue 1 year ago
I agree with your point in some cases. However, this is a matter of coding-style preference. Please close the issue.
To be clear - I am asking if this is a bug, not simply a question of style. Are you sure it's not a bug?
I wondered this also. The documentation for `to` (https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.to) has a note that says "This method modifies the module in-place", so I believe this doesn't affect the behavior of the code.
Thanks - I think this would benefit from a unit test.
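A minimal pytest-style sketch of such a test might look like this (the test name is my own, and it assumes a CUDA device is available):

# Hypothetical unit test sketch, not from the repo.
import torch

def test_module_to_is_in_place():
    model = torch.nn.Linear(2, 1)
    returned = model.to("cuda")
    # nn.Module.to modifies the module in place and returns the same object,
    # so both names should refer to one module on the target device.
    assert returned is model
    assert model.weight.device.type == "cuda"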
> I am asking if this is a bug, not simply a question of style. Are you sure it's not a bug?
You could verify that it is not really a bug (but a bit sloppy) by something like this:
In [1]: import torch
In [2]: model = torch.nn.Linear(2, 1)
In [3]: model.weight.device
Out[3]: device(type='cpu')
In [4]: m = model.to("cuda")
In [5]: model.weight.device
Out[5]: device(type='cuda', index=0)
In [6]: id(model)
Out[6]: 140022542389776
In [7]: id(m)
Out[7]: 140022542389776
In [8]: model is m
Out[8]: True
In a perfect world, I'd also like to test that a change on m also propagates to model (and/or vice versa). If someone does this, then I will agree that they are very likely the same.
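Something along these lines would do it (a sketch continuing the session above; the zero fill is just an arbitrary mutation):

# Mutate the weights through one name and observe them through the other.
with torch.no_grad():
    m.weight.fill_(0.0)   # change via m
print(model.weight)        # model shows the same zeroed weights, since m and model are one object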
I'm curious whether gpt.py is buggy (which is my guess, and GPT-4's guess as well: https://chat.openai.com/share/b50316a4-0f63-4813-8888-9cb3ca68b7f1) or, if it isn't, why not.
On line 199 of gpt.py we define `m`. We then use both `m` and `model` when I think we should be using only one of them (I think line 199 should be `model = ...` and then we only use `model`; there might be some spots where we have to move tensors so everything is on the same device).
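Roughly what I mean, using a stand-in module here rather than quoting gpt.py:

# Sketch only; torch.nn.Linear stands in for whatever gpt.py constructs on line 199.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Current pattern (roughly): two names bound to the same module.
model = torch.nn.Linear(4, 2)
m = model.to(device)   # nn.Module.to is in-place, so m is model

# Proposed pattern: a single name, plus moving input tensors to the same device.
model = torch.nn.Linear(4, 2).to(device)
x = torch.randn(8, 4, device=device)   # inputs created/moved to the same device
out = model(x)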