mrdbourke / pytorch-deep-learning

Materials for the Learn PyTorch for Deep Learning: Zero to Mastery course.
https://learnpytorch.io
MIT License

Clarification for discussion of shared memory in `torch.from_numpy(ndarray)` and `torch.Tensor.numpy()` #926

Open · pvelayudhan opened this issue 6 months ago

pvelayudhan commented 6 months ago

Thank you for this wonderful resource!

I have noticed that in the "PyTorch and Numpy" video (https://youtu.be/Z_ikDlimN6A?si=fufjAATXrinMXGtu&t=13085) as well as in the online book (https://www.learnpytorch.io/00_pytorch_fundamentals/#pytorch-tensors-numpy), the explanations provided for torch.from_numpy(ndarray) and torch.Tensor.numpy() suggest that the inputs and outputs of these functions are not linked in memory.

The explanation is based on the claim that the code below:

import torch
import numpy as np

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)

array = array + 1
array, tensor # array and tensor have different values

will lead to array and tensor having different values, and that the lack of a connection occurs at the moment of calling tensor = torch.from_numpy(array).

However, it's my understanding that these two variables only start pointing to different memory addresses at the addition line, array = array + 1. You can see that array and tensor are still tied together before this addition in the following example:

import torch
import numpy as np

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)

array[0] = 5.0
array, tensor # array and tensor now both start with 5.0
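One quick way to verify the sharing is np.shares_memory (my addition here, not something from the course example):

import torch
import numpy as np

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)

# The tensor is a view over the array's buffer.
print(np.shares_memory(array, tensor.numpy()))  # True

array = array + 1  # rebinds `array` to a newly allocated result

# `tensor` still points at the old buffer.
print(np.shares_memory(array, tensor.numpy()))  # False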

This topic is explained in more detail here: https://stackoverflow.com/questions/61526297/pytorch-memory-model-how-does-torch-from-numpy-work

Sorry if I am misunderstanding, and thanks again for making all this wonderful content!

Luismbpr commented 6 months ago

Hello. Just wanted to add this, since it might also help a bit in understanding the different dtypes you get when using torch.from_numpy() versus torch.Tensor().

Source: PyTorch memory model: "torch.from_numpy()" vs "torch.Tensor()"

""" from_numpy() automatically inherits input array dtype. On the other hand, torch.Tensor is an alias for torch.FloatTensor.

Therefore, if you pass int64 array to torch.Tensor, output tensor is float tensor and they wouldn't share the storage. torch.from_numpy gives you torch.LongTensor as expected.

a = np.arange(10)
ft = torch.Tensor(a)  # same as torch.FloatTensor
it = torch.from_numpy(a)

a.dtype  # == dtype('int64')
ft.dtype  # == torch.float32
it.dtype  # == torch.int64

answered Sep 18, 2018 by Viacheslav Kroilov

"""

pvelayudhan commented 6 months ago

Just found the corrections page for the video here where this point has already been acknowledged: https://github.com/mrdbourke/pytorch-deep-learning/discussions/98

But leaving this open for now as the book description (code lines 65-67 https://www.learnpytorch.io/00_pytorch_fundamentals/#pytorch-tensors-numpy) is still a little misleading.

Luismbpr commented 6 months ago

Just found the corrections page for the video here where this point has already been acknowledged: #98

But leaving this open for now as the book description (code lines 65-67 https://www.learnpytorch.io/00_pytorch_fundamentals/#pytorch-tensors-numpy) is still a little misleading.

Thanks for pointing that out. Good idea.

SailSabnis commented 4 months ago

It seems that with the latest PyTorch, the tensor inherits the datatype of the NumPy array. This is what I see:

[Screenshot from 2024-07-17 showing the tensor's dtype matching the NumPy array's dtype]

pritesh2000 commented 3 months ago

I tried the code, and it works as you describe. What I conclude from the experiment is this: when you assigned array[0] = 5.0, both array and tensor were still pointing to the same memory location. But when you assign array = array + 1, new memory is allocated for the result and the name array is rebound to it. In other words, array[0] = 5.0 changes a single element of the existing array in place, whereas array = array + 1 is effectively the same as array2 = array + 1; it doesn't reuse the previously allocated memory.

Hope that makes sense; if not, feel free to ask.
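A quick way to see the rebinding (a sketch of my own, using Python's built-in id() to track which object the name array points to):

import torch
import numpy as np

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)

print(id(array))   # the object currently bound to the name `array`
array[0] = 5.0     # in-place edit: same object, so `tensor` sees it
array += 1         # also in-place: `tensor` sees this change too
print(id(array))   # same id -- the name still points to the same object

array = array + 1  # allocates a new array and rebinds the name
print(id(array))   # different id -- `tensor` still views the old buffer
print(tensor)      # retains the values from before the rebinding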

mrdbourke commented 2 months ago

Just found the corrections page for the video here where this point has already been acknowledged: #98

But leaving this open for now as the book description (code lines 65-67 https://www.learnpytorch.io/00_pytorch_fundamentals/#pytorch-tensors-numpy) is still a little misleading.

Thank you for this!

I've noted this down and will work on a clearer explanation + fix in the notebooks shortly.