Open nunoplopes opened 2 years ago
For example, when running zeros(), PyTorch first creates a new tensor underneath and then calls zero_ on it. So we end up with a trace like:
    zeros()
    zero_
    %0 = <Float> zero_ in<0> #refs E/I=1/2 #output shape=[1, 2] Inputs: in<0>: tensor(Float : [1, 2])
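For context, the factory is essentially an allocation followed by an in-place fill. A minimal sketch of what runs underneath (not the verbatim PyTorch source; the function name is made up):

    #include <ATen/ATen.h>

    // zeros() allocates an uninitialized tensor and then zeroes it in place, so a
    // tracer that intercepts ATen ops only records the zero_ call, with the freshly
    // allocated tensor showing up as its input (in<0> above).
    at::Tensor zeros_sketch(at::IntArrayRef size, at::TensorOptions options = {}) {
      at::Tensor result = at::empty(size, options); // eager allocation
      return result.zero_();                        // the op that lands in the trace
    }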
We now have a reference to the tensor that is returned by zeros.
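To illustrate why that reference matters, here is a hypothetical example using the plain eager C++ API (no tracer involved): a second owner of the same TensorImpl is enough to make the tensor non-uniquely referenced, which is exactly the situation the trace's reference creates by the time make_variable() runs.

    #include <ATen/ATen.h>
    #include <cassert>

    void refcount_example() {
      at::Tensor t = at::zeros({1, 2});
      assert(t.use_count() == 1); // sole owner of the TensorImpl
      at::Tensor kept = t;        // e.g. a reference stashed away by a tracer
      assert(t.use_count() == 2); // no longer uniquely referenced
    }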
Now let's look at the code in torch/csrc/autograd/generated/variable_factories.h:
    inline at::Tensor zeros(at::IntArrayRef size, at::TensorOptions options = {}) {
      at::AutoDispatchBelowADInplaceOrView guard;
      return autograd::make_variable(
          at::zeros(size, at::TensorOptions(options).requires_grad(c10::nullopt)),
          /*requires_grad=*/options.requires_grad());
    }
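In other words, the wrapper strips requires_grad from the options it forwards to at::zeros and re-applies it afterwards through make_variable. A small usage sketch (hypothetical snippet using the C++ frontend) of the path being discussed:

    #include <torch/torch.h>

    // Hypothetical usage: requires_grad never reaches the ATen kernel; the wrapper
    // above strips it from the options and make_variable re-attaches it on the way out.
    void usage_example() {
      at::Tensor t = torch::zeros({1, 2}, torch::requires_grad());
      // t.requires_grad() == true, but at::zeros itself never saw that flag.
    }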
And now in torch/csrc/autograd/variable.h:
    inline Variable make_variable(
        at::Tensor data,
        bool requires_grad = false,
        bool allow_tensor_metadata_change = true) {
      if (data.defined()) {
        if (data.getIntrusivePtr().use_count() == 1 &&
            data.getIntrusivePtr()->unique_version()) {
          // reuse tensor
        } else {
          auto data_impl_copy = data.getIntrusivePtr()->shallow_copy_and_detach(...);
          return Variable(data_impl_copy); // <-- missing std::move here btw
        }
      }
      return Variable();
    }
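A quick aside on the "missing std::move" note: copying a c10::intrusive_ptr costs an atomic reference-count increment (and a later decrement) that a move would avoid. A tiny self-contained sketch with a hypothetical type (not the real TensorImpl):

    #include <c10/util/intrusive_ptr.h>
    #include <utility>

    struct Node : c10::intrusive_ptr_target {};

    void move_vs_copy() {
      auto p = c10::make_intrusive<Node>();
      auto copied = p;           // refcount 1 -> 2: an atomic bump
      auto moved = std::move(p); // no refcount change; p is simply left empty
      (void)copied;
      (void)moved;
    }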
So, because the trace still holds that reference, use_count() is not 1, the fast path is skipped, and make_variable is forced to make an unnecessary shallow copy of the tensor. Can this be fixed?
upstreamed patch: https://github.com/pytorch/pytorch/pull/67018