ricardoV94 opened this issue 3 weeks ago
It does seem to have more overhead than I'd like, but I don't think this sounds terrible for real functions.
This microbenchmark slows down from 8ms to 11ms on my machine:
```python
%%timeit
# Explicit device
a = torch.zeros(3, device="cuda")
for _ in range(1000):
    a = a + 1
out = a.cpu().numpy()
```
```python
%%timeit
# Using default device
with torch.device("cuda"):
    a = torch.zeros(3)
    for _ in range(1000):
        a = a + 1
    out = a.cpu().numpy()
```
Description
The easiest way to give PyTensor users global (not fine-grained) control over CPU/GPU for the PyTorch backend would be `set_default_device` / `with_device`. However, this may be too slow, according to: https://github.com/pytorch/pytorch/issues/92701

We should benchmark to see whether it is a problem. If it is, we may want to use a PyTensor config flag to get the same control without the PyTorch overhead.