CUNY-CL / yoyodyne

Small-vocabulary sequence-to-sequence generation with optional feature conditioning
Apache License 2.0

Puts val loss on cpu for np.mean() #187

Closed Adamits closed 1 month ago

Adamits commented 1 month ago

I guess I had not tested the final evaluators update on GPU. Since we now compute the mean val loss with numpy.mean, we need to make sure the loss tensors are on CPU first.
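
A minimal sketch of the pattern this PR fixes (the variable names are illustrative, not the actual yoyodyne evaluator code):

```python
import numpy as np
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Per-batch validation losses accumulate as scalar tensors on the training device.
val_losses = [torch.rand((), device=device) for _ in range(8)]

# numpy cannot read CUDA memory directly, so each scalar is moved to CPU first.
mean_val_loss = np.mean([loss.cpu().item() for loss in val_losses])
print(mean_val_loss)
```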

kylebgorman commented 1 month ago

What if we just use PyTorch's mean? Does that solve the problem?

Two interfaces for it:

https://pytorch.org/docs/stable/generated/torch.mean.html
https://pytorch.org/docs/stable/generated/torch.Tensor.mean.html
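
For reference, a sketch of both interfaces (assuming the losses are already a single tensor; names are illustrative):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
losses = torch.rand(8, device=device)

# Both interfaces average on whatever device the tensor already lives on,
# so no explicit .cpu() transfer is needed.
mean_loss = torch.mean(losses)  # functional interface
mean_loss = losses.mean()       # Tensor method interface
print(mean_loss.item())
```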

Adamits commented 1 month ago

> What if we just use PyTorch's mean? Does that solve the problem?
>
> Two interfaces for it:
>
> https://pytorch.org/docs/stable/generated/torch.mean.html
> https://pytorch.org/docs/stable/generated/torch.Tensor.mean.html

Good point, that's a better solution.

Adamits commented 1 month ago

Actually, the issue is that we end up with a list, not a tensor, and I think the overhead of turning the list into a tensor first is probably greater than the overhead of moving the values to CPU. I can test, though.
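
The two candidates under discussion, as a sketch (the list of per-batch scalar losses is assumed, not taken from the codebase):

```python
import numpy as np
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
val_losses = [torch.rand((), device=device) for _ in range(100)]

# Option 1: move each scalar to CPU and average with numpy.
numpy_mean = np.mean([loss.cpu().item() for loss in val_losses])

# Option 2: stack the scalars into one on-device tensor and average with
# torch, reading back a single value at the end.
torch_mean = torch.stack(val_losses).mean().item()
```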

kylebgorman commented 1 month ago

> Actually, the issue is that we end up with a list, not a tensor, and I think the overhead of turning the list into a tensor first is probably greater than the overhead of moving the values to CPU. I can test, though.

Yeah, whatever's faster (or if they're the same, pick the one that's more elegant...)

Adamits commented 1 month ago

torch is a bit faster in my experiments. I wonder if the fact that the scalars are already torch tensor types speeds it up?
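
A rough way to check, assuming the same list of scalar tensors as above (timings will vary by hardware; `.item()` forces a synchronization, which keeps the comparison roughly fair):

```python
import timeit

import numpy as np
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
val_losses = [torch.rand((), device=device) for _ in range(100)]

numpy_s = timeit.timeit(
    lambda: np.mean([loss.cpu().item() for loss in val_losses]), number=1_000
)
torch_s = timeit.timeit(
    lambda: torch.stack(val_losses).mean().item(), number=1_000
)
print(f"numpy: {numpy_s:.3f}s, torch: {torch_s:.3f}s")
```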

kylebgorman commented 1 month ago

Shall I merge?

Adamits commented 1 month ago

Yes!