intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs

KLDivLoss function in PyTorch always falls back to CPU #2537

Closed: jgtong closed 3 weeks ago

jgtong commented 3 weeks ago

Greetings

We are using the KLDivLoss function from PyTorch and want to run it on Intel's Data Center GPU Max 1550. Unfortunately, it keeps falling back to the CPU.

The warning message that I keep seeing is:

/home/jaytong/intel-xpu-backend-for-triton/.venv/lib/python3.10/site-packages/torch/nn/functional.py:3391: UserWarning: Aten Op fallback from XPU to CPU happends. This may have performance implications. If need debug the fallback ops please set environment variable `PYTORCH_DEBUG_XPU_FALLBACK=1`

Here is the reproducer code:

import torch

# KLDivLoss expects log-probabilities as the input and probabilities as the target.
kl_div = torch.nn.KLDivLoss(reduction="batchmean")
input = torch.randn(1024, 1024, requires_grad=True, device="xpu").log_softmax(dim=-1)
target = torch.randn(1024, 1024, requires_grad=True, device="xpu").softmax(dim=-1)
print(f'{kl_div(input, target)=}')
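
A minimal sketch for seeing exactly which aten op falls back, assuming the PYTORCH_DEBUG_XPU_FALLBACK variable mentioned in the warning is read from the environment when torch is imported (I have not verified when it is picked up):

import os

# Assumption: setting this before importing torch enables the fallback debug
# output referred to in the warning message above.
os.environ["PYTORCH_DEBUG_XPU_FALLBACK"] = "1"

import torch

kl_div = torch.nn.KLDivLoss(reduction="batchmean")
input = torch.randn(1024, 1024, requires_grad=True, device="xpu").log_softmax(dim=-1)
target = torch.randn(1024, 1024, requires_grad=True, device="xpu").softmax(dim=-1)
# With the debug variable set, the specific aten op that falls back to CPU
# should be reported instead of the generic warning.
print(kl_div(input, target))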

My environment is the following:

OS: Ubuntu 22.04.5
pytorch-gpu-dev: 0.5.1
intel-xpu-backend-for-triton top hash: 4c9df48f62c9313428bdd2a8f9ebf0686bd21163
agama-950.13

I also tried different tensor sizes, with no success. Is this function not supported on the GPU?
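
For reference, a rough elementwise sketch of what KLDivLoss(reduction="batchmean") computes for probability targets (just my reading of the docs, not a verified workaround); pointwise ops like these would normally have native XPU kernels:

import torch

input = torch.randn(1024, 1024, requires_grad=True, device="xpu").log_softmax(dim=-1)
target = torch.randn(1024, 1024, requires_grad=True, device="xpu").softmax(dim=-1)
# Pointwise term target * (log(target) - input), summed over all elements and
# divided by the batch size, which should correspond to reduction="batchmean".
manual_kl = (target * (target.log() - input)).sum() / input.shape[0]
print(manual_kl)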

alexbaden commented 3 weeks ago

I think you want to file this in https://github.com/intel/torch-xpu-ops. That warning message is coming from PyTorch, not Triton.

jgtong commented 3 weeks ago

Thank you @alexbaden for your prompt response